+++
date = "2017-03-20T19:35:35+11:00"
title = "How To Guides"
+++

## Retrieving Debug Information

Each Dgraph data node exposes profiling information over the `/debug/pprof` endpoint and metrics over the `/debug/vars` endpoint. Each Dgraph data node has its own profiling and metrics information. Below is a list of the debugging information exposed by Dgraph and the corresponding commands to retrieve it.

### Metrics Information

If you are collecting these metrics from outside the Dgraph instance, you need to pass the `--expose_trace=true` flag; otherwise, these metrics can only be collected by connecting to the instance over localhost.

```
curl http://<IP>:<HTTP_PORT>/debug/vars
```

Metrics can also be retrieved in the Prometheus format at `/debug/prometheus_metrics`. See the [Metrics]({{< relref "deploy/index.md#metrics" >}}) section for the full list of metrics.

### Profiling Information

Profiling information is available via the `go tool pprof` profiling tool built into Go. The ["Profiling Go programs"](https://blog.golang.org/profiling-go-programs) Go blog post will help you get started with pprof. Each Dgraph Zero and Dgraph Alpha exposes a debug endpoint at `/debug/pprof/<profile>` via its HTTP port.

```
go tool pprof http://<IP>:<HTTP_PORT>/debug/pprof/heap
#Fetching profile from ...
#Saved Profile in ...
```

The output of the command shows the location where the profile is stored.

In the interactive pprof shell, you can use commands like `top` to get a listing of the top functions in the profile, `web` to get a visual graph of the profile opened in a web browser, or `list` to display a code listing with profiling information overlaid.
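As a quick reference, the per-profile endpoints described below can be enumerated with a small shell loop; the host and port here are placeholders for your own node's address:

```shell
# Print the pprof endpoint URLs for a single node. Substitute your own
# host and HTTP port; each URL can be passed to `go tool pprof`.
base="http://localhost:8080/debug/pprof"   # placeholder address
for profile in heap profile block goroutine; do
  echo "$base/$profile"
done
```

For example, `go tool pprof "$base/heap"` would fetch the memory profile from that node.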
#### CPU Profile

```
go tool pprof http://<IP>:<HTTP_PORT>/debug/pprof/profile
```

#### Memory Profile

```
go tool pprof http://<IP>:<HTTP_PORT>/debug/pprof/heap
```

#### Block Profile

By default, Dgraph doesn't collect the block profile. To collect it, Dgraph must be started with `--profile_mode=block` and `--block_rate=<N>` with N > 1.

```
go tool pprof http://<IP>:<HTTP_PORT>/debug/pprof/block
```

#### Goroutine stack

The HTTP page `/debug/pprof/` is available at the HTTP port of a Dgraph Zero or Dgraph Alpha. From this page, a link to the "full goroutine stack dump" is available (e.g., on a Dgraph Alpha this page would be at `http://localhost:8080/debug/pprof/goroutine?debug=2`). Looking at the full goroutine stack can be useful to understand goroutine usage at that moment.

## Using the Debug Tool

{{% notice "note" %}}
To debug a running Dgraph cluster, first copy the postings ("p") directory to
another location. If the Dgraph cluster is not running, then you can use the
same postings directory with the debug tool.

If the "p" directory has been encrypted, then the debug tool will need to use the `--keyfile <path-to-keyfile>` option. This file must contain the same key that was used to encrypt the "p" directory.
{{% /notice %}}

The `dgraph debug` tool can be used to inspect Dgraph's posting list structure.
You can use the debug tool to inspect the data, schema, and indices of your
Dgraph cluster.

Some scenarios where the debug tool is useful:

- Verify that mutations committed to Dgraph have been persisted to disk.
- Verify that indices have been created.
- Inspect the history of a posting list.

### Example Usage

Debug the p directory.

```sh
$ dgraph debug --postings ./p
```

Debug the p directory without opening it in read-only mode. This is typically necessary when the database was not closed properly.
```sh
$ dgraph debug --postings ./p --readonly=false
```

Debug the p directory, only outputting the keys for the predicate `name`.

```sh
$ dgraph debug --postings ./p --readonly=false --pred=name
```

Debug the p directory, looking up a particular key:

```sh
$ dgraph debug --postings ./p --lookup 00000b6465736372697074696f6e020866617374
```

Debug the p directory, inspecting the history of a particular key:

```sh
$ dgraph debug --postings ./p --lookup 00000b6465736372697074696f6e020866617374 --history
```

Debug an encrypted p directory with the key in a local file at the path ./key_file:

```sh
$ dgraph debug --postings ./p --keyfile ./key_file
```

{{% notice "note" %}}
The key file contains the key used to decrypt/encrypt the db. This key should be kept secret. As a best practice:

- Do not store the key file on disk permanently. Back it up in a safe place and delete it after using it with the debug tool.

- If the above is not possible, make sure the correct privileges are set on the key file. Only the user who owns the dgraph process should be able to read or write it: `chmod 600`.
{{% /notice %}}

### Debug Tool Output

Let's go over an example with a Dgraph cluster with the following schema, which has a term index, a full-text index, and two separately committed mutations:

```sh
$ curl localhost:8080/alter -d '
name: string @index(term) .
url: string .
description: string @index(fulltext) .
'
```

```sh
$ curl -H "Content-Type: application/rdf" localhost:8080/mutate?commitNow=true -d '{
  set {
    _:dgraph <name> "Dgraph" .
    _:dgraph <dgraph.type> "Software" .
    _:dgraph <url> "https://github.com/dgraph-io/dgraph" .
    _:dgraph <description> "Fast, Transactional, Distributed Graph Database." .
  }
}'
```

```sh
$ curl -H "Content-Type: application/rdf" localhost:8080/mutate?commitNow=true -d '{
  set {
    _:badger <name> "Badger" .
    _:badger <dgraph.type> "Software" .
    _:badger <url> "https://github.com/dgraph-io/badger" .
    _:badger <description> "Embeddable, persistent and fast key-value (KV) database written in pure Go." .
  }
}'
```

After stopping Dgraph, you can run the debug tool to inspect the postings directory:

{{% notice "note" %}}
The debug output can be very large. Typically you would redirect the debug tool's output to a file first for easier analysis.
{{% /notice %}}

```sh
$ dgraph debug --postings ./p
```

```text
Opening DB: ./p
Min commit: 1. Max commit: 5, w.r.t 18446744073709551615
prefix =
{d} {v.ok} attr: url uid: 1 key: 00000375726c000000000000000001 item: [71, b0100] ts: 3
{d} {v.ok} attr: url uid: 2 key: 00000375726c000000000000000002 item: [71, b0100] ts: 5
{d} {v.ok} attr: name uid: 1 key: 0000046e616d65000000000000000001 item: [43, b0100] ts: 3
{d} {v.ok} attr: name uid: 2 key: 0000046e616d65000000000000000002 item: [43, b0100] ts: 5
{i} {v.ok} attr: name term: [1] badger key: 0000046e616d650201626164676572 item: [30, b0100] ts: 5
{i} {v.ok} attr: name term: [1] dgraph key: 0000046e616d650201646772617068 item: [30, b0100] ts: 3
{d} {v.ok} attr: _predicate_ uid: 1 key: 00000b5f7072656469636174655f000000000000000001 item: [104, b0100] ts: 3
{d} {v.ok} attr: _predicate_ uid: 2 key: 00000b5f7072656469636174655f000000000000000002 item: [104, b0100] ts: 5
{d} {v.ok} attr: description uid: 1 key: 00000b6465736372697074696f6e000000000000000001 item: [92, b0100] ts: 3
{d} {v.ok} attr: description uid: 2 key: 00000b6465736372697074696f6e000000000000000002 item: [119, b0100] ts: 5
{i} {v.ok} attr: description term: [8] databas key: 00000b6465736372697074696f6e020864617461626173 item: [38, b0100] ts: 5
{i} {v.ok} attr: description term: [8] distribut key: 00000b6465736372697074696f6e0208646973747269627574 item: [40, b0100] ts: 3
{i} {v.ok} attr: description term: [8] embedd key: 00000b6465736372697074696f6e0208656d62656464 item: [37, b0100] ts: 5
{i} {v.ok} attr: description term: [8] fast key: 00000b6465736372697074696f6e020866617374 item: [35, b0100] ts: 5
{i} {v.ok} attr: description term: [8] go key: 00000b6465736372697074696f6e0208676f item: [33, b0100] ts: 5
{i} {v.ok} attr: description term: [8] graph key: 00000b6465736372697074696f6e02086772617068 item: [36, b0100] ts: 3
{i} {v.ok} attr: description term: [8] kei key: 00000b6465736372697074696f6e02086b6569 item: [34, b0100] ts: 5
{i} {v.ok} attr: description term: [8] kv key: 00000b6465736372697074696f6e02086b76 item: [33, b0100] ts: 5
{i} {v.ok} attr: description term: [8] persist key: 00000b6465736372697074696f6e020870657273697374 item: [38, b0100] ts: 5
{i} {v.ok} attr: description term: [8] pure key: 00000b6465736372697074696f6e020870757265 item: [35, b0100] ts: 5
{i} {v.ok} attr: description term: [8] transact key: 00000b6465736372697074696f6e02087472616e73616374 item: [39, b0100] ts: 3
{i} {v.ok} attr: description term: [8] valu key: 00000b6465736372697074696f6e020876616c75 item: [35, b0100] ts: 5
{i} {v.ok} attr: description term: [8] written key: 00000b6465736372697074696f6e02087772697474656e item: [38, b0100] ts: 5
{s} {v.ok} attr: url key: 01000375726c item: [13, b0001] ts: 1
{s} {v.ok} attr: name key: 0100046e616d65 item: [23, b0001] ts: 1
{s} {v.ok} attr: _predicate_ key: 01000b5f7072656469636174655f item: [31, b0001] ts: 1
{s} {v.ok} attr: description key: 01000b6465736372697074696f6e item: [41, b0001] ts: 1
{s} {v.ok} attr: dgraph.type key: 01000b6467726170682e74797065 item: [40, b0001] ts: 1
Found 28 keys
```

Each line in the debug output contains a prefix indicating the type of the key: `{d}`: Data key; `{i}`: Index key; `{c}`: Count key; `{r}`: Reverse key; `{s}`: Schema key. In the debug output above, we see data keys, index keys, and schema keys.

Each index key has a corresponding index type. For example, in `attr: name term: [1] dgraph` the `[1]` shows that this is the term index ([0x1][tok_term]); in `attr: description term: [8] fast`, the `[8]` shows that this is the full-text index ([0x8][tok_fulltext]). These IDs match the index IDs in [tok.go][tok].

[tok_term]: https://github.com/dgraph-io/dgraph/blob/ce82aaafba3d9e57cf5ea1aeb9b637193441e1e2/tok/tok.go#L39
[tok_fulltext]: https://github.com/dgraph-io/dgraph/blob/ce82aaafba3d9e57cf5ea1aeb9b637193441e1e2/tok/tok.go#L48
[tok]: https://github.com/dgraph-io/dgraph/blob/ce82aaafba3d9e57cf5ea1aeb9b637193441e1e2/tok/tok.go#L37-L53

### Key Lookup

Every key can be inspected further with the `--lookup` flag for the specific key.

```sh
$ dgraph debug --postings ./p --lookup 00000b6465736372697074696f6e020866617374
```

```text
Opening DB: ./p
Min commit: 1. Max commit: 5, w.r.t 18446744073709551615
Key: 00000b6465736372697074696f6e0208676f Length: 2
Uid: 1 Op: 1
Uid: 2 Op: 1
```

For data keys, a lookup shows its type and value. Below, we see that the key for `attr: url uid: 1` has a string value.

```sh
$ dgraph debug --postings ./p --lookup 00000375726c000000000000000001
```

```text
Opening DB: ./p
Min commit: 1. Max commit: 5, w.r.t 18446744073709551615
Key: 0000046e616d65000000000000000001 Length: 1
Uid: 18446744073709551615 Op: 1 Type: STRING. String Value: "https://github.com/dgraph-io/dgraph"
```

For index keys, a lookup shows the UIDs that are part of this index. Below, we see that the `fast` index for the `<description>` predicate has UIDs 0x1 and 0x2.
```sh
$ dgraph debug --postings ./p --lookup 00000b6465736372697074696f6e020866617374
```

```text
Opening DB: ./p
Min commit: 1. Max commit: 5, w.r.t 18446744073709551615
Key: 00000b6465736372697074696f6e0208676f Length: 2
Uid: 1 Op: 1
Uid: 2 Op: 1
```

### Key History

You can also look up the history of values for a key using the `--history` option.

```sh
$ dgraph debug --postings ./p --lookup 00000b6465736372697074696f6e020866617374 --history
```
```text
Opening DB: ./p
Min commit: 1. Max commit: 5, w.r.t 18446744073709551615
==> key: 00000b6465736372697074696f6e020866617374. PK: &{byteType:2 Attr:description Uid:0 Termfast Count:0 bytePrefix:0}
ts: 5 {item}{delta}
Uid: 2 Op: 1

ts: 3 {item}{delta}
Uid: 1 Op: 1
```

Above, we see that UID 0x1 was committed to this index at ts 3, and UID 0x2 was committed to this index at ts 5.

The debug output also shows UserMeta information:

- `{complete}`: Complete posting list
- `{uid}`: UID posting list
- `{delta}`: Delta posting list
- `{empty}`: Empty posting list
- `{item}`: Item posting list
- `{deleted}`: Delete marker

## Using the Increment Tool

The `dgraph increment` tool increments a counter value transactionally. The
increment tool can be used as a health check that an Alpha is able to service
transactions for both queries and mutations.

### Example Usage

Increment the default predicate (`counter.val`) once. If the predicate doesn't yet
exist, then it will be created starting at counter 0.
```sh
$ dgraph increment
```

Increment the counter predicate against the Alpha running at the address given by `--alpha` (default: `localhost:9080`):

```sh
$ dgraph increment --alpha=192.168.1.10:9080
```

Increment the counter predicate specified by `--pred` (default: `counter.val`):

```sh
$ dgraph increment --pred=counter.val.healthcheck
```

Run a read-only query for the counter predicate without running a mutation to increment it:

```sh
$ dgraph increment --ro
```

Run a best-effort query for the counter predicate without running a mutation to increment it:

```sh
$ dgraph increment --be
```

Run the increment tool 1000 times, waiting 1 second between each increment:

```sh
$ dgraph increment --num=1000 --wait=1s
```

### Increment Tool Output

```sh
# Run increment a few times
$ dgraph increment
0410 10:31:16.379 Counter VAL: 1 [ Ts: 1 ]
$ dgraph increment
0410 10:34:53.017 Counter VAL: 2 [ Ts: 3 ]
$ dgraph increment
0410 10:34:53.648 Counter VAL: 3 [ Ts: 5 ]

# Run read-only queries to read the counter a few times
$ dgraph increment --ro
0410 10:34:57.35 Counter VAL: 3 [ Ts: 7 ]
$ dgraph increment --ro
0410 10:34:57.886 Counter VAL: 3 [ Ts: 7 ]
$ dgraph increment --ro
0410 10:34:58.129 Counter VAL: 3 [ Ts: 7 ]

# Run best-effort queries to read the counter a few times
$ dgraph increment --be
0410 10:34:59.867 Counter VAL: 3 [ Ts: 7 ]
$ dgraph increment --be
0410 10:35:01.322 Counter VAL: 3 [ Ts: 7 ]
$ dgraph increment --be
0410 10:35:02.674 Counter VAL: 3 [ Ts: 7 ]

# Run a read-only query to read the counter 5 times
$ dgraph increment --ro --num=5
0410 10:35:18.812 Counter VAL: 3 [ Ts: 7 ]
0410 10:35:18.813 Counter VAL: 3 [ Ts: 7 ]
0410 10:35:18.815 Counter VAL: 3 [ Ts: 7 ]
0410 10:35:18.817 Counter VAL: 3 [ Ts: 7 ]
0410 10:35:18.818 Counter VAL: 3 [ Ts: 7 ]

# Increment the counter 5 times
$ dgraph increment --num=5
0410 10:35:24.028 Counter VAL: 4 [ Ts: 8 ]
0410 10:35:24.061 Counter VAL: 5 [ Ts: 10 ]
0410 10:35:24.104 Counter VAL: 6 [ Ts: 12 ]
0410 10:35:24.145 Counter VAL: 7 [ Ts: 14 ]
0410 10:35:24.178 Counter VAL: 8 [ Ts: 16 ]

# Increment the counter 5 times, once every second.
$ dgraph increment --num=5 --wait=1s
0410 10:35:26.95 Counter VAL: 9 [ Ts: 18 ]
0410 10:35:27.975 Counter VAL: 10 [ Ts: 20 ]
0410 10:35:28.999 Counter VAL: 11 [ Ts: 22 ]
0410 10:35:30.028 Counter VAL: 12 [ Ts: 24 ]
0410 10:35:31.054 Counter VAL: 13 [ Ts: 26 ]

# If the Alpha is too busy or unhealthy, the tool will time out and retry.
$ dgraph increment
0410 10:36:50.857 While trying to process counter: Query error: rpc error: code = DeadlineExceeded desc = context deadline exceeded. Retrying...
```

## Giving Nodes a Type

It's often useful to give the nodes in a graph *types* (also commonly referred
to as *labels* or *kinds*). You can do so using the [type system]({{< relref "query-language/index.md#type-system" >}}).

## Loading CSV Data

[Dgraph mutations]({{< relref "mutations/index.md" >}}) are accepted in RDF
N-Quad and JSON formats. To load CSV-formatted data into Dgraph, first convert
the dataset into one of the accepted formats and then load the resulting dataset
into Dgraph. This section demonstrates converting CSV into JSON. There are
many tools available to convert CSV to JSON. For example, you can use
[`d3-dsv`](https://github.com/d3/d3-dsv)'s `csv2json` tool as shown below:

```csv
Name,URL
Dgraph,https://github.com/dgraph-io/dgraph
Badger,https://github.com/dgraph-io/badger
```

```sh
$ csv2json names.csv --out names.json
$ cat names.json | jq '.'
[
  {
    "Name": "Dgraph",
    "URL": "https://github.com/dgraph-io/dgraph"
  },
  {
    "Name": "Badger",
    "URL": "https://github.com/dgraph-io/badger"
  }
]
```

This JSON can be loaded into Dgraph via the programmatic clients. This follows
the [JSON Mutation Format]({{< relref "mutations#json-mutation-format" >}}).
Note that each JSON object in the list above will be assigned a unique UID since
the `uid` field is omitted.

[The Ratel UI (and HTTP clients) expect JSON data to be stored within the `"set"`
key]({{< relref "mutations/index.md#json-syntax-using-raw-http-or-ratel-ui" >}}).
You can use `jq` to transform the JSON into the correct format:

```sh
$ cat names.json | jq '{ set: . }'
```
```json
{
  "set": [
    {
      "Name": "Dgraph",
      "URL": "https://github.com/dgraph-io/dgraph"
    },
    {
      "Name": "Badger",
      "URL": "https://github.com/dgraph-io/badger"
    }
  ]
}
```

Let's say you have CSV data in a file named connects.csv that connects nodes
together. Here, the `connects` field should be of the `uid` type.

```csv
uid,connects
_:a,_:b
_:a,_:c
_:c,_:d
_:d,_:a
```

{{% notice "note" %}}
To reuse existing integer IDs from a CSV file as UIDs in Dgraph, use Dgraph Zero's [assign endpoint]({{< relref "deploy/index.md#more-about-dgraph-zero" >}}) before data loading to allocate a range of UIDs that can be safely assigned.
{{% /notice %}}

To get the correct JSON format, you can convert the CSV into JSON and use `jq`
to transform it into the correct format, where the `connects` edge is a node `uid`:

```sh
$ csv2json connects.csv | jq '[ .[] | { uid: .uid, connects: { uid: .connects } } ]'
```

```json
[
  {
    "uid": "_:a",
    "connects": {
      "uid": "_:b"
    }
  },
  {
    "uid": "_:a",
    "connects": {
      "uid": "_:c"
    }
  },
  {
    "uid": "_:c",
    "connects": {
      "uid": "_:d"
    }
  },
  {
    "uid": "_:d",
    "connects": {
      "uid": "_:a"
    }
  }
]
```

You can modify the `jq` transformation to output the mutation format accepted by
Ratel UI and HTTP clients:

```sh
$ csv2json connects.csv | jq '{ set: [ .[] | {uid: .uid, connects: { uid: .connects } } ] }'
```
```json
{
  "set": [
    {
      "uid": "_:a",
      "connects": {
        "uid": "_:b"
      }
    },
    {
      "uid": "_:a",
      "connects": {
        "uid": "_:c"
      }
    },
    {
      "uid": "_:c",
      "connects": {
        "uid": "_:d"
      }
    },
    {
      "uid": "_:d",
      "connects": {
        "uid": "_:a"
      }
    }
  ]
}
```

## A Simple Login System

{{% notice "note" %}}
This example is based on part of the [transactions in
v0.9](https://blog.dgraph.io/post/v0.9/) blogpost. Error checking has been
omitted for brevity.
{{% /notice %}}

The schema is assumed to be:
```
// @upsert directive is important to detect conflicts.
email: string @index(exact) @upsert . # @index(hash) would also work
pass: password .
```

```
// Create a new transaction. The deferred call to Discard
// ensures that server-side resources are cleaned up.
txn := client.NewTxn()
defer txn.Discard(ctx)

// Create and execute a query that looks up an email and checks if the password
// matches.
q := fmt.Sprintf(`
    {
        login_attempt(func: eq(email, %q)) {
            checkpwd(pass, %q)
        }
    }
`, email, pass)
resp, err := txn.Query(ctx, q)

// Unmarshal the response into a struct. It will be empty if the email couldn't
// be found. Otherwise it will contain a bool to indicate if the password matched.
var login struct {
    Account []struct {
        Pass []struct {
            CheckPwd bool `json:"checkpwd"`
        } `json:"pass"`
    } `json:"login_attempt"`
}
err = json.Unmarshal(resp.GetJson(), &login)

// Now perform the upsert logic.
if len(login.Account) == 0 {
    fmt.Println("Account doesn't exist! Creating new account.")
    mu := &protos.Mutation{
        SetJson: []byte(fmt.Sprintf(`{ "email": %q, "pass": %q }`, email, pass)),
    }
    _, err = txn.Mutate(ctx, mu)
    // Commit the mutation, making it visible outside of the transaction.
    err = txn.Commit(ctx)
} else if login.Account[0].Pass[0].CheckPwd {
    fmt.Println("Login successful!")
} else {
    fmt.Println("Wrong email or password.")
}
```

## Upserts

Upsert-style operations are operations where:

1. A node is searched for, and then
2. Depending on whether it is found, either:
   - some of its attributes are updated, or
   - a new node is created with those attributes.

The upsert has to be an atomic operation such that either a new node is
created, or an existing node is modified. It's not allowed that two concurrent
upserts both create a new node.

There are many examples where upserts are useful. Most examples involve the
creation of a one-to-one mapping between two different entities, e.g., associating
email addresses with user accounts.

Upserts are common in both traditional RDBMSs and newer NoSQL databases.
Dgraph is no exception.

### Upsert Procedure

In Dgraph, upsert-style behaviour can be implemented by users on top of
transactions.
The steps are as follows:

1. Create a new transaction.

2. Query for the node. This will usually be as simple as `{ q(func: eq(email,
"bob@example.com")) { uid }}`. If a `uid` result is returned, then that's the
`uid` for the existing node. If no results are returned, then the user account
doesn't exist.

3. In the case where the user account doesn't exist, a new node has to be
created. This is done in the usual way by making a mutation (inside the
transaction), e.g. the RDF `_:newAccount <email> "bob@example.com" .`. The
`uid` assigned can be accessed by looking up the blank node name `newAccount`
in the `Assigned` object returned from the mutation.

4. Now that you have the `uid` of the account (either new or existing), you can
modify the account (using additional mutations) or perform queries on it in
whichever way you wish.

### Upsert Block

You can also use the `Upsert Block` to achieve the upsert procedure in a single
mutation. The request contains both the query and the mutation, as explained
[here]({{< relref "mutations/index.md#upsert-block" >}}).

### Conflicts

Upsert operations are intended to be run concurrently, as per the needs of the
application. As such, it's possible that two concurrently running operations
could try to add the same node at the same time. For example, both try to add a
user with the same email address. If they do, then one of the transactions will
fail with an error indicating that the transaction was aborted.

If this happens, the transaction is rolled back and it's up to the user's
application logic to retry the whole operation. The transaction has to be
retried in its entirety, all the way from creating a new transaction.

The choice of index placed on the predicate is important for performance.
**Hash is almost always the best choice of index for equality checking.**

{{% notice "note" %}}
It's the _index_ that typically causes upsert conflicts to occur. The index is
stored as many key/value pairs, where each key is a combination of the
predicate name and some function of the predicate value (e.g. its hash for the
hash index). If two transactions modify the same key concurrently, then one
will fail.
{{% /notice %}}

## Run Jepsen tests

1. Clone the jepsen repo at [https://github.com/jepsen-io/jepsen](https://github.com/jepsen-io/jepsen).

   ```sh
   git clone git@github.com:jepsen-io/jepsen.git
   ```

2. Run the following command to set up the instances from the repo.

   ```sh
   cd docker && ./up.sh
   ```

   This should start five Jepsen nodes in Docker containers.

3. Now ssh into the `jepsen-control` container and run the tests.

{{% notice "note" %}}
You can use the [transfer](https://github.com/dgraph-io/dgraph/blob/master/contrib/nightly/transfer.sh) script to build the Dgraph binary and upload the tarball to https://transfer.sh, which gives you a URL that can then be used in the Jepsen tests (via the `--package-url` flag).
{{% /notice %}}

```sh
docker exec -it jepsen-control bash
```

```sh
root@control:/jepsen# cd dgraph
root@control:/jepsen/dgraph# lein run test -w upsert

# Specify a --package-url

root@control:/jepsen/dgraph# lein run test --force-download --package-url https://github.com/dgraph-io/dgraph/releases/download/nightly/dgraph-linux-amd64.tar.gz -w upsert
```

## Migrate to Dgraph v1.1

### Schema types: scalar `uid` and list `[uid]`

The semantics of predicates of type `uid` have changed in Dgraph 1.1. Whereas before all `uid` predicates implied a one-to-many relationship, now either a one-to-one relationship or a one-to-many relationship can be expressed.
```
friend: [uid] .
best_friend: uid .
```

In the above, the predicate `friend` allows a one-to-many relationship (i.e., a person can have more than one friend) while the predicate `best_friend` can be at most a one-to-one relationship.

This syntactic meaning is consistent with the other types, e.g., `string` indicating a single-value string and `[string]` representing many strings. This change makes the `uid` type work similarly to the other types.

To migrate existing schemas from Dgraph v1.0 to Dgraph v1.1, update the schema file from an export so all predicates of type `uid` are changed to `[uid]`. Then use the updated schema when loading data into Dgraph v1.1. For example, the following schema:

```text
name: string .
friend: uid .
```

becomes

```text
name: string .
friend: [uid] .
```

### Type system

The new [type system]({{< relref "query-language/index.md#type-system" >}}) introduced in Dgraph 1.1 should not affect migrating data from a previous version. However, a couple of features in the query language will not work as they did before: `expand()` and `_predicate_`.

The reason is that the internal predicate that associated each node with its predicates (called `_predicate_`) has been removed. Instead, the type system is used to get the predicates that belong to a node.

#### `expand()`

Expand queries will not work until the type system has been properly set up. For example, the following query will return an empty result in Dgraph 1.1 if the node `0xff` has no type information.

```text
{
  me(func: uid(0xff)) {
    expand(_all_)
  }
}
```

To make it work again, add a type definition via the alter endpoint. Let's assume the node in the previous example represents a person.
Then, the basic Person type could be defined as follows:

```text
type Person {
  name
  age
}
```

After that, the node is associated with the type by adding the following RDF triple to Dgraph (using a mutation):

```text
<0xff> <dgraph.type> "Person" .
```

Once this is done, the results of the query in both Dgraph v1.0 and Dgraph v1.1 should be the same.

#### `_predicate_`

The other consequence of removing `_predicate_` is that it can no longer be referenced explicitly in queries. In Dgraph 1.0, the following query returns the predicates of the node `0xff`.

```ql
{
  me(func: uid(0xff)) {
    _predicate_ # NOT available in Dgraph v1.1
  }
}
```

**There's no exact equivalent of this behavior in Dgraph 1.1**, but the information can be retrieved by first querying for the types associated with the node:

```text
{
  me(func: uid(0xff)) {
    dgraph.type
  }
}
```

and then retrieving the definition of each type in the results using a schema query:

```text
schema(type: Person) {}
```

### Live Loader and Bulk Loader command-line flags

#### File input flags

In Dgraph v1.1, both the Dgraph Live Loader and Dgraph Bulk Loader tools support loading data in either RDF format or JSON format. To simplify the command-line interface for these tools, the `-r`/`--rdfs` flag has been removed in favor of `-f`/`--files`. The new flag accepts file or directory paths for either data format. By default, the tools infer the file type from the file suffix, e.g., `.rdf` and `.rdf.gz` for RDF data, or `.json` and `.json.gz` for JSON data. To ignore the filenames and set the format explicitly, the `--format` flag can be set to `rdf` or `json`.
Before (in Dgraph v1.0):

```sh
dgraph live -r data.rdf.gz
```

Now (in Dgraph v1.1):

```sh
dgraph live -f data.rdf.gz
```

#### Dgraph Alpha address flag

For Dgraph Live Loader, the flag to specify the Dgraph Alpha address (default: `127.0.0.1:9080`) has changed from `-d`/`--dgraph` to `-a`/`--alpha`.

Before (in Dgraph v1.0):

```sh
dgraph live -d 127.0.0.1:9080
```

Now (in Dgraph v1.1):

```sh
dgraph live -a 127.0.0.1:9080
```

### HTTP API

For HTTP API users (e.g., curl, Postman), the custom Dgraph headers have been removed in favor of standard HTTP headers and query parameters.

#### Queries

There are two accepted `Content-Type` headers for queries over HTTP: `application/graphql+-` or `application/json`.

A `Content-Type` must be set to run a query.

Before (in Dgraph v1.0):

```sh
curl localhost:8080/query -d '{
  q(func: eq(name, "Dgraph")) {
    name
  }
}'
```

Now (in Dgraph v1.1):

```sh
curl -H 'Content-Type: application/graphql+-' localhost:8080/query -d '{
  q(func: eq(name, "Dgraph")) {
    name
  }
}'
```

For queries using [GraphQL Variables]({{< relref "query-language/index.md#graphql-variables" >}}), the query must be sent via the `application/json` content type, with the query and variables sent in a JSON payload:

Before (in Dgraph v1.0):

```sh
curl -H 'X-Dgraph-Vars: {"$name": "Alice"}' localhost:8080/query -d 'query qWithVars($name: string) {
  q(func: eq(name, $name)) {
    name
  }
}'
```

Now (in Dgraph v1.1):

```sh
curl -H 'Content-Type: application/json' localhost:8080/query -d '{
  "query": "query qWithVars($name: string) { q(func: eq(name, $name)) { name } }",
  "variables": {"$name": "Alice"}
}'
```

#### Mutations

There are two accepted `Content-Type` headers for mutations over HTTP: `Content-Type: application/rdf` or `Content-Type: application/json`.

A `Content-Type` must be set to run a mutation.

These `Content-Type` headers supersede the Dgraph v1.0.x custom header `X-Dgraph-MutationType`, which set the mutation type to RDF or JSON.

To commit the mutation immediately, use the query parameter `commitNow=true`. This replaces the custom header `X-Dgraph-CommitNow: true` from Dgraph v1.0.x.

Before (in Dgraph v1.0):

```sh
curl -H 'X-Dgraph-CommitNow: true' localhost:8080/mutate -d '{
  set {
    _:n <name> "Alice" .
    _:n <dgraph.type> "Person" .
  }
}'
```

Now (in Dgraph v1.1):

```sh
curl -H 'Content-Type: application/rdf' localhost:8080/mutate?commitNow=true -d '{
  set {
    _:n <name> "Alice" .
    _:n <dgraph.type> "Person" .
  }
}'
```

For JSON mutations, set the `Content-Type` header to `application/json`.

Before (in Dgraph v1.0):

```sh
curl -H 'X-Dgraph-MutationType: json' -H "X-Dgraph-CommitNow: true" localhost:8080/mutate -d '{
  "set": [
    {
      "name": "Alice"
    }
  ]
}'
```

Now (in Dgraph v1.1):

```sh
curl -H 'Content-Type: application/json' localhost:8080/mutate?commitNow=true -d '{
  "set": [
    {
      "name": "Alice"
    }
  ]
}'
```
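When scripting these endpoints, building the JSON payload by hand is easy to get wrong. As a minimal sketch (the name value is just an example, and a JSON tool such as `jq` is more robust when values may contain quotes), a mutation body can be assembled in shell before sending it:

```shell
# Assemble a JSON mutation body from a shell variable.
# The name value is a placeholder; no escaping is performed here.
name="Alice"
body=$(printf '{"set": [{"name": "%s"}]}' "$name")
echo "$body"
```

The body can then be sent with `curl -H 'Content-Type: application/json' localhost:8080/mutate?commitNow=true -d "$body"`.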