+++
date = "2017-03-20T19:35:35+11:00"
title = "How To Guides"
+++

## Retrieving Debug Information

Each Dgraph data node exposes profiling information over the `/debug/pprof` endpoint and metrics over the `/debug/vars` endpoint. Each Dgraph data node has its own profiling and metrics information. Below is a list of the debugging information exposed by Dgraph and the corresponding commands to retrieve it.

### Metrics Information

If you are collecting these metrics from outside the Dgraph instance, you need to pass the `--expose_trace=true` flag; otherwise, these metrics can only be collected by connecting to the instance over localhost.

```
curl http://<IP>:<HTTP_PORT>/debug/vars
```

Metrics can also be retrieved in the Prometheus format at `/debug/prometheus_metrics`. See the [Metrics]({{< relref "deploy/index.md#metrics" >}}) section for the full list of metrics.

### Profiling Information

Profiling information is available via the `go tool pprof` profiling tool built into Go. The ["Profiling Go programs"](https://blog.golang.org/profiling-go-programs) Go blog post will help you get started with pprof. Each Dgraph Zero and Dgraph Alpha exposes a debug endpoint at `/debug/pprof/<profile>` via its HTTP port.

```
go tool pprof http://<IP>:<HTTP_PORT>/debug/pprof/heap
#Fetching profile from ...
#Saved Profile in ...
```

The output of the command shows the location where the profile is stored.

In the interactive pprof shell, you can use commands like `top` to get a listing of the top functions in the profile, `web` to get a visual graph of the profile opened in a web browser, or `list` to display a code listing with profiling information overlaid.
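As a quick reference, the per-profile endpoints described below can be enumerated with a small shell loop; the host and port here are placeholders for your own node's address:

```shell
# Print the pprof endpoint URLs for a single node. Substitute your own
# host and HTTP port; each URL can be passed to `go tool pprof`.
base="http://localhost:8080/debug/pprof"   # placeholder address
for profile in heap profile block goroutine; do
  echo "$base/$profile"
done
```

For example, `go tool pprof "$base/heap"` would fetch the memory profile from that node.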
#### CPU Profile

```
go tool pprof http://<IP>:<HTTP_PORT>/debug/pprof/profile
```

#### Memory Profile

```
go tool pprof http://<IP>:<HTTP_PORT>/debug/pprof/heap
```

#### Block Profile

By default, Dgraph doesn't collect the block profile. To collect it, Dgraph must be started with `--profile_mode=block` and `--block_rate=<N>` with N > 1.

```
go tool pprof http://<IP>:<HTTP_PORT>/debug/pprof/block
```

#### Goroutine stack

The HTTP page `/debug/pprof/` is available at the HTTP port of a Dgraph Zero or Dgraph Alpha. From this page, a link to the "full goroutine stack dump" is available (e.g., on a Dgraph Alpha this page would be at `http://localhost:8080/debug/pprof/goroutine?debug=2`). Looking at the full goroutine stack can be useful to understand goroutine usage at that moment.

## Using the Debug Tool

{{% notice "note" %}}
To debug a running Dgraph cluster, first copy the postings ("p") directory to
another location. If the Dgraph cluster is not running, then you can use the
same postings directory with the debug tool.

If the "p" directory has been encrypted, then the debug tool will need to use the `--keyfile <path-to-keyfile>` option. This file must contain the same key that was used to encrypt the "p" directory.
{{% /notice %}}

The `dgraph debug` tool can be used to inspect Dgraph's posting list structure.
You can use the debug tool to inspect the data, schema, and indices of your
Dgraph cluster.

Some scenarios where the debug tool is useful:

- Verify that mutations committed to Dgraph have been persisted to disk.
- Verify that indices have been created.
- Inspect the history of a posting list.

### Example Usage

Debug the p directory.

```sh
$ dgraph debug --postings ./p
```

Debug the p directory without opening it in read-only mode. This is typically necessary when the database was not closed properly.
```sh
$ dgraph debug --postings ./p --readonly=false
```

Debug the p directory, only outputting the keys for the predicate `name`.

```sh
$ dgraph debug --postings ./p --readonly=false --pred=name
```

Debug the p directory, looking up a particular key:

```sh
$ dgraph debug --postings ./p --lookup 00000b6465736372697074696f6e020866617374
```

Debug the p directory, inspecting the history of a particular key:

```sh
$ dgraph debug --postings ./p --lookup 00000b6465736372697074696f6e020866617374 --history
```

Debug an encrypted p directory with the key in a local file at the path ./key_file:

```sh
$ dgraph debug --postings ./p --keyfile ./key_file
```

{{% notice "note" %}}
The key file contains the key used to decrypt/encrypt the db. This key should be kept secret. As a best practice:

- Do not store the key file on disk permanently. Back it up in a safe place and delete it after using it with the debug tool.

- If the above is not possible, make sure the correct privileges are set on the key file. Only the user who owns the dgraph process should be able to read or write it: `chmod 600`.
{{% /notice %}}

### Debug Tool Output

Let's go over an example with a Dgraph cluster with the following schema, which has a term index, a full-text index, and two separately committed mutations:

```sh
$ curl localhost:8080/alter -d '
name: string @index(term) .
url: string .
description: string @index(fulltext) .
'
```

```sh
$ curl -H "Content-Type: application/rdf" localhost:8080/mutate?commitNow=true -d '{
  set {
    _:dgraph <name> "Dgraph" .
    _:dgraph <dgraph.type> "Software" .
    _:dgraph <url> "https://github.com/dgraph-io/dgraph" .
    _:dgraph <description> "Fast, Transactional, Distributed Graph Database." .
  }
}'
```

```sh
$ curl -H "Content-Type: application/rdf" localhost:8080/mutate?commitNow=true -d '{
  set {
    _:badger <name> "Badger" .
    _:badger <dgraph.type> "Software" .
    _:badger <url> "https://github.com/dgraph-io/badger" .
    _:badger <description> "Embeddable, persistent and fast key-value (KV) database written in pure Go." .
  }
}'
```

After stopping Dgraph, you can run the debug tool to inspect the postings directory:

{{% notice "note" %}}
The debug output can be very large. Typically you would redirect the debug tool's output to a file first for easier analysis.
{{% /notice %}}

```sh
$ dgraph debug --postings ./p
```

```text
Opening DB: ./p
Min commit: 1. Max commit: 5, w.r.t 18446744073709551615
prefix =
{d} {v.ok} attr: url uid: 1 key: 00000375726c000000000000000001 item: [71, b0100] ts: 3
{d} {v.ok} attr: url uid: 2 key: 00000375726c000000000000000002 item: [71, b0100] ts: 5
{d} {v.ok} attr: name uid: 1 key: 0000046e616d65000000000000000001 item: [43, b0100] ts: 3
{d} {v.ok} attr: name uid: 2 key: 0000046e616d65000000000000000002 item: [43, b0100] ts: 5
{i} {v.ok} attr: name term: [1] badger key: 0000046e616d650201626164676572 item: [30, b0100] ts: 5
{i} {v.ok} attr: name term: [1] dgraph key: 0000046e616d650201646772617068 item: [30, b0100] ts: 3
{d} {v.ok} attr: _predicate_ uid: 1 key: 00000b5f7072656469636174655f000000000000000001 item: [104, b0100] ts: 3
{d} {v.ok} attr: _predicate_ uid: 2 key: 00000b5f7072656469636174655f000000000000000002 item: [104, b0100] ts: 5
{d} {v.ok} attr: description uid: 1 key: 00000b6465736372697074696f6e000000000000000001 item: [92, b0100] ts: 3
{d} {v.ok} attr: description uid: 2 key: 00000b6465736372697074696f6e000000000000000002 item: [119, b0100] ts: 5
{i} {v.ok} attr: description term: [8] databas key: 00000b6465736372697074696f6e020864617461626173 item: [38, b0100] ts: 5
{i} {v.ok} attr: description term: [8] distribut key: 00000b6465736372697074696f6e0208646973747269627574 item: [40, b0100] ts: 3
{i} {v.ok} attr: description term: [8] embedd key: 00000b6465736372697074696f6e0208656d62656464 item: [37, b0100] ts: 5
{i} {v.ok} attr: description term: [8] fast key: 00000b6465736372697074696f6e020866617374 item: [35, b0100] ts: 5
{i} {v.ok} attr: description term: [8] go key: 00000b6465736372697074696f6e0208676f item: [33, b0100] ts: 5
{i} {v.ok} attr: description term: [8] graph key: 00000b6465736372697074696f6e02086772617068 item: [36, b0100] ts: 3
{i} {v.ok} attr: description term: [8] kei key: 00000b6465736372697074696f6e02086b6569 item: [34, b0100] ts: 5
{i} {v.ok} attr: description term: [8] kv key: 00000b6465736372697074696f6e02086b76 item: [33, b0100] ts: 5
{i} {v.ok} attr: description term: [8] persist key: 00000b6465736372697074696f6e020870657273697374 item: [38, b0100] ts: 5
{i} {v.ok} attr: description term: [8] pure key: 00000b6465736372697074696f6e020870757265 item: [35, b0100] ts: 5
{i} {v.ok} attr: description term: [8] transact key: 00000b6465736372697074696f6e02087472616e73616374 item: [39, b0100] ts: 3
{i} {v.ok} attr: description term: [8] valu key: 00000b6465736372697074696f6e020876616c75 item: [35, b0100] ts: 5
{i} {v.ok} attr: description term: [8] written key: 00000b6465736372697074696f6e02087772697474656e item: [38, b0100] ts: 5
{s} {v.ok} attr: url key: 01000375726c item: [13, b0001] ts: 1
{s} {v.ok} attr: name key: 0100046e616d65 item: [23, b0001] ts: 1
{s} {v.ok} attr: _predicate_ key: 01000b5f7072656469636174655f item: [31, b0001] ts: 1
{s} {v.ok} attr: description key: 01000b6465736372697074696f6e item: [41, b0001] ts: 1
{s} {v.ok} attr: dgraph.type key: 01000b6467726170682e74797065 item: [40, b0001] ts: 1
Found 28 keys
```

Each line in the debug output contains a prefix indicating the type of the key: `{d}`: Data key; `{i}`: Index key; `{c}`: Count key; `{r}`: Reverse key; `{s}`: Schema key. In the debug output above, we see data keys, index keys, and schema keys.

Each index key has a corresponding index type. For example, in `attr: name term: [1] dgraph` the `[1]` shows that this is the term index ([0x1][tok_term]); in `attr: description term: [8] fast`, the `[8]` shows that this is the full-text index ([0x8][tok_fulltext]). These IDs match the index IDs in [tok.go][tok].

[tok_term]: https://github.com/dgraph-io/dgraph/blob/ce82aaafba3d9e57cf5ea1aeb9b637193441e1e2/tok/tok.go#L39
[tok_fulltext]: https://github.com/dgraph-io/dgraph/blob/ce82aaafba3d9e57cf5ea1aeb9b637193441e1e2/tok/tok.go#L48
[tok]: https://github.com/dgraph-io/dgraph/blob/ce82aaafba3d9e57cf5ea1aeb9b637193441e1e2/tok/tok.go#L37-L53

### Key Lookup

Every key can be inspected further with the `--lookup` flag for the specific key.

```sh
$ dgraph debug --postings ./p --lookup 00000b6465736372697074696f6e020866617374
```

```text
Opening DB: ./p
Min commit: 1. Max commit: 5, w.r.t 18446744073709551615
Key: 00000b6465736372697074696f6e0208676f Length: 2
Uid: 1 Op: 1
Uid: 2 Op: 1
```

For data keys, a lookup shows its type and value. Below, we see that the key for `attr: url uid: 1` has a string value.

```sh
$ dgraph debug --postings ./p --lookup 00000375726c000000000000000001
```

```text
Opening DB: ./p
Min commit: 1. Max commit: 5, w.r.t 18446744073709551615
Key: 0000046e616d65000000000000000001 Length: 1
Uid: 18446744073709551615 Op: 1 Type: STRING. String Value: "https://github.com/dgraph-io/dgraph"
```

For index keys, a lookup shows the UIDs that are part of this index. Below, we see that the `fast` index for the `<description>` predicate has UIDs 0x1 and 0x2.
```sh
$ dgraph debug --postings ./p --lookup 00000b6465736372697074696f6e020866617374
```

```text
Opening DB: ./p
Min commit: 1. Max commit: 5, w.r.t 18446744073709551615
Key: 00000b6465736372697074696f6e0208676f Length: 2
Uid: 1 Op: 1
Uid: 2 Op: 1
```

### Key History

You can also look up the history of values for a key using the `--history` option.

```sh
$ dgraph debug --postings ./p --lookup 00000b6465736372697074696f6e020866617374 --history
```
```text
Opening DB: ./p
Min commit: 1. Max commit: 5, w.r.t 18446744073709551615
==> key: 00000b6465736372697074696f6e020866617374. PK: &{byteType:2 Attr:description Uid:0 Termfast Count:0 bytePrefix:0}
ts: 5 {item}{delta}
Uid: 2 Op: 1

ts: 3 {item}{delta}
Uid: 1 Op: 1
```

Above, we see that UID 0x1 was committed to this index at ts 3, and UID 0x2 was committed to this index at ts 5.

The debug output also shows UserMeta information:

- `{complete}`: Complete posting list
- `{uid}`: UID posting list
- `{delta}`: Delta posting list
- `{empty}`: Empty posting list
- `{item}`: Item posting list
- `{deleted}`: Delete marker

## Using the Increment Tool

The `dgraph increment` tool increments a counter value transactionally. The
increment tool can be used as a health check that an Alpha is able to service
transactions for both queries and mutations.

### Example Usage

Increment the default predicate (`counter.val`) once. If the predicate doesn't yet
exist, then it will be created starting at counter 0.
```sh
$ dgraph increment
```

Increment the counter predicate against the Alpha running at the address given by `--alpha` (default: `localhost:9080`):

```sh
$ dgraph increment --alpha=192.168.1.10:9080
```

Increment the counter predicate specified by `--pred` (default: `counter.val`):

```sh
$ dgraph increment --pred=counter.val.healthcheck
```

Run a read-only query for the counter predicate without running a mutation to increment it:

```sh
$ dgraph increment --ro
```

Run a best-effort query for the counter predicate without running a mutation to increment it:

```sh
$ dgraph increment --be
```

Run the increment tool 1000 times, waiting 1 second between each increment:

```sh
$ dgraph increment --num=1000 --wait=1s
```

### Increment Tool Output

```sh
# Run increment a few times
$ dgraph increment
0410 10:31:16.379 Counter VAL: 1 [ Ts: 1 ]
$ dgraph increment
0410 10:34:53.017 Counter VAL: 2 [ Ts: 3 ]
$ dgraph increment
0410 10:34:53.648 Counter VAL: 3 [ Ts: 5 ]

# Run read-only queries to read the counter a few times
$ dgraph increment --ro
0410 10:34:57.35 Counter VAL: 3 [ Ts: 7 ]
$ dgraph increment --ro
0410 10:34:57.886 Counter VAL: 3 [ Ts: 7 ]
$ dgraph increment --ro
0410 10:34:58.129 Counter VAL: 3 [ Ts: 7 ]

# Run best-effort queries to read the counter a few times
$ dgraph increment --be
0410 10:34:59.867 Counter VAL: 3 [ Ts: 7 ]
$ dgraph increment --be
0410 10:35:01.322 Counter VAL: 3 [ Ts: 7 ]
$ dgraph increment --be
0410 10:35:02.674 Counter VAL: 3 [ Ts: 7 ]

# Run a read-only query to read the counter 5 times
$ dgraph increment --ro --num=5
0410 10:35:18.812 Counter VAL: 3 [ Ts: 7 ]
0410 10:35:18.813 Counter VAL: 3 [ Ts: 7 ]
0410 10:35:18.815 Counter VAL: 3 [ Ts: 7 ]
0410 10:35:18.817 Counter VAL: 3 [ Ts: 7 ]
0410 10:35:18.818 Counter VAL: 3 [ Ts: 7 ]

# Increment the counter 5 times
$ dgraph increment --num=5
0410 10:35:24.028 Counter VAL: 4 [ Ts: 8 ]
0410 10:35:24.061 Counter VAL: 5 [ Ts: 10 ]
0410 10:35:24.104 Counter VAL: 6 [ Ts: 12 ]
0410 10:35:24.145 Counter VAL: 7 [ Ts: 14 ]
0410 10:35:24.178 Counter VAL: 8 [ Ts: 16 ]

# Increment the counter 5 times, once every second.
$ dgraph increment --num=5 --wait=1s
0410 10:35:26.95 Counter VAL: 9 [ Ts: 18 ]
0410 10:35:27.975 Counter VAL: 10 [ Ts: 20 ]
0410 10:35:28.999 Counter VAL: 11 [ Ts: 22 ]
0410 10:35:30.028 Counter VAL: 12 [ Ts: 24 ]
0410 10:35:31.054 Counter VAL: 13 [ Ts: 26 ]

# If the Alpha is too busy or unhealthy, the tool will time out and retry.
$ dgraph increment
0410 10:36:50.857 While trying to process counter: Query error: rpc error: code = DeadlineExceeded desc = context deadline exceeded. Retrying...
```

## Giving Nodes a Type

It's often useful to give the nodes in a graph *types* (also commonly referred
to as *labels* or *kinds*). You can do so using the [type system]({{< relref "query-language/index.md#type-system" >}}).

## Loading CSV Data

[Dgraph mutations]({{< relref "mutations/index.md" >}}) are accepted in RDF
N-Quad and JSON formats. To load CSV-formatted data into Dgraph, first convert
the dataset into one of the accepted formats and then load the resulting dataset
into Dgraph. This section demonstrates converting CSV into JSON. There are
many tools available to convert CSV to JSON. For example, you can use
[`d3-dsv`](https://github.com/d3/d3-dsv)'s `csv2json` tool as shown below:

```csv
Name,URL
Dgraph,https://github.com/dgraph-io/dgraph
Badger,https://github.com/dgraph-io/badger
```

```sh
$ csv2json names.csv --out names.json
$ cat names.json | jq '.'
[
  {
    "Name": "Dgraph",
    "URL": "https://github.com/dgraph-io/dgraph"
  },
  {
    "Name": "Badger",
    "URL": "https://github.com/dgraph-io/badger"
  }
]
```

This JSON can be loaded into Dgraph via the programmatic clients. This follows
the [JSON Mutation Format]({{< relref "mutations#json-mutation-format" >}}).
Note that each JSON object in the list above will be assigned a unique UID since
the `uid` field is omitted.

[The Ratel UI (and HTTP clients) expect JSON data to be stored within the `"set"`
key]({{< relref "mutations/index.md#json-syntax-using-raw-http-or-ratel-ui" >}}).
You can use `jq` to transform the JSON into the correct format:

```sh
$ cat names.json | jq '{ set: . }'
```
```json
{
  "set": [
    {
      "Name": "Dgraph",
      "URL": "https://github.com/dgraph-io/dgraph"
    },
    {
      "Name": "Badger",
      "URL": "https://github.com/dgraph-io/badger"
    }
  ]
}
```

Let's say you have CSV data in a file named connects.csv that connects nodes
together. Here, the `connects` field should be of the `uid` type.

```csv
uid,connects
_:a,_:b
_:a,_:c
_:c,_:d
_:d,_:a
```

{{% notice "note" %}}
To reuse existing integer IDs from a CSV file as UIDs in Dgraph, use Dgraph Zero's [assign endpoint]({{< relref "deploy/index.md#more-about-dgraph-zero" >}}) before data loading to allocate a range of UIDs that can be safely assigned.
{{% /notice %}}

To get the correct JSON format, you can convert the CSV into JSON and use `jq`
to transform it into the correct format, where the `connects` edge is a node `uid`:

```sh
$ csv2json connects.csv | jq '[ .[] | { uid: .uid, connects: { uid: .connects } } ]'
```

```json
[
  {
    "uid": "_:a",
    "connects": {
      "uid": "_:b"
    }
  },
  {
    "uid": "_:a",
    "connects": {
      "uid": "_:c"
    }
  },
  {
    "uid": "_:c",
    "connects": {
      "uid": "_:d"
    }
  },
  {
    "uid": "_:d",
    "connects": {
      "uid": "_:a"
    }
  }
]
```

You can modify the `jq` transformation to output the mutation format accepted by
Ratel UI and HTTP clients:

```sh
$ csv2json connects.csv | jq '{ set: [ .[] | {uid: .uid, connects: { uid: .connects } } ] }'
```
```json
{
  "set": [
    {
      "uid": "_:a",
      "connects": {
        "uid": "_:b"
      }
    },
    {
      "uid": "_:a",
      "connects": {
        "uid": "_:c"
      }
    },
    {
      "uid": "_:c",
      "connects": {
        "uid": "_:d"
      }
    },
    {
      "uid": "_:d",
      "connects": {
        "uid": "_:a"
      }
    }
  ]
}
```

## A Simple Login System

{{% notice "note" %}}
This example is based on part of the [transactions in
v0.9](https://blog.dgraph.io/post/v0.9/) blogpost. Error checking has been
omitted for brevity.
{{% /notice %}}

The schema is assumed to be:
```
// @upsert directive is important to detect conflicts.
email: string @index(exact) @upsert . # @index(hash) would also work
pass: password .
```

```
// Create a new transaction. The deferred call to Discard
// ensures that server-side resources are cleaned up.
txn := client.NewTxn()
defer txn.Discard(ctx)

// Create and execute a query that looks up an email and checks if the password
// matches.
q := fmt.Sprintf(`
    {
        login_attempt(func: eq(email, %q)) {
            checkpwd(pass, %q)
        }
    }
`, email, pass)
resp, err := txn.Query(ctx, q)

// Unmarshal the response into a struct. It will be empty if the email couldn't
// be found. Otherwise it will contain a bool to indicate if the password matched.
var login struct {
    Account []struct {
        Pass []struct {
            CheckPwd bool `json:"checkpwd"`
        } `json:"pass"`
    } `json:"login_attempt"`
}
err = json.Unmarshal(resp.GetJson(), &login)

// Now perform the upsert logic.
if len(login.Account) == 0 {
    fmt.Println("Account doesn't exist! Creating new account.")
    mu := &protos.Mutation{
        SetJson: []byte(fmt.Sprintf(`{ "email": %q, "pass": %q }`, email, pass)),
    }
    _, err = txn.Mutate(ctx, mu)
    // Commit the mutation, making it visible outside of the transaction.
    err = txn.Commit(ctx)
} else if login.Account[0].Pass[0].CheckPwd {
    fmt.Println("Login successful!")
} else {
    fmt.Println("Wrong email or password.")
}
```

## Upserts

Upsert-style operations are operations where:

1. A node is searched for, and then
2. Depending on whether it is found, either:
   - some of its attributes are updated, or
   - a new node is created with those attributes.

The upsert has to be an atomic operation such that either a new node is
created, or an existing node is modified. It's not allowed that two concurrent
upserts both create a new node.

There are many examples where upserts are useful. Most examples involve the
creation of a one-to-one mapping between two different entities, e.g., associating
email addresses with user accounts.

Upserts are common in both traditional RDBMSs and newer NoSQL databases.
Dgraph is no exception.

### Upsert Procedure

In Dgraph, upsert-style behaviour can be implemented by users on top of
transactions.
The steps are as follows:

1. Create a new transaction.

2. Query for the node. This will usually be as simple as `{ q(func: eq(email,
"bob@example.com")) { uid }}`. If a `uid` result is returned, then that's the
`uid` for the existing node. If no results are returned, then the user account
doesn't exist.

3. In the case where the user account doesn't exist, a new node has to be
created. This is done in the usual way by making a mutation (inside the
transaction), e.g. the RDF `_:newAccount <email> "bob@example.com" .`. The
`uid` assigned can be accessed by looking up the blank node name `newAccount`
in the `Assigned` object returned from the mutation.

4. Now that you have the `uid` of the account (either new or existing), you can
modify the account (using additional mutations) or perform queries on it in
whichever way you wish.

### Upsert Block

You can also use the `Upsert Block` to achieve the upsert procedure in a single
mutation. The request contains both the query and the mutation, as explained
[here]({{< relref "mutations/index.md#upsert-block" >}}).

### Conflicts

Upsert operations are intended to be run concurrently, as per the needs of the
application. As such, it's possible that two concurrently running operations
could try to add the same node at the same time. For example, both try to add a
user with the same email address. If they do, then one of the transactions will
fail with an error indicating that the transaction was aborted.

If this happens, the transaction is rolled back and it's up to the user's
application logic to retry the whole operation. The transaction has to be
retried in its entirety, all the way from creating a new transaction.

The choice of index placed on the predicate is important for performance.
**Hash is almost always the best choice of index for equality checking.**

{{% notice "note" %}}
It's the _index_ that typically causes upsert conflicts to occur. The index is
stored as many key/value pairs, where each key is a combination of the
predicate name and some function of the predicate value (e.g. its hash for the
hash index). If two transactions modify the same key concurrently, then one
will fail.
{{% /notice %}}

## Run Jepsen tests

1. Clone the jepsen repo at [https://github.com/jepsen-io/jepsen](https://github.com/jepsen-io/jepsen).

   ```sh
   git clone git@github.com:jepsen-io/jepsen.git
   ```

2. Run the following command to set up the instances from the repo.

   ```sh
   cd docker && ./up.sh
   ```

   This should start five Jepsen nodes in Docker containers.

3. Now ssh into the `jepsen-control` container and run the tests.

{{% notice "note" %}}
You can use the [transfer](https://github.com/dgraph-io/dgraph/blob/master/contrib/nightly/transfer.sh) script to build the Dgraph binary and upload the tarball to https://transfer.sh, which gives you a URL that can then be used in the Jepsen tests (via the `--package-url` flag).
{{% /notice %}}

```sh
docker exec -it jepsen-control bash
```

```sh
root@control:/jepsen# cd dgraph
root@control:/jepsen/dgraph# lein run test -w upsert

# Specify a --package-url

root@control:/jepsen/dgraph# lein run test --force-download --package-url https://github.com/dgraph-io/dgraph/releases/download/nightly/dgraph-linux-amd64.tar.gz -w upsert
```

## Migrate to Dgraph v1.1

### Schema types: scalar `uid` and list `[uid]`

The semantics of predicates of type `uid` have changed in Dgraph 1.1. Whereas before all `uid` predicates implied a one-to-many relationship, now either a one-to-one relationship or a one-to-many relationship can be expressed.
```
friend: [uid] .
best_friend: uid .
```

In the above, the predicate `friend` allows a one-to-many relationship (i.e., a person can have more than one friend) while the predicate `best_friend` can be at most a one-to-one relationship.

This syntactic meaning is consistent with the other types, e.g., `string` indicating a single-value string and `[string]` representing many strings. This change makes the `uid` type work similarly to the other types.

To migrate existing schemas from Dgraph v1.0 to Dgraph v1.1, update the schema file from an export so all predicates of type `uid` are changed to `[uid]`. Then use the updated schema when loading data into Dgraph v1.1. For example, the following schema:

```text
name: string .
friend: uid .
```

becomes

```text
name: string .
friend: [uid] .
```

### Type system

The new [type system]({{< relref "query-language/index.md#type-system" >}}) introduced in Dgraph 1.1 should not affect migrating data from a previous version. However, a couple of features in the query language will not work as they did before: `expand()` and `_predicate_`.

The reason is that the internal predicate that associated each node with its predicates (called `_predicate_`) has been removed. Instead, the type system is used to get the predicates that belong to a node.

#### `expand()`

Expand queries will not work until the type system has been properly set up. For example, the following query will return an empty result in Dgraph 1.1 if the node `0xff` has no type information.

```text
{
  me(func: uid(0xff)) {
    expand(_all_)
  }
}
```

To make it work again, add a type definition via the alter endpoint. Let's assume the node in the previous example represents a person.
Then, the basic Person type could be defined as follows:

```text
type Person {
  name
  age
}
```

After that, the node is associated with the type by adding the following RDF triple to Dgraph (using a mutation):

```text
<0xff> <dgraph.type> "Person" .
```

Once this is done, the results of the query in both Dgraph v1.0 and Dgraph v1.1 should be the same.

#### `_predicate_`

The other consequence of removing `_predicate_` is that it can no longer be referenced explicitly in queries. In Dgraph 1.0, the following query returns the predicates of the node `0xff`.

```ql
{
  me(func: uid(0xff)) {
    _predicate_ # NOT available in Dgraph v1.1
  }
}
```

**There's no exact equivalent of this behavior in Dgraph 1.1**, but the information can be retrieved by first querying for the types associated with the node:

```text
{
  me(func: uid(0xff)) {
    dgraph.type
  }
}
```

and then retrieving the definition of each type in the results using a schema query:

```text
schema(type: Person) {}
```

### Live Loader and Bulk Loader command-line flags

#### File input flags

In Dgraph v1.1, both the Dgraph Live Loader and Dgraph Bulk Loader tools support loading data in either RDF format or JSON format. To simplify the command-line interface for these tools, the `-r`/`--rdfs` flag has been removed in favor of `-f`/`--files`. The new flag accepts file or directory paths for either data format. By default, the tools infer the file type from the file suffix, e.g., `.rdf` and `.rdf.gz` for RDF data, or `.json` and `.json.gz` for JSON data. To ignore the filenames and set the format explicitly, the `--format` flag can be set to `rdf` or `json`.
Before (in Dgraph v1.0):

```sh
dgraph live -r data.rdf.gz
```

Now (in Dgraph v1.1):

```sh
dgraph live -f data.rdf.gz
```

#### Dgraph Alpha address flag

For Dgraph Live Loader, the flag to specify the Dgraph Alpha address (default: `127.0.0.1:9080`) has changed from `-d`/`--dgraph` to `-a`/`--alpha`.

Before (in Dgraph v1.0):

```sh
dgraph live -d 127.0.0.1:9080
```

Now (in Dgraph v1.1):

```sh
dgraph live -a 127.0.0.1:9080
```

### HTTP API

For HTTP API users (e.g., curl, Postman), the custom Dgraph headers have been removed in favor of standard HTTP headers and query parameters.

#### Queries

There are two accepted `Content-Type` headers for queries over HTTP: `application/graphql+-` or `application/json`.

A `Content-Type` must be set to run a query.

Before (in Dgraph v1.0):

```sh
curl localhost:8080/query -d '{
  q(func: eq(name, "Dgraph")) {
    name
  }
}'
```

Now (in Dgraph v1.1):

```sh
curl -H 'Content-Type: application/graphql+-' localhost:8080/query -d '{
  q(func: eq(name, "Dgraph")) {
    name
  }
}'
```

For queries using [GraphQL Variables]({{< relref "query-language/index.md#graphql-variables" >}}), the query must be sent via the `application/json` content type, with the query and variables sent in a JSON payload:

Before (in Dgraph v1.0):

```sh
curl -H 'X-Dgraph-Vars: {"$name": "Alice"}' localhost:8080/query -d 'query qWithVars($name: string) {
  q(func: eq(name, $name)) {
    name
  }
}'
```

Now (in Dgraph v1.1):

```sh
curl -H 'Content-Type: application/json' localhost:8080/query -d '{
  "query": "query qWithVars($name: string) { q(func: eq(name, $name)) { name } }",
  "variables": {"$name": "Alice"}
}'
```

#### Mutations

There are two accepted `Content-Type` headers for mutations over HTTP: `Content-Type: application/rdf` or `Content-Type: application/json`.

A `Content-Type` must be set to run a mutation.

These `Content-Type` headers supersede the Dgraph v1.0.x custom header `X-Dgraph-MutationType`, which set the mutation type to RDF or JSON.

To commit the mutation immediately, use the query parameter `commitNow=true`. This replaces the custom header `X-Dgraph-CommitNow: true` from Dgraph v1.0.x.

Before (in Dgraph v1.0):

```sh
curl -H 'X-Dgraph-CommitNow: true' localhost:8080/mutate -d '{
  set {
    _:n <name> "Alice" .
    _:n <dgraph.type> "Person" .
  }
}'
```

Now (in Dgraph v1.1):

```sh
curl -H 'Content-Type: application/rdf' localhost:8080/mutate?commitNow=true -d '{
  set {
    _:n <name> "Alice" .
    _:n <dgraph.type> "Person" .
  }
}'
```

For JSON mutations, set the `Content-Type` header to `application/json`.

Before (in Dgraph v1.0):

```sh
curl -H 'X-Dgraph-MutationType: json' -H "X-Dgraph-CommitNow: true" localhost:8080/mutate -d '{
  "set": [
    {
      "name": "Alice"
    }
  ]
}'
```

Now (in Dgraph v1.1):

```sh
curl -H 'Content-Type: application/json' localhost:8080/mutate?commitNow=true -d '{
  "set": [
    {
      "name": "Alice"
    }
  ]
}'
```
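When scripting these endpoints, building the JSON payload by hand is easy to get wrong. As a minimal sketch (the name value is just an example, and a JSON tool such as `jq` is more robust when values may contain quotes), a mutation body can be assembled in shell before sending it:

```shell
# Assemble a JSON mutation body from a shell variable.
# The name value is a placeholder; no escaping is performed here.
name="Alice"
body=$(printf '{"set": [{"name": "%s"}]}' "$name")
echo "$body"
```

The body can then be sent with `curl -H 'Content-Type: application/json' localhost:8080/mutate?commitNow=true -d "$body"`.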