+++
title = "Query Language"
+++

Dgraph's GraphQL+- is based on Facebook's [GraphQL](https://facebook.github.io/graphql/).  GraphQL wasn't developed for graph databases, but its graph-like query syntax, schema validation and subgraph-shaped response make it a great language choice.  We've modified the language to better support graph operations, adding and removing features to get the best fit for graph databases.  We're calling this simplified, feature-rich language _GraphQL+-_.

GraphQL+- is a work in progress. We're adding more features and we might further simplify existing ones.

## Take a Tour - https://tour.dgraph.io

This document is the Dgraph query reference material.  It is not a tutorial.  It's designed as a reference for users who already know how to write queries in GraphQL+- but need to check syntax, or indices, or functions, etc.

{{% notice "note" %}}If you are new to Dgraph and want to learn how to use Dgraph and GraphQL+-, take the tour - https://tour.dgraph.io{{% /notice %}}


### Running examples

The examples in this reference use a database of 21 million triples about movies and actors.  The example queries run and return results.  The queries are executed by an instance of Dgraph running at https://play.dgraph.io/.  To run the queries locally or experiment a bit more, see the [Getting Started]({{< relref "get-started/index.md" >}}) guide, which also shows how to load the datasets used in the examples here.

## GraphQL+- Fundamentals

A GraphQL+- query finds nodes based on search criteria, matches patterns in a graph and returns a graph as a result.

A query is composed of nested blocks, starting with a query root.  The root finds the initial set of nodes against which the following graph matching and filtering is applied.

{{% notice "note" %}}See more about Queries in [Queries design concept]({{< relref "design-concepts/index.md#queries" >}}) {{% /notice %}}

### Returning Values

Each query has a name, specified at the query root, and the same name identifies the results.

If an edge is of a value type, the value can be returned by giving the edge name.

Query Example: In the example dataset, as well as edges that link movies to directors and actors, movies have a name, release date and identifiers for a number of well-known movie databases.  This query, with name `bladerunner`, and root matching a movie name, returns those values for the early 80's sci-fi classic "Blade Runner".

{{< runnable >}}
{
  bladerunner(func: eq(name@en, "Blade Runner")) {
    uid
    name@en
    initial_release_date
    netflix_id
  }
}
{{< /runnable >}}

The query first searches the graph, using indexes to make the search efficient, for all nodes with a `name` edge equaling "Blade Runner".  For the found node the query then returns the listed outgoing edges.

Every node has a unique 64-bit identifier.  The `uid` edge in the query above returns that identifier.  If the required node is already known, then the function `uid` finds the node.

Query Example: "Blade Runner" movie data found by UID.

{{< runnable >}}
{
  bladerunner(func: uid(0x2066e)) {
    uid
    name@en
    initial_release_date
    netflix_id
  }
}
{{< /runnable >}}

A query can match many nodes and return the values for each.

Query Example: All nodes that have either "Blade" or "Runner" in the name.

{{< runnable >}}
{
  bladerunner(func: anyofterms(name@en, "Blade Runner")) {
    uid
    name@en
    initial_release_date
    netflix_id
  }
}
{{< /runnable >}}

Multiple IDs can be specified in a list to the `uid` function.

Query Example:
{{< runnable >}}
{
  movies(func: uid(0x25280, 0x707f9)) {
    uid
    name@en
    initial_release_date
    netflix_id
  }
}
{{< /runnable >}}


{{% notice "note" %}} If your predicate has special characters, then you should wrap it with angle
brackets while asking for it in the query. E.g. `<first:name>`{{% /notice %}}

### Expanding Graph Edges

A query expands edges from node to node by nesting query blocks with `{ }`.

Query Example: The actors and characters played in "Blade Runner".  The query first finds the node with name "Blade Runner", then follows outgoing `starring` edges to nodes representing an actor's performance as a character.  From there the `performance.actor` and `performance.character` edges are expanded to find the actor names and roles for every actor in the movie.
{{< runnable >}}
{
  brCharacters(func: eq(name@en, "Blade Runner")) {
    name@en
    initial_release_date
    starring {
      performance.actor {
        name@en  # actor name
      }
      performance.character {
        name@en  # character name
      }
    }
  }
}
{{< /runnable >}}


### Comments

Anything on a line following a `#` is a comment.

### Applying Filters

The query root finds an initial set of nodes and the query proceeds by returning values and following edges to further nodes - any node reached in the query is found by traversal after the search at root.  The nodes found can be filtered by applying `@filter`, either after the root or at any edge.

Query Example: "Blade Runner" director Ridley Scott's movies released before the year 2000.
{{< runnable >}}
{
  scott(func: eq(name@en, "Ridley Scott")) {
    name@en
    initial_release_date
    director.film @filter(le(initial_release_date, "2000")) {
      name@en
      initial_release_date
    }
  }
}
{{< /runnable >}}

Query Example: Movies with either "Blade" or "Runner" in the title and released before the year 2000.

{{< runnable >}}
{
  bladerunner(func: anyofterms(name@en, "Blade Runner")) @filter(le(initial_release_date, "2000")) {
    uid
    name@en
    initial_release_date
    netflix_id
  }
}
{{< /runnable >}}

### Language Support

{{% notice "note" %}}A `@lang` directive must be specified in the schema to query or mutate
predicates with language tags.{{% /notice %}}

Dgraph supports UTF-8 strings.

In a query, for a string valued edge `edge`, the syntax
```
edge@lang1:...:langN
```
specifies the preference order for returned languages, with the following rules.

* At most one result will be returned (except in the case where the language list is set to `*`).
* The preference list is considered left to right: if a value in the given language is not found, the next language from the list is considered.
* If there are no values in any of the specified languages, no value is returned.
* A final `.` means that a value without a specified language is returned or, if there is no value without language, a value in _some_ language is returned.
* Setting the language list value to `*` will return all the values for that predicate along with their language. Values without a language tag are also returned.

For example:

- `name`   => Look for an untagged string; return nothing if no untagged value exists.
- `name@.` => Look for an untagged string, then any language.
- `name@en` => Look for `en` tagged string; return nothing if no `en` tagged string exists.
- `name@en:.` => Look for `en`, then untagged, then any language.
- `name@en:pl` => Look for `en`, then `pl`, otherwise nothing.
- `name@en:pl:.` => Look for `en`, then `pl`, then untagged, then any language.
- `name@*` => Look for all the values of this predicate and return them along with their language. For example, if there are two values with languages en and hi, this query will return two keys named "name@en" and "name@hi".
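
The preference rules above can be modelled in a few lines of Python. This is only a sketch of the documented behaviour, not Dgraph's implementation: the function name `resolve` and the dictionary layout (language tag to value, with `""` for the untagged value) are hypothetical, and because the rules do not say which language a trailing `.` falls back to, the sketch assumes the first tag in sorted order.

```python
def resolve(values, spec):
    """Pick the value a language preference list would return.

    values: dict mapping language tag -> string ("" is the untagged value).
    spec:   the text after '@', e.g. "en:pl:."; None means no '@' at all.
    """
    if spec is None:
        return values.get("")          # plain `name`: untagged value only
    if spec == "*":
        return dict(values)            # every value, keyed by its language
    for lang in spec.split(":"):       # preference list, left to right
        if lang == ".":
            if "" in values:
                return values[""]      # untagged value first
            # Assumption: "some language" is the first tag in sorted order.
            return values[min(values)] if values else None
        if lang in values:
            return values[lang]
    return None                        # nothing in the requested languages


names = {"en": "Blade Runner", "hi": "ब्लेड रनर"}
print(resolve(names, "pl:en"))   # falls through `pl` to `en`
print(resolve(names, "pl"))      # no Polish value, so None
```

Returning `None` stands in for "no value is returned"; in a real response the key is simply absent.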


{{% notice "note" %}}In functions, language lists (including the `@*` notation) are not allowed. Untagged predicates, single language tags, and `.` notation work as described above.

---

In [full-text search functions]({{< relref "#full-text-search" >}}) (`alloftext`, `anyoftext`), when no language is specified (untagged or `@.`), the default (English) full-text tokenizer is used.{{% /notice %}}


Query Example: Some of Bollywood director and actor Farhan Akhtar's movies have a name stored in Russian as well as Hindi and English, others do not.

{{< runnable >}}
{
  q(func: allofterms(name@en, "Farhan Akhtar")) {
    name@hi
    name@en

    director.film {
      name@ru:hi:en
      name@en
      name@hi
      name@ru
    }
  }
}
{{< /runnable >}}


## Functions

{{% notice "note" %}}Functions can only be applied to [indexed]({{< relref "#indexing">}}) predicates.{{% /notice %}}

Functions allow filtering based on properties of nodes or variables.  Functions can be applied in the query root or in filters.

For functions on string valued predicates, if no language preference is given, the function is applied to all languages and strings without a language tag; if a language preference is given, the function is applied only to strings of the given language.


### Term matching


#### allofterms

Syntax Example: `allofterms(predicate, "space-separated term list")`

Schema Types: `string`

Index Required: `term`


Matches strings that have all specified terms in any order; case insensitive.

##### Usage at root

Query Example: All nodes that have `name` containing terms `indiana` and `jones`, returning the English name and genre in English.

{{< runnable >}}
{
  me(func: allofterms(name@en, "jones indiana")) {
    name@en
    genre {
      name@en
    }
  }
}
{{< /runnable >}}

##### Usage as Filter

Query Example: All Steven Spielberg films that contain the words `indiana` and `jones`.  The `@filter(has(director.film))` removes nodes with name Steven Spielberg that aren't the director --- the data also contains a character in a film called Steven Spielberg.

{{< runnable >}}
{
  me(func: eq(name@en, "Steven Spielberg")) @filter(has(director.film)) {
    name@en
    director.film @filter(allofterms(name@en, "jones indiana"))  {
      name@en
    }
  }
}
{{< /runnable >}}


#### anyofterms


Syntax Example: `anyofterms(predicate, "space-separated term list")`

Schema Types: `string`

Index Required: `term`


Matches strings that have any of the specified terms in any order; case insensitive.
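
The difference between `allofterms` and `anyofterms` comes down to a subset test versus an intersection test on sets of terms. The Python sketch below illustrates that idea only; it assumes a term is a run of word characters compared in lower case, which is a simplification and not necessarily how Dgraph's `term` tokenizer splits text.

```python
import re


def terms(text):
    # Assumption: a term is a run of word characters, compared in lower case.
    return set(re.findall(r"\w+", text.lower()))


def allofterms(value, term_list):
    # Every listed term must occur in the value, in any order.
    return terms(term_list) <= terms(value)


def anyofterms(value, term_list):
    # At least one listed term must occur in the value.
    return bool(terms(term_list) & terms(value))


print(allofterms("Indiana Jones and the Temple of Doom", "jones indiana"))
print(anyofterms("Joan Peacock", "poison peacock"))
```

Because whole terms are compared, `ryan` in the term list would not match a value containing only `Bryan`.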

##### Usage at root

Query Example: All nodes that have a `name` containing either `poison` or `peacock`.  Many of the returned nodes are movies, but people like Joan Peacock also meet the search terms because without a [cascade directive]({{< relref "#cascade-directive">}}) the query doesn't require a genre.

{{< runnable >}}
{
  me(func:anyofterms(name@en, "poison peacock")) {
    name@en
    genre {
      name@en
    }
  }
}
{{< /runnable >}}


##### Usage as filter

Query Example: All Steven Spielberg movies that contain `war` or `spies`.  The `@filter(has(director.film))` removes nodes with name Steven Spielberg that aren't the director --- the data also contains a character in a film called Steven Spielberg.

{{< runnable >}}
{
  me(func: eq(name@en, "Steven Spielberg")) @filter(has(director.film)) {
    name@en
    director.film @filter(anyofterms(name@en, "war spies"))  {
      name@en
    }
  }
}
{{< /runnable >}}


### Regular Expressions


Syntax Examples: `regexp(predicate, /regular-expression/)` or case insensitive `regexp(predicate, /regular-expression/i)`

Schema Types: `string`

Index Required: `trigram`


Matches strings by regular expression.  The regular expression language is that of [go regular expressions](https://golang.org/pkg/regexp/syntax/).

Query Example: At root, match nodes with `Steven Sp` at the start of `name`, followed by any characters.  For each such matched uid, match the films containing `ryan`.  Note the difference from `allofterms`, which would match only the whole term `ryan`, whereas regular expression search also matches within terms, such as `bryan`.

{{< runnable >}}
{
  directors(func: regexp(name@en, /^Steven Sp.*$/)) {
    name@en
    director.film @filter(regexp(name@en, /ryan/i)) {
      name@en
    }
  }
}
{{< /runnable >}}


#### Technical details

A trigram is a substring of three consecutive runes. For example, `Dgraph` has trigrams `Dgr`, `gra`, `rap`, `aph`.

To ensure efficiency of regular expression matching, Dgraph uses [trigram indexing](https://swtch.com/~rsc/regexp/regexp4.html).  That is, Dgraph converts the regular expression to a trigram query, uses the trigram index and trigram query to find possible matches and applies the full regular expression search only to those candidates.
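
The two-phase idea (use the index to find candidates, then run the full expression on those alone) can be sketched in Python. This is a toy model under stated assumptions, not Dgraph's code: `build_index` and `search` are hypothetical names, and instead of deriving a trigram query from the regular expression, the caller passes a `literal` that every match is known to contain.

```python
import re
from collections import defaultdict


def trigrams(s):
    # All substrings of three consecutive characters (runes).
    return [s[i:i + 3] for i in range(len(s) - 2)]


def build_index(strings):
    # Map each trigram to the positions of the strings that contain it.
    index = defaultdict(set)
    for pos, s in enumerate(strings):
        for t in trigrams(s):
            index[t].add(pos)
    return index


def search(strings, index, literal, pattern):
    # `literal` is a hypothetical stand-in for the trigram query.
    required = trigrams(literal)
    if not required:
        raise ValueError("need at least one trigram (three or more runes)")
    # Phase 1: only strings containing every required trigram are candidates.
    candidates = set.intersection(*(index.get(t, set()) for t in required))
    # Phase 2: the full regular expression runs on the candidates only.
    return [strings[i] for i in sorted(candidates) if re.search(pattern, strings[i])]


print(trigrams("Dgraph"))
names = ["Saving Private Ryan", "Bryan's Song", "Blade Runner"]
print(search(names, build_index(names), "yan", r"[Rr]yan"))
```

The more trigrams the expression forces, the smaller the candidate set that reaches the second phase.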

#### Writing Efficient Regular Expressions and Limitations

Keep the following in mind when designing regular expression queries.

- At least one trigram must be matched by the regular expression (patterns shorter than 3 runes are not supported).  That is, Dgraph requires regular expressions that can be converted to a trigram query.
- The number of alternative trigrams matched by the regular expression should be as small as possible  (`[a-zA-Z][a-zA-Z][0-9]` is not a good idea).  Many possible matches means the full regular expression is checked against many strings; whereas, if the expression forces more trigrams to match, Dgraph can make better use of the index and check the full regular expression against a smaller set of possible matches.
- Thus, the regular expression should be as precise as possible.  Matching longer strings means more required trigrams, which helps to effectively use the index.
- If repeat specifications (`*`, `+`, `?`, `{n,m}`) are used, the entire regular expression must not match the _empty_ string or _any_ string: for example, `*` may be used like `[Aa]bcd*` but not like `(abcd)*` or `(abcd)|((defg)*)`
- Repeat specifications after bracket expressions (e.g. `[fgh]{7}`, `[0-9]+` or `[a-z]{3,5}`) are often considered as matching any string because they match too many trigrams.
- If the partial result (for subset of trigrams) exceeds 1000000 uids during index scan, the query is stopped to prohibit expensive queries.
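
One of the rules above, that the whole expression must not match the empty string, is easy to check before sending a query. The helper below is a hypothetical pre-flight check in Python. It assumes the pattern is also valid in Python's `re` dialect (which differs in places from Go's syntax), and it does not cover the other limitations, such as patterns that match too many trigrams.

```python
import re


def matches_empty(pattern):
    # True when the whole pattern can match the empty string.
    return re.fullmatch(pattern, "") is not None


for pattern in (r"[Aa]bcd*", r"(abcd)*", r"(abcd)|((defg)*)"):
    verdict = "rejected" if matches_empty(pattern) else "ok"
    print(pattern, verdict)
```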


### Fuzzy matching


Syntax: `match(predicate, string, distance)`

Schema Types: `string`

Index Required: `trigram`

Matches predicate values by calculating the [Levenshtein distance](https://en.wikipedia.org/wiki/Levenshtein_distance) to the string,
also known as _fuzzy matching_. The distance parameter must be greater than zero (0). Using a greater distance value can yield more but less accurate results.
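
For readers unfamiliar with the measure, the Levenshtein distance counts the single-character insertions, deletions and substitutions needed to turn one string into another. The Python sketch below uses the standard dynamic-programming formulation. The `fuzzy_match` wrapper is hypothetical and assumes a value matches when its distance is at most the `distance` argument; it says nothing about how Dgraph uses the trigram index to narrow candidates.

```python
def levenshtein(a, b):
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def fuzzy_match(value, target, distance):
    # Assumption: "within distance" means less than or equal to it.
    return levenshtein(value, target) <= distance


print(levenshtein("Stephen", "Steven"))   # 2: substitute p with v, delete h
print(fuzzy_match("Steven", "Stephen", 3))
```

A larger `distance` admits more edits, which is why it returns more, but less similar, values.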

Query Example: At root, fuzzy match nodes similar to `Stephen`, with a distance value of 8.

{{< runnable >}}
{
  directors(func: match(name@en, Stephen, 8)) {
    name@en
  }
}
{{< /runnable >}}

Same query with a Levenshtein distance of 3.

{{< runnable >}}
{
  directors(func: match(name@en, Stephen, 3)) {
    name@en
  }
}
{{< /runnable >}}


### Full-Text Search

Syntax Examples: `alloftext(predicate, "space-separated text")` and `anyoftext(predicate, "space-separated text")`

Schema Types: `string`

Index Required: `fulltext`


Apply full-text search with stemming and stop words to find strings matching all or any of the given text.

The following steps are applied during index generation and to process full-text search arguments:

1. Tokenization (according to Unicode word boundaries).
1. Conversion to lowercase.
1. Unicode-normalization (to [Normalization Form KC](http://unicode.org/reports/tr15/#Norm_Forms)).
1. Stemming using language-specific stemmer (if supported by language).
1. Stop words removal (if supported by language).
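
The five steps can be traced with a toy pipeline in Python. Everything language-specific here is a stand-in chosen for illustration: `toy_stem` strips two English suffixes and `STOP_WORDS` holds two words, whereas the real stemmers and stop word lists come from the indexing library. Tokenization is also simplified to runs of word characters.

```python
import re
import unicodedata

STOP_WORDS = {"the", "which"}  # toy list, for illustration only


def toy_stem(token):
    # Hypothetical stand-in for a language-specific stemmer.
    if token.endswith("ing") and len(token) > 5:
        return token[:-3]
    if token.endswith("s") and len(token) > 3:
        return token[:-1]
    return token


def analyze(text):
    tokens = re.findall(r"\w+", text)                            # 1. tokenize
    tokens = [t.lower() for t in tokens]                         # 2. lowercase
    tokens = [unicodedata.normalize("NFKC", t) for t in tokens]  # 3. normalize
    tokens = [toy_stem(t) for t in tokens]                       # 4. stem
    return [t for t in tokens if t not in STOP_WORDS]            # 5. stop words


def alloftext(value, query):
    # The same analysis is applied to the stored value and to the query text.
    return set(analyze(query)) <= set(analyze(value))


print(analyze("the dog which barks"))
print(alloftext("Dogs Barking", "the dog which barks"))
```

Applying the same analysis on both sides is what lets `barks` in a query match `barking` in a stored value.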

Dgraph uses [bleve](https://github.com/blevesearch/bleve) for its full-text search indexing. See also the bleve language specific [stop word lists](https://github.com/blevesearch/bleve/tree/master/analysis/lang).

The following table lists all supported languages, the corresponding language codes, and stemming and stop word filtering support.

|  Language  | Language Code | Stemming | Stop words |
| :--------: | :-----------: | :------: | :--------: |
|   Arabic   |      ar      | &#10003; |  &#10003;  |
|  Armenian  |      hy      |          |  &#10003;  |
|   Basque   |      eu      |          |  &#10003;  |
| Bulgarian  |      bg      |          |  &#10003;  |
|  Catalan   |      ca      |          |  &#10003;  |
|  Chinese   |      zh      | &#10003; |  &#10003;  |
|   Czech    |      cs      |          |  &#10003;  |
|   Danish   |      da      | &#10003; |  &#10003;  |
|   Dutch    |      nl      | &#10003; |  &#10003;  |
|  English   |      en      | &#10003; |  &#10003;  |
|  Finnish   |      fi      | &#10003; |  &#10003;  |
|   French   |      fr      | &#10003; |  &#10003;  |
|   Gaelic   |      ga      |          |  &#10003;  |
|  Galician  |      gl      |          |  &#10003;  |
|   German   |      de      | &#10003; |  &#10003;  |
|   Greek    |      el      |          |  &#10003;  |
|   Hindi    |      hi      | &#10003; |  &#10003;  |
| Hungarian  |      hu      | &#10003; |  &#10003;  |
| Indonesian |      id      |          |  &#10003;  |
|  Italian   |      it      | &#10003; |  &#10003;  |
|  Japanese  |      ja      | &#10003; |  &#10003;  |
|   Korean   |      ko      | &#10003; |  &#10003;  |
| Norwegian  |      no      | &#10003; |  &#10003;  |
|  Persian   |      fa      |          |  &#10003;  |
| Portuguese |      pt      | &#10003; |  &#10003;  |
|  Romanian  |      ro      | &#10003; |  &#10003;  |
|  Russian   |      ru      | &#10003; |  &#10003;  |
|  Spanish   |      es      | &#10003; |  &#10003;  |
|  Swedish   |      sv      | &#10003; |  &#10003;  |
|  Turkish   |      tr      | &#10003; |  &#10003;  |


Query Example: All names that have `dog`, `dogs`, `bark`, `barks`, `barking`, etc.  Stop word removal eliminates `the` and `which`.

{{< runnable >}}
{
  movie(func:alloftext(name@en, "the dog which barks")) {
    name@en
  }
}
{{< /runnable >}}


### Inequality

#### equal to

Syntax Examples:

* `eq(predicate, value)`
* `eq(val(varName), value)`
* `eq(predicate, val(varName))`
* `eq(count(predicate), value)`
* `eq(predicate, [val1, val2, ..., valN])`
* `eq(predicate, [$var1, "value", ..., $varN])`

Schema Types: `int`, `float`, `bool`, `string`, `dateTime`

Index Required: An index is required for the `eq(predicate, ...)` forms (see table below).  For `count(predicate)` at the query root, the `@count` index is required. For variables the values have been calculated as part of the query, so no index is required.

| Type       | Index Options |
|:-----------|:--------------|
| `int`      | `int`         |
| `float`    | `float`       |
| `bool`     | `bool`        |
| `string`   | `exact`, `hash` |
| `dateTime` | `dateTime`    |

Test for equality of a predicate or variable to a value or find in a list of values.

The boolean constants are `true` and `false`, so with `eq` this becomes, for example, `eq(boolPred, true)`.

Query Example: Movies with exactly thirteen genres.

{{< runnable >}}
{
  me(func: eq(count(genre), 13)) {
    name@en
    genre {
      name@en
    }
  }
}
{{< /runnable >}}


Query Example: Directors called Steven who have directed 1, 2 or 3 movies.

{{< runnable >}}
{
  steve as var(func: allofterms(name@en, "Steven")) {
    films as count(director.film)
  }

  stevens(func: uid(steve)) @filter(eq(val(films), [1,2,3])) {
    name@en
    numFilms : val(films)
  }
}
{{< /runnable >}}


#### less than, less than or equal to, greater than and greater than or equal to

Syntax Examples: for inequality `IE`

* `IE(predicate, value)`
* `IE(val(varName), value)`
* `IE(predicate, val(varName))`
* `IE(count(predicate), value)`

With `IE` replaced by

* `le` less than or equal to
* `lt` less than
* `ge` greater than or equal to
* `gt` greater than

Schema Types: `int`, `float`, `string`, `dateTime`

Index required: An index is required for the `IE(predicate, ...)` forms (see table below).  For `count(predicate)` at the query root, the `@count` index is required. For variables the values have been calculated as part of the query, so no index is required.

| Type       | Index Options |
|:-----------|:--------------|
| `int`      | `int`         |
| `float`    | `float`       |
| `string`   | `exact`       |
| `dateTime` | `dateTime`    |


Query Example: Ridley Scott movies released before 1980.

{{< runnable >}}
{
  me(func: eq(name@en, "Ridley Scott")) {
    name@en
    director.film @filter(lt(initial_release_date, "1980-01-01"))  {
      initial_release_date
      name@en
    }
  }
}
{{< /runnable >}}


Query Example: Directors with `Steven` in `name` who have directed more than `100` actors.

{{< runnable >}}
{
  ID as var(func: allofterms(name@en, "Steven")) {
    director.film {
      num_actors as count(starring)
    }
    total as sum(val(num_actors))
  }

  dirs(func: uid(ID)) @filter(gt(val(total), 100)) {
    name@en
    total_actors : val(total)
  }
}
{{< /runnable >}}



Query Example: A movie in each genre that has over 30000 movies.  Because there is no order specified on genres, the order will be by UID.  The [count index]({{< relref "#count-index">}}) records the number of edges out of nodes and makes such queries more efficient.

{{< runnable >}}
{
  genre(func: gt(count(~genre), 30000)){
    name@en
    ~genre (first:1) {
      name@en
    }
  }
}
{{< /runnable >}}

Query Example: Steven Spielberg movies whose `initial_release_date` is greater than or equal to
that of the movie Minority Report.

{{< runnable >}}
{
  var(func: eq(name@en,"Minority Report")) {
    d as initial_release_date
  }

  me(func: eq(name@en, "Steven Spielberg")) {
    name@en
    director.film @filter(ge(initial_release_date, val(d))) {
      initial_release_date
      name@en
    }
  }
}
{{< /runnable >}}


### uid

Syntax Examples:

* `q(func: uid(<uid>)) `
* `predicate @filter(uid(<uid1>, ..., <uidn>))`
* `predicate @filter(uid(a))` for variable `a`
* `q(func: uid(a,b))` for variables `a` and `b`


Filters nodes at the current query level to only nodes in the given set of UIDs.

For query variable `a`, `uid(a)` represents the set of UIDs stored in `a`.  For value variable `b`, `uid(b)` represents the UIDs from the UID to value map.  With two or more variables, `uid(a,b,...)` represents the union of all the variables.

`uid(<uid>)`, like an identity function, will return the requested UID even if the node does not have any edges.

Query Example: If the UID of a node is known, values for the node can be read directly.  The films of Priyanka Chopra by known UID.

{{< runnable >}}
{
  films(func: uid(0x1daf5)) {
    name@hi
    actor.film {
      performance.film {
        name@hi
      }
    }
  }
}
{{< /runnable >}}



Query Example: The films of Taraji Henson by genre.
{{< runnable >}}
{
  var(func: allofterms(name@en, "Taraji Henson")) {
    actor.film {
      F as performance.film {
        G as genre
      }
    }
  }

  Taraji_films_by_genre(func: uid(G)) {
    genre_name : name@en
    films : ~genre @filter(uid(F)) {
      film_name : name@en
    }
  }
}
{{< /runnable >}}



Query Example: Taraji Henson films ordered by number of genres, with genres listed in order of how many films Taraji has made in each genre.
{{< runnable >}}
{
  var(func: allofterms(name@en, "Taraji Henson")) {
    actor.film {
      F as performance.film {
        G as count(genre)
        genre {
          C as count(~genre @filter(uid(F)))
        }
      }
    }
  }

  Taraji_films_by_genre_count(func: uid(G), orderdesc: val(G)) {
    film_name : name@en
    genres : genre (orderdesc: val(C)) {
      genre_name : name@en
    }
  }
}
{{< /runnable >}}


### uid_in


Syntax Examples:

* `q(func: ...) @filter(uid_in(predicate, <uid>))`
* `predicate1 @filter(uid_in(predicate2, <uid>))`

Schema Types: UID

Index Required: none

While the `uid` function filters nodes at the current level based on UID, function `uid_in` allows looking ahead along an edge to check that it leads to a particular UID.  This can often save an extra query block and avoids returning the edge.

`uid_in` cannot be used at root; it accepts one UID constant as its argument (not a variable).


Query Example: The collaborations of Marc Caro and Jean-Pierre Jeunet (UID 0x99706).  If the UID of Jean-Pierre Jeunet is known, querying this way removes the need to have a block extracting his UID into a variable and the extra edge traversal and filter for `~director.film`.
{{< runnable >}}
{
  caro(func: eq(name@en, "Marc Caro")) {
    name@en
    director.film @filter(uid_in(~director.film, 0x99706)) {
      name@en
    }
  }
}
{{< /runnable >}}


### has

Syntax Examples: `has(predicate)`

Schema Types: all

Determines if a node has a particular predicate.

Query Example: First five directors and all their movies that have a release date recorded.  Directors have directed at least one film --- equivalent semantics to `gt(count(director.film), 0)`.
{{< runnable >}}
{
  me(func: has(director.film), first: 5) {
    name@en
    director.film @filter(has(initial_release_date))  {
      initial_release_date
      name@en
    }
  }
}
{{< /runnable >}}

   741  ### Geolocation
   742  
   743  {{% notice "note" %}} As of now we only support indexing Point, Polygon and MultiPolygon [geometry types](https://github.com/twpayne/go-geom#geometry-types). However, Dgraph can store other types of gelocation data. {{% /notice %}}
   744  
   745  Note that for geo queries, any polygon with holes is replace with the outer loop, ignoring holes.  Also, as for version 0.7.7 polygon containment checks are approximate.
   746  
   747  #### Mutations
   748  
   749  To make use of the geo functions you would need an index on your predicate.
   750  ```
   751  loc: geo @index(geo) .
   752  ```
   753  
   754  Here is how you would add a `Point`.
   755  
   756  ```
   757  {
   758    set {
   759      <_:0xeb1dde9c> <loc> "{'type':'Point','coordinates':[-122.4220186,37.772318]}"^^<geo:geojson> .
   760      <_:0xeb1dde9c> <name> "Hamon Tower" .
   761    }
   762  }
   763  ```
   764  
   765  Here is how you would associate a `Polygon` with a node. Adding a `MultiPolygon` is also similar.
   766  
   767  ```
   768  {
   769    set {
   770      <_:0xf76c276b> <loc> "{'type':'Polygon','coordinates':[[[-122.409869,37.7785442],[-122.4097444,37.7786443],[-122.4097544,37.7786521],[-122.4096334,37.7787494],[-122.4096233,37.7787416],[-122.4094004,37.7789207],[-122.4095818,37.7790617],[-122.4097883,37.7792189],[-122.4102599,37.7788413],[-122.409869,37.7785442]],[[-122.4097357,37.7787848],[-122.4098499,37.778693],[-122.4099025,37.7787339],[-122.4097882,37.7788257],[-122.4097357,37.7787848]]]}"^^<geo:geojson> .
   771      <_:0xf76c276b> <name> "Best Western Americana Hotel" .
   772    }
   773  }
   774  ```
   775  
   776  The above examples have been picked from our [SF Tourism](https://github.com/dgraph-io/benchmarks/blob/master/data/sf.tourism.gz?raw=true) dataset.
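
As a rough sketch of the `MultiPolygon` case mentioned above: unlike the two examples taken from the dataset, the following mutation is illustrative only. The blank node `_:example`, the name and the coordinates are made up. The only assumption is the standard GeoJSON layout, in which a `MultiPolygon` is an array of polygons and each polygon is an array of closed rings.

```
{
  set {
    # Hypothetical data: two separate square outlines stored as one MultiPolygon
    <_:example> <loc> "{'type':'MultiPolygon','coordinates':[[[[-122.41,37.77],[-122.41,37.78],[-122.40,37.78],[-122.40,37.77],[-122.41,37.77]]],[[[-122.43,37.77],[-122.43,37.78],[-122.42,37.78],[-122.42,37.77],[-122.43,37.77]]]]}"^^<geo:geojson> .
    <_:example> <name> "Example two-part site" .
  }
}
```

Compared with the `Polygon` example, the coordinate array has one extra level of nesting, and each ring still repeats its first point as its last.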
   777  
   778  #### Query
   779  
   780  ##### near
   781  
   782  Syntax Example: `near(predicate, [long, lat], distance)`
   783  
   784  Schema Types: `geo`
   785  
   786  Index Required: `geo`
   787  
   788  Matches all entities where the location given by `predicate` is within `distance` meters of geojson coordinate `[long, lat]`.
   789  
   790  Query Example: Tourist destinations within 1000 meters (1 kilometer) of a point in Golden Gate Park in San Francisco.
   791  
   792  {{< runnable >}}
   793  {
   794    tourist(func: near(loc, [-122.469829, 37.771935], 1000) ) {
   795      name
   796    }
   797  }
   798  {{< /runnable >}}
   799  
   800  
   801  ##### within
   802  
   803  Syntax Example: `within(predicate, [[[long1, lat1], ..., [longN, latN]]])`
   804  
   805  Schema Types: `geo`
   806  
   807  Index Required: `geo`
   808  
   809  Matches all entities where the location given by `predicate` lies within the polygon specified by the geojson coordinate array.
   810  
   811  Query Example: Tourist destinations within the specified area of Golden Gate Park, San Francisco.
   812  
   813  {{< runnable >}}
   814  {
   815    tourist(func: within(loc, [[[-122.47266769409178, 37.769018558337926 ], [ -122.47266769409178, 37.773699921075135 ], [ -122.4651575088501, 37.773699921075135 ], [ -122.4651575088501, 37.769018558337926 ], [ -122.47266769409178, 37.769018558337926]]] )) {
   816      name
   817    }
   818  }
   819  {{< /runnable >}}
   820  
   821  
   822  ##### contains
   823  
   824  Syntax Examples: `contains(predicate, [long, lat])` or `contains(predicate, [[long1, lat1], ..., [longN, latN]])`
   825  
   826  Schema Types: `geo`
   827  
   828  Index Required: `geo`
   829  
   830  Matches all entities where the polygon describing the location given by `predicate` contains the geojson coordinate `[long, lat]` or the given geojson polygon.
   831  
   832  Query Example: All entities that contain a point in the flamingo enclosure of San Francisco Zoo.
   833  {{< runnable >}}
   834  {
   835    tourist(func: contains(loc, [ -122.50326097011566, 37.73353615592843 ] )) {
   836      name
   837    }
   838  }
   839  {{< /runnable >}}
   840  
   841  
   842  ##### intersects
   843  
   844  Syntax Example: `intersects(predicate, [[[long1, lat1], ..., [longN, latN]]])`
   845  
   846  Schema Types: `geo`
   847  
   848  Index Required: `geo`
   849  
   850  Matches all entities where the polygon describing the location given by `predicate` intersects the given geojson polygon.
   851  
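   852  Query Example: All entities whose location intersects the given polygon, a small area of San Francisco Zoo that contains the point used in the `contains` example above.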
   853  {{< runnable >}}
   854  {
   855    tourist(func: intersects(loc, [[[-122.503325343132, 37.73345766902749 ], [ -122.503325343132, 37.733903134117966 ], [ -122.50271648168564, 37.733903134117966 ], [ -122.50271648168564, 37.73345766902749 ], [ -122.503325343132, 37.73345766902749]]] )) {
   856      name
   857    }
   858  }
   859  {{< /runnable >}}
   860  
   861  
   862  
   863  ## Connecting Filters
   864  
   865  Within `@filter` multiple functions can be used with boolean connectives.
   866  
   867  ### AND, OR and NOT
   868  
   869  Connectives `AND`, `OR` and `NOT` join filters and can be combined into arbitrarily complex filters, such as `(NOT A OR B) AND (C AND NOT (D OR E))`.  Note that `NOT` binds more tightly than `AND`, which binds more tightly than `OR`.
   870  
   871  Query Example: All Steven Spielberg movies whose names contain either both "indiana" and "jones" OR both "jurassic" and "park".
   872  
   873  {{< runnable >}}
   874  {
   875    me(func: eq(name@en, "Steven Spielberg")) @filter(has(director.film)) {
   876      name@en
   877      director.film @filter(allofterms(name@en, "jones indiana") OR allofterms(name@en, "jurassic park"))  {
   878        uid
   879        name@en
   880      }
   881    }
   882  }
   883  {{< /runnable >}}
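
The example above only uses `OR`. As a hedged sketch of `AND` combined with `NOT`, the following query reuses the predicates from the query above. The search terms are illustrative and the query has not been run against the example dataset. It would keep Steven Spielberg movies whose names contain both "indiana" and "jones" but neither "temple" nor "crusade".

```
{
  me(func: eq(name@en, "Steven Spielberg")) @filter(has(director.film)) {
    name@en
    # NOT applies only to the anyofterms function that follows it
    director.film @filter(allofterms(name@en, "jones indiana") AND NOT anyofterms(name@en, "temple crusade"))  {
      uid
      name@en
    }
  }
}
```

Because `NOT` binds more tightly than `AND`, no parentheses are needed around the negated function. Parentheses would be needed to negate a whole `AND` or `OR` expression.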
   884  
   885  
   886  ## Alias
   887  
   888  Syntax Examples:
   889  
   890  * `aliasName : predicate`
   891  * `aliasName : predicate { ... }`
   892  * `aliasName : varName as ...`
   893  * `aliasName : count(predicate)`
   894  * `aliasName : max(val(varName))`
   895  
   896  An alias provides an alternate name in results.  Predicates, variables and aggregates can be aliased by prefixing with the alias name and `:`.  An alias does not have to differ from the original predicate name, but, within a block, an alias must be distinct from the predicate names and other aliases returned in the same block.  Aliases can be used to return the same predicate multiple times within a block.
   897  
   898  
   899  
   900  Query Example: Directors with `name` matching term `Steven`, their UID, English name, average number of actors per movie, total number of films, and the name of each film in English and French.
   901  {{< runnable >}}
   902  {
   903    ID as var(func: allofterms(name@en, "Steven")) @filter(has(director.film)) {
   904      director.film {
   905        num_actors as count(starring)
   906      }
   907      average as avg(val(num_actors))
   908    }
   909  
   910    films(func: uid(ID)) {
   911      director_id : uid
   912      english_name : name@en
   913      average_actors : val(average)
   914      num_films : count(director.film)
   915  
   916      films : director.film {
   917        name : name@en
   918        english_name : name@en
   919        french_name : name@fr
   920      }
   921    }
   922  }
   923  {{< /runnable >}}
   924  
   925  
   926  ## Pagination
   927  
   928  Pagination returns only a portion of the result set rather than the whole.  This can be useful for top-k style queries, to reduce the size of the result set for client-side processing, or to allow paged access to results.
   929  
   930  Pagination is often used with [sorting]({{< relref "#sorting">}}).
   931  
   932  {{% notice "note" %}}Without a sort order specified, the results are sorted by `uid`, which is assigned randomly. So the ordering, while deterministic, might not be what you expected.{{% /notice  %}}
   933  
   934  ### First
   935  
   936  Syntax Examples:
   937  
   938  * `q(func: ..., first: N)`
   939  * `predicate (first: N) { ... }`
   940  * `predicate @filter(...) (first: N) { ... }`
   941  
   942  For positive `N`, `first: N` retrieves the first `N` results, by sorted or UID order.
   943  
   944  For negative `N`, `first: N` retrieves the last `N` results, by sorted or UID order.  Currently, negative is only supported when no order is applied.  To achieve the effect of a negative with a sort, reverse the order of the sort and use a positive `N`.
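
As a sketch of that workaround (reusing predicates from the examples in this document; not run against the example dataset): to get Steven Spielberg's two most recently released films, sort in descending order and take the first two, rather than combining an ascending sort with `first: -2`.

```
{
  me(func: allofterms(name@en, "Steven Spielberg")) {
    # descending sort + positive N stands in for ascending sort + negative N
    director.film (orderdesc: initial_release_date) (first: 2) {
      name@en
      initial_release_date
    }
  }
}
```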
   945  
   946  
   947  Query Example: Last two films, by UID order, directed by Steven Spielberg and the first three genres of those movies, sorted alphabetically by English name.
   948  
   949  {{< runnable >}}
   950  {
   951    me(func: allofterms(name@en, "Steven Spielberg")) {
   952      director.film (first: -2) {
   953        name@en
   954        initial_release_date
   955        genre (orderasc: name@en) (first: 3) {
   956            name@en
   957        }
   958      }
   959    }
   960  }
   961  {{< /runnable >}}
   962  
   963  
   964  
   965  Query Example: The three directors named Steven who have directed the most actors of all directors named Steven.
   966  
   967  {{< runnable >}}
   968  {
   969    ID as var(func: allofterms(name@en, "Steven")) @filter(has(director.film)) {
   970      director.film {
   971        stars as count(starring)
   972      }
   973      totalActors as sum(val(stars))
   974    }
   975  
   976    mostStars(func: uid(ID), orderdesc: val(totalActors), first: 3) {
   977      name@en
   978      stars : val(totalActors)
   979  
   980      director.film {
   981        name@en
   982      }
   983    }
   984  }
   985  {{< /runnable >}}
   986  
   987  ### Offset
   988  
   989  Syntax Examples:
   990  
   991  * `q(func: ..., offset: N)`
   992  * `predicate (offset: N) { ... }`
   993  * `predicate (first: M, offset: N) { ... }`
   994  * `predicate @filter(...) (offset: N) { ... }`
   995  
   996  With `offset: N`, the first `N` results are not returned.  Used in combination with `first`, `first: M, offset: N` skips over `N` results and returns the following `M`.
   997  
   998  Query Example: Order Hark Tsui's films by English title, skip over the first 4 and return the following 6.
   999  
  1000  {{< runnable >}}
  1001  {
  1002    me(func: allofterms(name@en, "Hark Tsui")) {
  1003      name@zh
  1004      name@en
  1005      director.film (orderasc: name@en) (first:6, offset:4)  {
  1006        genre {
  1007          name@en
  1008        }
  1009        name@zh
  1010        name@en
  1011        initial_release_date
  1012      }
  1013    }
  1014  }
  1015  {{< /runnable >}}
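
The root-level form `q(func: ..., offset: N)` works the same way. A minimal sketch, assuming a page size of 5 and reusing `has(director.film)` from an earlier example (the block name `page2` is arbitrary): the second page of directors in UID order skips the first 5 results and returns the next 5.

```
{
  # page k (counting from 1) of size 5 would use offset: (k - 1) * 5
  page2(func: has(director.film), first: 5, offset: 5) {
    uid
    name@en
  }
}
```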
  1016  
  1017  ### After
  1018  
  1019  Syntax Examples:
  1020  
  1021  * `q(func: ..., after: UID)`
  1022  * `predicate (first: N, after: UID) { ... }`
  1023  * `predicate @filter(...) (first: N, after: UID) { ... }`
  1024  
  1025  Another way to get results after skipping over some results is to use the default UID ordering and skip directly past a node specified by UID.  For example, a first query could be of the form `predicate (after: 0x0, first: N)`, or just `predicate (first: N)`, with subsequent queries of the form `predicate(after: <uid of last entity in last result>, first: N)`.
  1026  
  1027  
  1028  Query Example: The first five of Baz Luhrmann's films, sorted by UID order.
  1029  
  1030  {{< runnable >}}
  1031  {
  1032    me(func: allofterms(name@en, "Baz Luhrmann")) {
  1033      name@en
  1034      director.film (first:5) {
  1035        uid
  1036        name@en
  1037      }
  1038    }
  1039  }
  1040  {{< /runnable >}}
  1041  
  1042  The fifth movie is the Australian movie classic Strictly Ballroom.  It has UID `0x99e44`.  The results after Strictly Ballroom can now be obtained with `after`.
  1043  
  1044  {{< runnable >}}
  1045  {
  1046    me(func: allofterms(name@en, "Baz Luhrmann")) {
  1047      name@en
  1048      director.film (first:5, after: 0x99e44) {
  1049        uid
  1050        name@en
  1051      }
  1052    }
  1053  }
  1054  {{< /runnable >}}
  1055  
  1056  
  1057  ## Count
  1058  
  1059  Syntax Examples:
  1060  
  1061  * `count(predicate)`
  1062  * `count(uid)`
  1063  
  1064  The form `count(predicate)` counts how many `predicate` edges lead out of a node.
  1065  
  1066  The form `count(uid)` counts the number of UIDs matched in the enclosing block.
  1067  
  1068  Query Example: The number of films acted in by each actor with `Orlando` in their name.
  1069  
  1070  {{< runnable >}}
  1071  {
  1072    me(func: allofterms(name@en, "Orlando")) @filter(has(actor.film)) {
  1073      name@en
  1074      count(actor.film)
  1075    }
  1076  }
  1077  {{< /runnable >}}
  1078  
  1079  Count can be used at root and [aliased]({{< relref "#alias">}}).
  1080  
  1081  Query Example: Count of directors who have directed more than five films.  When used at the query root, the [count index]({{< relref "#count-index">}}) is required.
  1082  
  1083  {{< runnable >}}
  1084  {
  1085    directors(func: gt(count(director.film), 5)) {
  1086      totalDirectors : count(uid)
  1087    }
  1088  }
  1089  {{< /runnable >}}
  1090  
  1091  
  1092  Count can be assigned to a [value variable]({{< relref "#value-variables">}}).
  1093  
  1094  Query Example: The actors of Ang Lee's "Eat Drink Man Woman" ordered by the number of movies acted in.
  1095  
  1096  {{< runnable >}}
  1097  {
  1098    var(func: allofterms(name@en, "eat drink man woman")) {
  1099      starring {
  1100        actors as performance.actor {
  1101          totalRoles as count(actor.film)
  1102        }
  1103      }
  1104    }
  1105  
  1106    edmw(func: uid(actors), orderdesc: val(totalRoles)) {
  1107      name@en
  1108      name@zh
  1109      totalRoles : val(totalRoles)
  1110    }
  1111  }
  1112  {{< /runnable >}}
  1113  
  1114  
  1115  ## Sorting
  1116  
  1117  Syntax Examples:
  1118  
  1119  * `q(func: ..., orderasc: predicate)`
  1120  * `q(func: ..., orderdesc: val(varName))`
  1121  * `predicate (orderdesc: predicate) { ... }`
  1122  * `predicate @filter(...) (orderasc: predicate) { ... }`
  1123  * `q(func: ..., orderasc: predicate1, orderdesc: predicate2)`
  1124  
  1125  Sortable Types: `int`, `float`, `string`, `dateTime`, `default`
  1126  
  1127  Results can be sorted in ascending order (`orderasc`) or descending order (`orderdesc`) by a predicate or variable.
  1128  
  1129  For sorting on predicates with [sortable indices]({{< relref "#sortable-indices">}}), Dgraph sorts both on the values and with the index, in parallel, and returns whichever result is computed first.
  1130  
  1131  Sorted queries retrieve up to 1000 results by default. This can be changed with [first]({{< relref "#first">}}).
  1132  
  1133  
  1134  Query Example: French director Jean-Pierre Jeunet's movies sorted by release date.
  1135  
  1136  {{< runnable >}}
  1137  {
  1138    me(func: allofterms(name@en, "Jean-Pierre Jeunet")) {
  1139      name@fr
  1140      director.film(orderasc: initial_release_date) {
  1141        name@fr
  1142        name@en
  1143        initial_release_date
  1144      }
  1145    }
  1146  }
  1147  {{< /runnable >}}
  1148  
  1149  Sorting can be performed at root and on value variables.
  1150  
  1151  Query Example: All genres sorted alphabetically and the five movies in each genre with the most genres.
  1152  
  1153  {{< runnable >}}
  1154  {
  1155    genres as var(func: has(~genre)) {
  1156      ~genre {
  1157        numGenres as count(genre)
  1158      }
  1159    }
  1160  
  1161    genres(func: uid(genres), orderasc: name@en) {
  1162      name@en
  1163      ~genre (orderdesc: val(numGenres), first: 5) {
  1164        name@en
  1165        genres : val(numGenres)
  1166      }
  1167    }
  1168  }
  1169  {{< /runnable >}}
  1170  
  1171  Sorting can also be performed by multiple predicates as shown below. If the values are equal for the
  1172  first predicate, then they are sorted by the second predicate and so on.
  1173  
  1174  Query Example: Find all nodes which have type Person, sort them by their first_name and among those
  1175  that have the same first_name sort them by last_name in descending order.
  1176  
  1177  ```
  1178  {
  1179    me(func: type("Person"), orderasc: first_name, orderdesc: last_name) {
  1180      first_name
  1181      last_name
  1182    }
  1183  }
  1184  ```
  1185  
  1186  ## Multiple Query Blocks
  1187  
  1188  Inside a single query, multiple query blocks are allowed.  The result is all blocks with corresponding block names.
  1189  
  1190  Multiple query blocks are executed in parallel.
  1191  
  1192  The blocks need not be related in any way.
  1193  
  1194  Query Example: All of Angelina Jolie's films, with genres, and Peter Jackson's films since 2008.
  1195  
  1196  {{< runnable >}}
  1197  {
  1198    AngelinaInfo(func:allofterms(name@en, "angelina jolie")) {
  1199      name@en
  1200      actor.film {
  1201        performance.film {
  1202          genre {
  1203            name@en
  1204          }
  1205        }
  1206      }
  1207    }
  1208  
  1209    DirectorInfo(func: eq(name@en, "Peter Jackson")) {
  1210      name@en
  1211      director.film @filter(ge(initial_release_date, "2008"))  {
  1212        Release_date: initial_release_date
  1213        Name: name@en
  1214      }
  1215    }
  1216  }
  1217  {{< /runnable >}}
  1218  
  1219  
  1220  If queries contain some overlap in answers, the result sets are still independent.
  1221  
  1222  Query Example: The movies Mackenzie Crook has acted in and the movies Jack Davenport has acted in.  The result sets overlap because both have acted in the Pirates of the Caribbean movies, but the results are independent and both contain the full answer sets.
  1223  
  1224  {{< runnable >}}
  1225  {
  1226    Mackenzie(func:allofterms(name@en, "Mackenzie Crook")) {
  1227      name@en
  1228      actor.film {
  1229        performance.film {
  1230          uid
  1231          name@en
  1232        }
  1233        performance.character {
  1234          name@en
  1235        }
  1236      }
  1237    }
  1238  
  1239    Jack(func:allofterms(name@en, "Jack Davenport")) {
  1240      name@en
  1241      actor.film {
  1242        performance.film {
  1243          uid
  1244          name@en
  1245        }
  1246        performance.character {
  1247          name@en
  1248        }
  1249      }
  1250    }
  1251  }
  1252  {{< /runnable >}}
  1253  
  1254  
  1255  ### Var Blocks
  1256  
  1257  Var blocks start with the keyword `var` and are not returned in the query results.
  1258  
  1259  Query Example: Angelina Jolie's movies ordered by genre.
  1260  
  1261  {{< runnable >}}
  1262  {
  1263    var(func:allofterms(name@en, "angelina jolie")) {
  1264      name@en
  1265      actor.film {
  1266        A as performance.film {
  1267          B as genre
  1268        }
  1269      }
  1270    }
  1271  
  1272    films(func: uid(B), orderasc: name@en) {
  1273      name@en
  1274      ~genre @filter(uid(A)) {
  1275        name@en
  1276      }
  1277    }
  1278  }
  1279  {{< /runnable >}}
  1280  
  1281  
  1282  ## Query Variables
  1283  
  1284  Syntax Examples:
  1285  
  1286  * `varName as q(func: ...) { ... }`
  1287  * `varName as var(func: ...) { ... }`
  1288  * `varName as predicate { ... }`
  1289  * `varName as predicate @filter(...) { ... }`
  1290  
  1291  Types : `uid`
  1292  
  1293  Nodes (UIDs) matched at one place in a query can be stored in a variable and used elsewhere.  Query variables can be used in other query blocks or in a child node of the defining block.
  1294  
  1295  Query variables do not affect the semantics of the query at the point of definition.  Query variables are evaluated to all nodes matched by the defining block.
  1296  
  1297  In general, query blocks are executed in parallel, but variables impose an evaluation order on some blocks.  Cycles induced by variable dependence are not permitted.
  1298  
  1299  If a variable is defined, it must be used elsewhere in the query.
  1300  
  1301  A query variable is used by extracting the UIDs in it with `uid(var-name)`.
  1302  
  1303  The syntax `func: uid(A,B)` or `@filter(uid(A,B))` means the union of UIDs for variables `A` and `B`.
  1304  
  1305  Query Example: The movies of Angelina Jolie and Brad Pitt where both have acted in movies in the same genre.  Note that `B` and `D` match all genres for all movies, not genres per movie.
  1306  {{< runnable >}}
  1307  {
  1308    var(func:allofterms(name@en, "angelina jolie")) {
  1309      actor.film {
  1310        A as performance.film {  # All films acted in by Angelina Jolie
  1311          B as genre  # Genres of all the films acted in by Angelina Jolie
  1312        }
  1313      }
  1314    }
  1315  
  1316    var(func:allofterms(name@en, "brad pitt")) {
  1317      actor.film {
  1318        C as performance.film {  # All films acted in by Brad Pitt
  1319          D as genre  # Genres of all the films acted in by Brad Pitt
  1320        }
  1321      }
  1322    }
  1323  
  1324    films(func: uid(D)) @filter(uid(B)) {   # Genres from both Angelina and Brad
  1325      name@en
  1326      ~genre @filter(uid(A, C)) {  # Movies in either A or C.
  1327        name@en
  1328      }
  1329    }
  1330  }
  1331  {{< /runnable >}}
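
Since `uid(A,B)` is a union, an intersection has to be expressed differently. The query above does this for genres by using one variable at the root and the other in a filter. The same pattern can be sketched for films. The block name `both` is arbitrary and the query has not been run against the example dataset; it would return only films in which both actors appear.

```
{
  var(func: allofterms(name@en, "angelina jolie")) {
    actor.film {
      A as performance.film   # films acted in by Angelina Jolie
    }
  }

  var(func: allofterms(name@en, "brad pitt")) {
    actor.film {
      C as performance.film   # films acted in by Brad Pitt
    }
  }

  # start from A, keep only the UIDs that are also in C
  both(func: uid(A)) @filter(uid(C)) {
    name@en
  }
}
```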
  1332  
  1333  
  1334  ## Value Variables
  1335  
  1336  Syntax Examples:
  1337  
  1338  * `varName as scalarPredicate`
  1339  * `varName as count(predicate)`
  1340  * `varName as avg(...)`
  1341  * `varName as math(...)`
  1342  
  1343  Types : `int`, `float`, `string`, `dateTime`, `default`, `geo`, `bool`
  1344  
  1345  Value variables store scalar values.  Value variables are a map from the UIDs of the enclosing block to the corresponding values.
  1346  
  1347  It therefore only makes sense to use the values from a value variable in a context that matches the same UIDs.  If used in a block matching different UIDs, the value variable is undefined.
  1348  
  1349  It is an error to define a value variable but not use it elsewhere in the query.
  1350  
  1351  Value variables are used by extracting the values with `val(var-name)`, or by extracting the UIDs with `uid(var-name)`.
  1352  
  1353  [Facet]({{< relref "#facets-edge-attributes">}}) values can be stored in value variables.
  1354  
  1355  Query Example: The number of movie roles played by the actors of the 80's classic "The Princess Bride".  Query variable `pbActors` matches the UIDs of all actors from the movie.  Value variable `roles` is thus a map from actor UID to number of roles.  Value variable `roles` can be used in the `totalRoles` query block because that query block also matches the `pbActors` UIDs, so the actor to number of roles map is available.
  1356  
  1357  {{< runnable >}}
  1358  {
  1359    var(func:allofterms(name@en, "The Princess Bride")) {
  1360      starring {
  1361        pbActors as performance.actor {
  1362          roles as count(actor.film)
  1363        }
  1364      }
  1365    }
  1366    totalRoles(func: uid(pbActors), orderasc: val(roles)) {
  1367      name@en
  1368      numRoles : val(roles)
  1369    }
  1370  }
  1371  {{< /runnable >}}
  1372  
  1373  
  1374  Value variables can be used in place of UID variables by extracting the UID list from the map.
  1375  
  1376  Query Example: The same query as the previous example, but using value variable `roles` for matching UIDs in the `totalRoles` query block.
  1377  
  1378  {{< runnable >}}
  1379  {
  1380    var(func:allofterms(name@en, "The Princess Bride")) {
  1381      starring {
  1382        performance.actor {
  1383          roles as count(actor.film)
  1384        }
  1385      }
  1386    }
  1387    totalRoles(func: uid(roles), orderasc: val(roles)) {
  1388      name@en
  1389      numRoles : val(roles)
  1390    }
  1391  }
  1392  {{< /runnable >}}
  1393  
  1394  
  1395  ### Variable Propagation
  1396  
  1397  Like query variables, value variables can be used in other query blocks and in blocks nested within the defining block.  When used in a block nested within the block that defines the variable, the value is computed as a sum of the variable for parent nodes along all paths to the point of use.  This is called variable propagation.
  1398  
  1399  For example:
  1400  ```
  1401  {
  1402    q(func: uid(0x01)) {
  1403      myscore as math(1)          # A
  1404      friends {                   # B
  1405        friends {                 # C
  1406          ...myscore...
  1407        }
  1408      }
  1409    }
  1410  }
  1411  ```
  1412  At line A, a value variable `myscore` is defined, mapping the node with UID `0x01` to the value 1.  At B, the value for each friend is still 1: there is only one path to each friend.  Traversing the `friends` edge twice reaches the friends of friends.  The variable `myscore` is propagated such that each friend of a friend receives the sum of its parents' values: if a friend of a friend is reachable from only one friend, the value is still 1; if it is reachable from two friends, the value is 2; and so on.  That is, the value of `myscore` for each friend of a friend inside the block marked C is the number of paths to it.
  1413  
  1414  **The value that a node receives for a propagated variable is the sum of the values of all its parent nodes.**
  1415  
  1416  This propagation is useful, for example, in normalizing a sum across users, finding the number of paths between nodes and accumulating a sum through a graph.
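
To make the schematic example above concrete, here is a hedged sketch in which the `...myscore...` placeholder is filled in with `math(myscore)`, following the pattern of `math(paths)` in the query below. The predicates `friends` and `name` and the alias `num_paths` are assumptions for illustration; they are not part of the movie dataset.

```
{
  q(func: uid(0x01)) {
    myscore as math(1)              # 1 at the root node
    friends {
      friends {
        name
        num_paths : math(myscore)   # sum over parents, i.e. number of paths from 0x01
      }
    }
  }
}
```

For instance, if `0x01` has friends X and Y, and both X and Y have a friend Z, then Z receives 1 + 1 = 2, whereas a node that is a friend of X only receives 1.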
  1417  
  1418  
  1419  
  1420  Query Example: For each Harry Potter movie, the number of roles played by actor Warwick Davis.
  1421  {{< runnable >}}
  1422  {
  1423    num_roles(func: eq(name@en, "Warwick Davis")) @cascade @normalize {
  1424  
  1425      paths as math(1)  # records number of paths to each character
  1426  
  1427      actor : name@en
  1428  
  1429      actor.film {
  1430        performance.film @filter(allofterms(name@en, "Harry Potter")) {
  1431          film_name : name@en
  1432          characters : math(paths)  # how many paths (i.e. characters) reach this film
  1433        }
  1434      }
  1435    }
  1436  }
  1437  {{< /runnable >}}
  1438  
  1439  
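  1440  Query Example: Each actor who has been in a Peter Jackson movie and the fraction of Peter Jackson movies they have appeared in.  Both `paths` and `num_films` are propagated down to the actors, so for an actor reached along `k` paths, `paths` is `k` and `num_films` has been summed `k` times; `num_films/paths` therefore recovers the director's film count, and dividing `paths` by it gives the fraction.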
  1441  {{< runnable >}}
  1442  {
  1443    movie_fraction(func:eq(name@en, "Peter Jackson")) @normalize {
  1444  
  1445      paths as math(1)
  1446      total_films : num_films as count(director.film)
  1447      director : name@en
  1448  
  1449      director.film {
  1450        starring {
  1451          performance.actor {
  1452            fraction : math(paths / (num_films/paths))
  1453            actor : name@en
  1454          }
  1455        }
  1456      }
  1457    }
  1458  }
  1459  {{< /runnable >}}
  1460  
  1461  More examples can be found in two Dgraph blog posts about using variable propagation for recommendation engines ([post 1](https://open.dgraph.io/post/recommendation/), [post 2](https://open.dgraph.io/post/recommendation2/)).
  1462  
  1463  ## Aggregation
  1464  
  1465  Syntax Example: `AG(val(varName))`
  1466  
  1467  For `AG` replaced with
  1468  
  1469  * `min` : select the minimum value in the value variable `varName`
  1470  * `max` : select the maximum value
  1471  * `sum` : sum all values in value variable `varName`
  1472  * `avg` : calculate the average of values in `varName`
  1473  
  1474  Schema Types:
  1475  
  1476  | Aggregation       | Schema Types |
  1477  |:-----------|:--------------|
  1478  | `min` / `max`     | `int`, `float`, `string`, `dateTime`, `default`         |
  1479  | `sum` / `avg`    | `int`, `float`       |
  1480  
  1481  Aggregation can only be applied to [value variables]({{< relref "#value-variables">}}).  An index is not required (the values have already been found and stored in the value variable mapping).
  1482  
  1483  An aggregation is applied at the query block enclosing the variable definition.  As opposed to query variables and value variables, which are global, aggregation is computed locally.  For example:
  1484  ```
  1485  A as predicateA {
  1486    ...
  1487    B as predicateB {
  1488      x as ...some value...
  1489    }
  1490    min(val(x))
  1491  }
  1492  ```
  1493  Here, `A` and `B` are the lists of all UIDs that match these blocks.  Value variable `x` is a mapping from UIDs in `B` to values.  The aggregation `min(val(x))`, however, is computed for each UID in `A`.  That is, for each UID in `A`, take the slice of `x` that corresponds to that UID's outgoing `predicateB` edges and compute the aggregation over those values.
  1494  
  1495  Aggregations can themselves be assigned to value variables, making a UID to aggregation map.
  1496  
  1497  
  1498  ### Min
  1499  
  1500  #### Usage at Root
  1501  
  1502  Query Example: Get the min initial release date for any Harry Potter movie.
  1503  
  1504  The release date is assigned to a variable, then it is aggregated and fetched in an empty block.
  1505  {{< runnable >}}
  1506  {
  1507    var(func: allofterms(name@en, "Harry Potter")) {
  1508      d as initial_release_date
  1509    }
  1510    me() {
  1511      min(val(d))
  1512    }
  1513  }
  1514  {{< /runnable >}}
  1515  
  1516  #### Usage at other levels
  1517  
  1518  Query Example:  Directors called Steven and the date of release of their first movie, in ascending order of first movie.
  1519  
  1520  {{< runnable >}}
  1521  {
  1522    stevens as var(func: allofterms(name@en, "steven")) {
  1523      director.film {
  1524        ird as initial_release_date
  1525        # ird is a value variable mapping a film UID to its release date
  1526      }
  1527      minIRD as min(val(ird))
  1528      # minIRD is a value variable mapping a director UID to their first release date
  1529    }
  1530  
  1531    byIRD(func: uid(stevens), orderasc: val(minIRD)) {
  1532      name@en
  1533      firstRelease: val(minIRD)
  1534    }
  1535  }
  1536  {{< /runnable >}}
  1537  
  1538  ### Max
  1539  
  1540  #### Usage at Root
  1541  
  1542  Query Example: Get the max initial release date for any Harry Potter movie.
  1543  
  1544  The release date is assigned to a variable, then it is aggregated and fetched in an empty block.
  1545  {{< runnable >}}
  1546  {
  1547    var(func: allofterms(name@en, "Harry Potter")) {
  1548      d as initial_release_date
  1549    }
  1550    me() {
  1551      max(val(d))
  1552    }
  1553  }
  1554  {{< /runnable >}}
  1555  
  1556  #### Usage at other levels
  1557  
  1558  Query Example: Quentin Tarantino's movies and date of release of the most recent movie.
  1559  
  1560  {{< runnable >}}
  1561  {
  1562    director(func: allofterms(name@en, "Quentin Tarantino")) {
  1563      director.film {
  1564        name@en
  1565        x as initial_release_date
  1566      }
  1567      max(val(x))
  1568    }
  1569  }
  1570  {{< /runnable >}}
  1571  
  1572  ### Sum and Avg
  1573  
  1574  #### Usage at Root
  1575  
  1576  Query Example: Get the sum and average of the number of movies directed by people who have
  1577  Steven or Tom in their name.
  1578  
  1579  {{< runnable >}}
  1580  {
  1581    var(func: anyofterms(name@en, "Steven Tom")) {
  1582      a as count(director.film)
  1583    }
  1584  
  1585    me() {
  1586      avg(val(a))
  1587      sum(val(a))
  1588    }
  1589  }
  1590  {{< /runnable >}}
  1591  
  1592  #### Usage at other levels
  1593  
  1594  Query Example: Steven Spielberg's movies, with the number of recorded genres per movie, and the total number of genres and average genres per movie.
  1595  
  1596  {{< runnable >}}
  1597  {
  1598    director(func: eq(name@en, "Steven Spielberg")) {
  1599      name@en
  1600      director.film {
  1601        name@en
  1602        numGenres : g as count(genre)
  1603      }
  1604      totalGenres : sum(val(g))
  1605      genresPerMovie : avg(val(g))
  1606    }
  1607  }
  1608  {{< /runnable >}}
  1609  
  1610  
  1611  ### Aggregating Aggregates
  1612  
  1613  Aggregations can be assigned to value variables, and so these variables can in turn be aggregated.
  1614  
  1615  Query Example: For each actor in a Peter Jackson film, find the number of roles played in any movie.  Sum these to find the total number of roles ever played by all actors in the movie.  Then sum the lot to find the total number of roles ever played by actors who have appeared in Peter Jackson movies.  Note that this demonstrates how to aggregate aggregates; the answer in this case isn't quite precise though, because actors that have appeared in multiple Peter Jackson movies are counted more than once.
  1616  
  1617  {{< runnable >}}
  1618  {
  1619    PJ as var(func:allofterms(name@en, "Peter Jackson")) {
  1620      director.film {
  1621        starring {  # starring an actor
  1622          performance.actor {
  1623            movies as count(actor.film)
  1624            # number of roles for this actor
  1625          }
  1626          perf_total as sum(val(movies))
  1627        }
  1628        movie_total as sum(val(perf_total))
  1629        # total roles for all actors in this movie
  1630      }
  1631      gt as sum(val(movie_total))
  1632    }
  1633  
  1634    PJmovies(func: uid(PJ)) {
  1635      name@en
  1636      director.film (orderdesc: val(movie_total), first: 5) {
  1637        name@en
  1638        totalRoles : val(movie_total)
  1639      }
  1640      grandTotal : val(gt)
  1641    }
  1642  }
  1643  {{< /runnable >}}
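
The nesting of sums in the query above, and the double counting it causes, can be imitated outside Dgraph. The following Python sketch is only a model of what `sum(val(...))` does at each level. The film names, actor names and role counts are hypothetical and are not taken from the movie dataset.

```python
# Hypothetical data: total roles each actor has played in any movie
# (plays the part of `movies as count(actor.film)`).
roles_per_actor = {"actor_a": 12, "actor_b": 7, "actor_c": 30}

# Hypothetical cast lists for two films by the same director.
cast_per_film = {
    "film_1": ["actor_a", "actor_b"],
    "film_2": ["actor_a", "actor_c"],
}

# Per-film total, like `movie_total as sum(val(perf_total))`.
movie_total = {
    film: sum(roles_per_actor[actor] for actor in cast)
    for film, cast in cast_per_film.items()
}

# Director-level total, like `gt as sum(val(movie_total))`.
grand_total = sum(movie_total.values())

# Counting each actor once gives a smaller number, because actor_a
# appears in both films and is counted twice in grand_total.
distinct_actors = {actor for cast in cast_per_film.values() for actor in cast}
distinct_total = sum(roles_per_actor[actor] for actor in distinct_actors)

print(movie_total)     # per-film totals
print(grand_total)     # aggregate of aggregates: 61
print(distinct_total)  # each actor counted once: 49
```

The gap between the last two numbers is the imprecision mentioned in the example description.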
  1644  
  1645  
  1646  ## Math on value variables
  1647  
  1648  Value variables can be combined using mathematical functions.  For example, this could be used to associate a score which is then used to order or perform other operations, such as might be used in building news feeds, simple recommendation systems, and so on.
  1649  
  1650  Math statements must be enclosed within `math( <exp> )` and must be stored to a value variable.
  1651  
  1652  The supported operators are as follows:
  1653  
  1654  | Operators                       | Types accepted                                 | What it does                                                   |
  1655  | :------------:                  | :--------------:                               | :------------------------:                                     |
  1656  | `+` `-` `*` `/` `%`             | `int`, `float`                                     | performs the corresponding operation                           |
  1657  | `min` `max`                     | All types except `geo`, `bool`  (binary functions) | selects the min/max value among the two                        |
  1658  | `<` `>` `<=` `>=` `==` `!=`     | All types except `geo`, `bool`                     | Returns true or false based on the values                      |
  1659  | `floor` `ceil` `ln` `exp` `sqrt` | `int`, `float` (unary function)                    | performs the corresponding operation                           |
  1660  | `since`                         | `dateTime`                                 | Returns the number of seconds, as a float, since the time specified |
  1661  | `pow(a, b)`                     | `int`, `float`                                     | Returns `a to the power b`                                     |
  1662  | `logbase(a,b)`                  | `int`, `float`                                     | Returns `log(a)` to the base `b`                               |
  1663  | `cond(a, b, c)`                 | first operand must be a boolean                | selects `b` if `a` is true else `c`                            |
  1664  
  1665  
  1666  Query Example:  Form a score for each of Steven Spielberg's movies as the sum of number of actors, number of genres and number of countries.  List the top five such movies in order of decreasing score.
  1667  
  1668  {{< runnable >}}
  1669  {
  1670  	var(func:allofterms(name@en, "steven spielberg")) {
  1671  		films as director.film {
  1672  			p as count(starring)
  1673  			q as count(genre)
  1674  			r as count(country)
  1675  			score as math(p + q + r)
  1676  		}
  1677  	}
  1678  
  1679  	TopMovies(func: uid(films), orderdesc: val(score), first: 5){
  1680  		name@en
  1681  		val(score)
  1682  	}
  1683  }
  1684  {{< /runnable >}}
  1685  
  1686  Value variables and aggregations of them can be used in filters.
  1687  
  1688  Query Example: Calculate a score for each Steven Spielberg movie with a condition on release date to penalize movies that are more than 10 years old, filtering on the resulting score.
  1689  
  1690  {{< runnable >}}
  1691  {
  1692    var(func:allofterms(name@en, "steven spielberg")) {
  1693      films as director.film {
  1694        p as count(starring)
  1695        q as count(genre)
  1696        date as initial_release_date
  1697        years as math(since(date)/(365*24*60*60))
  1698        score as math(cond(years > 10, 0, ln(p)+q-ln(years)))
  1699      }
  1700    }
  1701  
  1702    TopMovies(func: uid(films), orderdesc: val(score)) @filter(gt(val(score), 2)){
  1703      name@en
  1704      val(score)
  1705      val(date)
  1706    }
  1707  }
  1708  {{< /runnable >}}
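
To make the arithmetic of the `score` expression above concrete, here is a minimal Python sketch of the same formula. It is not how Dgraph evaluates `math()`, only the same calculation written out. The function name and its parameters are hypothetical, and the sketch assumes `p >= 1` and a positive age, since the logarithm is undefined otherwise.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # same constant as in the query


def score(p: int, q: int, seconds_since_release: float) -> float:
    """Mirror of math(cond(years > 10, 0, ln(p) + q - ln(years)))."""
    years = seconds_since_release / SECONDS_PER_YEAR
    if years > 10:
        return 0.0  # movies older than 10 years are penalized to zero
    return math.log(p) + q - math.log(years)


# A 5-year-old movie with 10 actors and 3 genres: ln(10) + 3 - ln(5) = 3 + ln(2)
print(round(score(10, 3, 5 * SECONDS_PER_YEAR), 3))  # 3.693
# An 11-year-old movie scores 0 and would be dropped by gt(val(score), 2)
print(score(10, 3, 11 * SECONDS_PER_YEAR))  # 0.0
```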
  1709  
  1710  
  1711  Values calculated with math operations are stored to value variables and so can be aggregated.
  1712  
  1713  Query Example: Compute a score for each Steven Spielberg movie and then aggregate the score.
  1714  
  1715  {{< runnable >}}
  1716  {
  1717  	steven as var(func:eq(name@en, "Steven Spielberg")) @filter(has(director.film)) {
  1718  		director.film {
  1719  			p as count(starring)
  1720  			q as count(genre)
  1721  			r as count(country)
  1722  			score as math(p + q + r)
  1723  		}
  1724  		directorScore as sum(val(score))
  1725  	}
  1726  
  1727  	score(func: uid(steven)){
  1728  		name@en
  1729  		val(directorScore)
  1730  	}
  1731  }
  1732  {{< /runnable >}}
  1733  
  1734  
  1735  ## GroupBy
  1736  
  1737  Syntax Examples:
  1738  
  1739  * `q(func: ...) @groupby(predicate) { min(...) }`
  1740  * `predicate @groupby(pred) { count(uid) }`
  1741  
  1742  
  1743  A `groupby` query aggregates query results given a set of properties on which to group elements.  For example, a query containing the block `friend @groupby(age) { count(uid) }` finds all nodes reachable along the `friend` edge, partitions these into groups based on age, then counts how many nodes are in each group.  The returned result is the grouped edges and the aggregations.
  1744  
  1745  Inside a `groupby` block, only aggregations are allowed and `count` may only be applied to `uid`.
  1746  
  1747  If the `groupby` is applied to a `uid` predicate, the resulting aggregations can be saved in a variable (mapping the grouped UIDs to aggregate values) and used elsewhere in the query to extract information other than the grouped or aggregated edges.
  1748  
  1749  Query Example: For Steven Spielberg movies, count the number of movies in each genre and for each of those genres return the genre name and the count.  The name can't be extracted in the `groupby` because it is not an aggregate, but `uid(a)` can be used to extract the UIDs from the UID to value map and thus organize the `byGenre` query by genre UID.
  1750  
  1751  
  1752  {{< runnable >}}
  1753  {
  1754    var(func:allofterms(name@en, "steven spielberg")) {
  1755      director.film @groupby(genre) {
  1756        a as count(uid)
  1757        # a is a genre UID to count value variable
  1758      }
  1759    }
  1760  
  1761    byGenre(func: uid(a), orderdesc: val(a)) {
  1762      name@en
  1763      total_movies : val(a)
  1764    }
  1765  }
  1766  {{< /runnable >}}
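
The UID-to-value map built by the `groupby` above can be pictured with a short Python sketch. This is only a model of the grouping and ordering steps. The film names and genre UIDs are hypothetical.

```python
from collections import Counter

# Hypothetical films mapped to their genre UIDs.
film_genres = {
    "film_1": ["0x10", "0x20"],
    "film_2": ["0x10"],
    "film_3": ["0x10", "0x30"],
}

# Like `director.film @groupby(genre) { a as count(uid) }`:
# one group per genre UID, counting the films in each group.
a = Counter(genre for genres in film_genres.values() for genre in genres)

# Like `byGenre(func: uid(a), orderdesc: val(a))`:
# iterate over the grouped UIDs, largest count first.
by_genre = sorted(a.items(), key=lambda item: item[1], reverse=True)
print(by_genre)
```

The keys of `a` play the role of `uid(a)` and the counts play the role of `val(a)`.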
  1767  
  1768  Query Example: Actors from Tim Burton movies and how many roles they have played in Tim Burton movies.
  1769  {{< runnable >}}
  1770  {
  1771    var(func:allofterms(name@en, "Tim Burton")) {
  1772      director.film {
  1773        starring @groupby(performance.actor) {
  1774          a as count(uid)
  1775          # a is an actor UID to count value variable
  1776        }
  1777      }
  1778    }
  1779  
  1780    byActor(func: uid(a), orderdesc: val(a)) {
  1781      name@en
  1782      val(a)
  1783    }
  1784  }
  1785  {{< /runnable >}}
  1786  
  1787  
  1788  
  1789  ## Expand Predicates
  1790  
  1791  The `expand()` function can be used to expand the predicates out of a node. To
  1792   use `expand()`, the [type system]({{< relref "#type-system" >}}) is required.
  1793  Refer to the section on the type system to check how to set the types
  1794  of nodes. The rest of this section assumes familiarity with that section.
  1795  
  1796  There are four ways to use the `expand` function.
  1797  
  1798  * Predicates can be stored in a variable and passed to `expand()` to expand all
  1799    the predicates in the variable.
  1800  * If `_all_` is passed as an argument to `expand()`, all the predicates for each
  1801    node at that level are retrieved. More levels can be specified in a nested
  1802    fashion under `expand()`.
  1803  * If `_forward_` is passed as an argument to `expand()`, all predicates for each
  1804    node at that level (minus any reverse predicates) are retrieved.
  1805  * If `_reverse_` is passed as an argument to `expand()`, only the reverse
  1806    predicates at each node in that level are retrieved.
  1807  
  1808  The last three keywords require that the nodes have types. Dgraph will look 
  1809  for all the types that have been assigned to a node,
  1810  query the types to check which attributes they have, and use those to compute
  1811  the list of predicates to expand.
  1812  
  1813  For example, consider a node that has types `Animal` and `Pet`, which have 
  1814  the following definitions:
  1815  
  1816  ```
  1817  type Animal {
  1818      name: string
  1819      species: uid
  1820      dob: datetime
  1821  }
  1822  
  1823  type Pet {
  1824      owner: uid
  1825      veterinarian: uid
  1826  }
  1827  ```
  1828  
  1829  When `expand(_all_)` is called on this node, Dgraph will first check which types
  1830  the node has (`Animal` and `Pet`). Then it will get the definitions of 
  1831  `Animal` and `Pet` and build a list of predicates.
  1832  Finally, it will query the schema to check if any of those predicates have a
  1833  reverse edge. If, for example, there's a reverse edge on the `owner` predicate,
  1834  the final list of predicates to expand will be:
  1835  
  1836  ```
  1837  name
  1838  species
  1839  dob
  1840  owner
  1841  ~owner
  1842  veterinarian
  1843  ```
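
The steps described above can be sketched in Python. This is a model of the described behavior, not Dgraph's implementation. The dictionary layout and the function name `predicates_to_expand` are assumptions made for the illustration.

```python
# Type definitions from the example above, as predicate lists.
types = {
    "Animal": ["name", "species", "dob"],
    "Pet": ["owner", "veterinarian"],
}

# Predicates assumed to have a reverse defined in the schema.
reverse_predicates = {"owner"}


def predicates_to_expand(node_types):
    """Collect the predicates of every type, adding ~pred for reverses."""
    result = []
    for type_name in node_types:
        for pred in types[type_name]:
            if pred not in result:
                result.append(pred)
                if pred in reverse_predicates:
                    result.append("~" + pred)
    return result


print(predicates_to_expand(["Animal", "Pet"]))
# ['name', 'species', 'dob', 'owner', '~owner', 'veterinarian']
```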
  1844  
  1845  For `string` predicates, `expand` only returns values not tagged with a language
  1846  (see [language preference]({{< relref "#language-support" >}})).  So it's often 
  1847  required to add `name@fr` or `name@.` as well as expand to a query.
  1848  
  1849  ## Cascade Directive
  1850  
  1851  With the `@cascade` directive, nodes that don't have all predicates specified in the query are removed. This can be useful in cases where some filter was applied or if nodes might not have all listed predicates.
  1852  
  1853  
  1854  Query Example: Harry Potter movies, with each actor and characters played.  With `@cascade`, any character not played by an actor called Warwick is removed, as is any Harry Potter movie without any actors called Warwick.  Without `@cascade`, every character is returned, but only those played by actors called Warwick also have the actor name.
  1855  {{< runnable >}}
  1856  {
  1857    HP(func: allofterms(name@en, "Harry Potter")) @cascade {
  1858      name@en
  1859      starring{
  1860          performance.character {
  1861            name@en
  1862          }
  1863          performance.actor @filter(allofterms(name@en, "Warwick")){
  1864              name@en
  1865           }
  1866      }
  1867    }
  1868  }
  1869  {{< /runnable >}}
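
The pruning rule behind `@cascade` can be modeled on a nested result in a few lines of Python. This is a simplified sketch under stated assumptions: the `shape` dictionary is a made-up stand-in for the query block, the shortened keys (`name`, `character`, `actor`) and the sample movie are hypothetical, and the real directive is applied by Dgraph during query processing.

```python
def cascade(node, shape):
    """Return a pruned copy of node, or None if a requested predicate is missing.

    `shape` maps each requested predicate to None (a scalar) or to a nested
    shape (an edge whose children must themselves satisfy that shape).
    """
    pruned = {}
    for pred, child_shape in shape.items():
        if pred not in node:
            return None
        if child_shape is None:
            pruned[pred] = node[pred]
            continue
        children = [cascade(child, child_shape) for child in node[pred]]
        children = [child for child in children if child is not None]
        if not children:
            return None  # no child survived, so this predicate is missing
        pruned[pred] = children
    return pruned


shape = {
    "name": None,
    "starring": {"character": {"name": None}, "actor": {"name": None}},
}
movie = {
    "name": "Hypothetical Film",
    "starring": [
        {"character": [{"name": "Role A"}], "actor": [{"name": "Warwick"}]},
        {"character": [{"name": "Role B"}]},  # actor removed by the filter
    ],
}

print(cascade(movie, shape))  # only the "Role A" performance remains
```

A movie whose performances all lack a matching actor would return `None` here, which corresponds to the movie being dropped from the result.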
  1870  
  1871  ## Normalize directive
  1872  
  1873  With the `@normalize` directive, only aliased predicates are returned and the result is flattened to remove nesting.
  1874  
  1875  Query Example: Film name, country and first two actors (by UID order) of every Steven Spielberg movie.  `initial_release_date` is omitted from the result because it has no alias, and the result is flattened by `@normalize`.
  1876  {{< runnable >}}
  1877  {
  1878    director(func:allofterms(name@en, "steven spielberg")) @normalize {
  1879      director: name@en
  1880      director.film {
  1881        film: name@en
  1882        initial_release_date
  1883        starring(first: 2) {
  1884          performance.actor {
  1885            actor: name@en
  1886          }
  1887          performance.character {
  1888            character: name@en
  1889          }
  1890        }
  1891        country {
  1892          country: name@en
  1893        }
  1894      }
  1895    }
  1896  }
  1897  {{< /runnable >}}
  1898  
  1899  
  1900  ## Ignorereflex directive
  1901  
  1902  The `@ignorereflex` directive forces the removal of child nodes that are reachable from themselves as a parent, through any path in the query result.
  1903  
  1904  Query Example: All the co-actors of Rutger Hauer.  Without `@ignorereflex`, the result would also include Rutger Hauer for every movie.
  1905  
  1906  {{< runnable >}}
  1907  {
  1908    coactors(func: eq(name@en, "Rutger Hauer")) @ignorereflex {
  1909      actor.film {
  1910        performance.film {
  1911          starring {
  1912            performance.actor {
  1913              name@en
  1914            }
  1915          }
  1916        }
  1917      }
  1918    }
  1919  }
  1920  {{< /runnable >}}
  1921  
  1922  ## Debug
  1923  
  1924  For the purposes of debugging, you can attach a query parameter `debug=true` to a query. Attaching this parameter lets you retrieve the `uid` attribute for all the entities along with the `server_latency` and `start_ts` information under the `extensions` key of the response.
  1925  
  1926  - `parsing_ns`: Latency in nanoseconds to parse the query.
  1927  - `processing_ns`: Latency in nanoseconds to process the query.
  1928  - `encoding_ns`: Latency in nanoseconds to encode the JSON response.
  1929  - `start_ts`: The logical start timestamp of the transaction.
  1930  
  1931  Query with debug as a query parameter:
  1932  ```sh
  1933  curl -H "Content-Type: application/graphql+-" http://localhost:8080/query?debug=true -XPOST -d $'{
  1934    tbl(func: allofterms(name@en, "The Big Lebowski")) {
  1935      name@en
  1936    }
  1937  }' | python -m json.tool | less
  1938  ```
  1939  
  1940  Returns `uid` and `server_latency`:
  1941  ```
  1942  {
  1943    "data": {
  1944      "tbl": [
  1945        {
  1946          "uid": "0x41434",
  1947          "name@en": "The Big Lebowski"
  1948        },
  1949        {
  1950          "uid": "0x145834",
  1951          "name@en": "The Big Lebowski 2"
  1952        },
  1953        {
  1954          "uid": "0x2c8a40",
  1955          "name@en": "Jeffrey \"The Big\" Lebowski"
  1956        },
  1957        {
  1958          "uid": "0x3454c4",
  1959          "name@en": "The Big Lebowski"
  1960        }
  1961      ]
  1962    },
  1963    "extensions": {
  1964      "server_latency": {
  1965        "parsing_ns": 18559,
  1966        "processing_ns": 802990982,
  1967        "encoding_ns": 1177565
  1968      },
  1969      "txn": {
  1970        "start_ts": 40010
  1971      }
  1972    }
  1973  }
  1974  ```
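
As a small illustration of reading these fields on the client side, the Python sketch below parses a trimmed copy of the sample response and adds up the three latency figures. It assumes that `extensions` sits at the top level of the response, next to `data`. The variable names are hypothetical.

```python
import json

# Trimmed copy of the sample response; only the parts used below are kept.
response_text = """
{
  "data": {"tbl": [{"uid": "0x41434", "name@en": "The Big Lebowski"}]},
  "extensions": {
    "server_latency": {
      "parsing_ns": 18559,
      "processing_ns": 802990982,
      "encoding_ns": 1177565
    },
    "txn": {"start_ts": 40010}
  }
}
"""

response = json.loads(response_text)
latency = response["extensions"]["server_latency"]

# All three values are in nanoseconds; convert the sum to milliseconds.
total_ms = sum(latency.values()) / 1_000_000
print(f"server latency: {total_ms:.1f} ms")  # server latency: 804.2 ms
```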
  1975  
  1976  
  1977  ## Schema
  1978  
  1979  For each predicate, the schema specifies the target's type.  If a predicate `p` has type `T`, then for all subject-predicate-object triples `s p o` the object `o` is of schema type `T`.
  1980  
  1981  * On mutations, scalar types are checked and an error is thrown if the value cannot be converted to the schema type.
  1982  
  1983  * On query, value results are returned according to the schema type of the predicate.
  1984  
  1985  If a schema type isn't specified before a mutation adds triples for a predicate, then the type is inferred from the first mutation.  This type is either:
  1986  
  1987  * type `uid`, if the first mutation for the predicate has nodes for the subject and object, or
  1988  
  1989  * derived from the [RDF type]({{< relref "#rdf-types" >}}), if the object is a literal and an RDF type is present in the first mutation, or
  1990  
  1991  * `default` type, otherwise.
  1992  
  1993  
  1994  ### Schema Types
  1995  
  1996  Dgraph supports scalar types and the UID type.
  1997  
  1998  #### Scalar Types
  1999  
  2000  For all triples with a predicate of scalar type, the object is a literal.
  2001  
  2002  | Dgraph Type | Go type |
  2003  | ------------|:--------|
  2004  |  `default`  | string  |
  2005  |  `int`      | int64   |
  2006  |  `float`    | float64 |
  2007  |  `string`   | string  |
  2008  |  `bool`     | bool    |
  2009  |  `dateTime` | time.Time (RFC3339 format [Optional timezone] eg: 2006-01-02T15:04:05.999999999+10:00 or 2006-01-02T15:04:05.999999999)    |
  2010  |  `geo`      | [go-geom](https://github.com/twpayne/go-geom)    |
  2011  |  `password` | string (encrypted) |
  2012  
  2013  
  2014  {{% notice "note" %}}Dgraph supports date and time formats for the `dateTime` scalar type only if they
  2015  are RFC 3339 compatible, which is different from ISO 8601 (as defined in the RDF spec). You should
  2016  convert your values to RFC 3339 format before sending them to Dgraph.{{% /notice  %}}
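
One possible way to do that conversion on the client side is sketched below in Python, using only the standard library. The helper name `to_rfc3339`, the input format and the UTC offset are assumptions chosen for the example. Adapt them to the format your data actually uses.

```python
from datetime import datetime, timedelta, timezone


def to_rfc3339(text: str, fmt: str, utc_offset_hours: int = 0) -> str:
    """Parse `text` with the strptime pattern `fmt` and emit RFC 3339."""
    parsed = datetime.strptime(text, fmt)
    tz = timezone(timedelta(hours=utc_offset_hours))
    return parsed.replace(tzinfo=tz).isoformat()


# A day/month/year timestamp recorded at UTC+10.
print(to_rfc3339("02/01/2006 15:04:05", "%d/%m/%Y %H:%M:%S", 10))
# 2006-01-02T15:04:05+10:00
```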
  2017  
  2018  #### UID Type
  2019  
  2020  The `uid` type denotes a node-node edge; internally each node is represented as a `uint64` id.
  2021  
  2022  | Dgraph Type | Go type |
  2023  | ------------|:--------|
  2024  |  `uid`      | uint64  |
  2025  
  2026  
  2027  ### Adding or Modifying Schema
  2028  
  2029  Schema mutations add or modify schema.
  2030  
  2031  Multiple scalar values can also be stored for a subject-predicate pair `S P` by declaring the predicate
  2032  as a list type in the schema. In the example below, `occupations` can store a list of strings for each `S P`.
  2033  
  2034  An index is specified with `@index`, with arguments to specify the tokenizer. When specifying an
  2035  index for a predicate it is mandatory to specify the type of the index. For example:
  2036  
  2037  ```
  2038  name: string @index(exact, fulltext) @count .
  2039  multiname: string @lang .
  2040  age: int @index(int) .
  2041  friend: [uid] @count .
  2042  dob: dateTime .
  2043  location: geo @index(geo) .
  2044  occupations: [string] @index(term) .
  2045  ```
  2046  
  2047  If no data has been stored for the predicates, a schema mutation sets up an empty schema ready to receive triples.
  2048  
  2049  If data is already stored before the mutation, existing values are not checked to conform to the new schema.  On query, Dgraph tries to convert existing values to the new schema types, ignoring any that fail conversion.
  2050  
  2051  If data exists and new indices are specified in a schema mutation, any index not in the updated list is dropped and a new index is created for every new tokenizer specified.
  2052  
  2053  Reverse edges are also computed if specified by a schema mutation.
  2054  
  2055  
  2056  ### Predicate name rules
  2057  
  2058  A predicate name may be any combination of alphanumeric characters.
  2059  Dgraph also supports [Internationalized Resource Identifiers](https://en.wikipedia.org/wiki/Internationalized_Resource_Identifier) (IRIs).
  2060  You can read more in [Predicates i18n](#predicates-i18n).
  2061  
  2062  #### Allowed special characters
  2063  
  2064  The special characters below are allowed, but a predicate name cannot consist of a single special character (this includes the special characters from IRIs).
  2065  They must be prefixed or suffixed with alphanumeric characters.
  2066  
  2067  ```
  2068  ][&*()_-+=!#$%
  2069  ```
  2070  
  2071  *Note: You may use `@` as a suffix, but the suffix character is ignored.*
  2072  
  2073  #### Forbidden special characters
  2074  
  2075  The special characters below are not accepted.
  2076  
  2077  ```
  2078  ^}|{`\~
  2079  ```
  2080  
  2081  
  2082  ### Predicates i18n
  2083  
  2084  If your predicate is a URI or has language-specific characters, then enclose
  2085  it with angle brackets `<>` when executing the schema mutation.
  2086  
  2087  {{% notice "note" %}}Dgraph supports [Internationalized Resource Identifiers](https://en.wikipedia.org/wiki/Internationalized_Resource_Identifier) (IRIs) for predicate names and values.{{% /notice  %}}
  2088  
  2089  Schema syntax:
  2090  ```
  2091  <职业>: string @index(exact) .
  2092  <年龄>: int @index(int) .
  2093  <地点>: geo @index(geo) .
  2094  <公司>: string .
  2095  ```
  2096  
  2097  This syntax allows for internationalized predicate names, but full-text indexing still defaults to English.
  2098  To use the right tokenizer for your language, you need to use the `@lang` directive and enter values using your
  2099  language tag.
  2100  
  2101  Schema:
  2102  ```
  2103  <公司>: string @index(fulltext) @lang .
  2104  ```
  2105  Mutation:
  2106  ```
  2107  {
  2108    set {
  2109      _:a <公司> "Dgraph Labs Inc"@en .
  2110      _:b <公司> "夏新科技有限责任公司"@zh .
  2111    }
  2112  }
  2113  ```
  2114  Query:
  2115  ```
  2116  {
  2117    q(func: alloftext(<公司>@zh, "夏新科技有限责任公司")) {
  2118      uid
  2119      <公司>@.
  2120    }
  2121  }
  2122  ```
  2123  
  2124  
  2125  ### Upsert directive
  2126  
  2127  To use [upsert operations]({{< relref "howto/index.md#upserts">}}) on a
  2128  predicate, specify the `@upsert` directive in the schema. When committing
  2129  transactions involving predicates with the `@upsert` directive, Dgraph checks
  2130  index keys for conflicts, helping to enforce uniqueness constraints when running
  2131  concurrent upserts.
  2132  
  2133  This is how you specify the upsert directive for a predicate.
  2134  ```
  2135  email: string @index(exact) @upsert .
  2136  ```
  2137  
  2138  ### RDF Types
  2139  
  2140  Dgraph supports a number of [RDF types in mutations]({{< relref "mutations/index.md#language-and-rdf-types" >}}).
  2141  
  2142  As well as implying a schema type for a [first mutation]({{< relref "#schema" >}}), an RDF type can override a schema type for storage.
  2143  
  2144  If a predicate has a schema type and a mutation has an RDF type with a different underlying Dgraph type, the convertibility to schema type is checked, and an error is thrown if they are incompatible, but the value is stored in the RDF type's corresponding Dgraph type.  Query results are always returned in schema type.
  2145  
  2146  For example, suppose no schema is set for the `age` predicate.  Given the mutation
  2147  ```
  2148  {
  2149   set {
  2150    _:a <age> "15"^^<xs:int> .
  2151    _:b <age> "13" .
  2152    _:c <age> "14"^^<xs:string> .
  2153    _:d <age> "14.5"^^<xs:string> .
  2154    _:e <age> "14.5" .
  2155   }
  2156  }
  2157  ```
  2158  Dgraph:
  2159  
  2160  * sets the schema type to `int`, as implied by the first triple,
  2161  * converts `"13"` to `int` on storage,
  2162  * checks `"14"` can be converted to `int`, but stores as `string`,
  2163  * throws an error for the remaining two triples, because `"14.5"` can't be converted to `int`.
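
The four outcomes listed above can be reproduced with a small Python model. It is a sketch of the described rules only: the `store` function, the two lookup tables and the use of Python's `int` and `str` as stand-ins for Dgraph's converters are all assumptions made for illustration.

```python
from typing import Optional, Tuple

# Python stand-ins for the two Dgraph types involved in the example.
CONVERTERS = {"int": int, "string": str}
RDF_TO_DGRAPH = {"xs:int": "int", "xs:string": "string"}


def store(literal: str, rdf_type: Optional[str], schema_type: str) -> Tuple[str, object]:
    """Return (storage type, stored value); raise ValueError if not convertible."""
    converted = CONVERTERS[schema_type](literal)  # convertibility check
    # Without an RDF type the value is stored in the schema type.
    storage_type = RDF_TO_DGRAPH.get(rdf_type, schema_type)
    if storage_type == schema_type:
        return storage_type, converted
    return storage_type, CONVERTERS[storage_type](literal)


print(store("13", None, "int"))         # ('int', 13)
print(store("14", "xs:string", "int"))  # ('string', '14')
try:
    store("14.5", None, "int")
except ValueError as err:
    print("rejected:", err)
```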
  2164  
  2165  ### Extended Types
  2166  
  2167  The following types are also accepted.
  2168  
  2169  #### Password type
  2170  
  2171  A password for an entity is set by setting the schema for the attribute to type `password`.  Passwords cannot be queried directly, only checked for a match using the `checkpwd` function.
  2172  The passwords are encrypted using [bcrypt](https://en.wikipedia.org/wiki/Bcrypt).
  2173  
  2174  For example: to set a password, first set schema, then the password:
  2175  ```
  2176  pass: password .
  2177  ```
  2178  
  2179  ```
  2180  {
  2181    set {
  2182      <0x123> <name> "Password Example" .
  2183      <0x123> <pass> "ThePassword" .
  2184    }
  2185  }
  2186  ```
  2187  
  2188  to check a password:
  2189  ```
  2190  {
  2191    check(func: uid(0x123)) {
  2192      name
  2193      checkpwd(pass, "ThePassword")
  2194    }
  2195  }
  2196  ```
  2197  
  2198  output:
  2199  ```
  2200  {
  2201    "data": {
  2202      "check": [
  2203        {
  2204          "name": "Password Example",
  2205          "checkpwd(pass)": true
  2206        }
  2207      ]
  2208    }
  2209  }
  2210  ```
  2211  
  2212  You can also use an alias with the password type.
  2213  
  2214  ```
  2215  {
  2216    check(func: uid(0x123)) {
  2217      name
  2218      secret: checkpwd(pass, "ThePassword")
  2219    }
  2220  }
  2221  ```
  2222  
  2223  output:
  2224  ```
  2225  {
  2226    "data": {
  2227      "check": [
  2228        {
  2229          "name": "Password Example",
  2230          "secret": true
  2231        }
  2232      ]
  2233    }
  2234  }
  2235  ```
  2236  
  2237  ### Indexing
  2238  
  2239  {{% notice "note" %}}Filtering on a predicate by applying a [function]({{< relref "#functions" >}}) requires an index.{{% /notice %}}
  2240  
  2241  When filtering by applying a function, Dgraph uses the index to make the search through a potentially large dataset efficient.
  2242  
  2243  All scalar types can be indexed.
  2244  
  2245  Types `int`, `float`, `bool` and `geo` each have only a default index, with tokenizers named `int`, `float`, `bool` and `geo` respectively.
  2246  
  2247  Types `string` and `dateTime` have a number of indices.
  2248  
  2249  #### String Indices
  2250  The indices available for strings are as follows.
  2251  
  2252  | Dgraph function            | Required index / tokenizer             | Notes |
  2253  | :-----------------------   | :------------                          | :---  |
  2254  | `eq`                       | `hash`, `exact`, `term`, or `fulltext` | The most performant index for `eq` is `hash`. Only use `term` or `fulltext` if you also require term or full-text search. If you're already using `term`, there is no need to use `hash` or `exact` as well. |
  2255  | `le`, `ge`, `lt`, `gt`     | `exact`                                | Allows faster sorting.                                   |
  2256  | `allofterms`, `anyofterms` | `term`                                 | Allows searching by a term in a sentence.                |
  2257  | `alloftext`, `anyoftext`   | `fulltext`                             | Matching with language specific stemming and stopwords.  |
  2258  | `regexp`                   | `trigram`                              | Regular expression matching. Can also be used for equality checking. |
  2259  
  2260  {{% notice "warning" %}}
  2261  Incorrect index choice can impose performance penalties and an increased
  2262  transaction conflict rate. Use only the minimum number of and simplest indexes
  2263  that your application needs.
  2264  {{% /notice %}}
  2265  
  2266  
  2267  #### DateTime Indices
  2268  
  2269  The indices available for `dateTime` are as follows.
  2270  
  2271  | Index name / Tokenizer   | Part of date indexed                                      |
  2272  | :----------- | :------------------------------------------------------------------ |
  2273  | `year`      | index on year (default)                                        |
  2274  | `month`       | index on year and month                                         |
  2275  | `day`       | index on year, month and day                                      |
  2276  | `hour`       | index on year, month, day and hour                               |
  2277  
  2278  The choices of `dateTime` index allow selecting the precision of the index.  Applications, such as the movies examples in these docs, that require searching over dates but have relatively few nodes per year may prefer the `year` tokenizer; applications that are dependent on fine grained date searches, such as real-time sensor readings, may prefer the `hour` index.
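
The effect of index precision can be pictured by truncating timestamps to the part of the date each tokenizer covers. The Python sketch below is purely illustrative: the string keys it produces are not Dgraph's actual token encoding, and the function and variable names are hypothetical.

```python
from datetime import datetime

# strftime patterns that keep only the part of the date each index covers.
PRECISION = {
    "year": "%Y",
    "month": "%Y-%m",
    "day": "%Y-%m-%d",
    "hour": "%Y-%m-%dT%H",
}


def index_key(value: datetime, tokenizer: str) -> str:
    """Illustrative bucket key; Dgraph's real token encoding differs."""
    return value.strftime(PRECISION[tokenizer])


morning = datetime(2020, 5, 17, 9, 30)
evening = datetime(2020, 5, 17, 21, 0)

# Under `day` both readings fall in the same bucket...
print(index_key(morning, "day"), index_key(evening, "day"))
# ...while `hour` separates them.
print(index_key(morning, "hour"), index_key(evening, "hour"))
```

A coarser tokenizer puts more values in each bucket, and a finer one creates more buckets with fewer values each.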
  2279  
  2280  
  2281  All the `dateTime` indices are sortable.
  2282  
  2283  
  2284  #### Sortable Indices
  2285  
  2286  Not all the indices establish a total order among the values that they index. Sortable indices allow inequality functions and sorting.
  2287  
  2288  * Indexes `int` and `float` are sortable.
  2289  * `string` index `exact` is sortable.
  2290  * All `dateTime` indices are sortable.
  2291  
  2292  For example, given an edge `name` of `string` type, to sort by `name` or perform inequality filtering on names, the `exact` index must have been specified.  In that case, a schema query would return at least the following tokenizers.
  2293  
  2294  ```
  2295  {
  2296    "predicate": "name",
  2297    "type": "string",
  2298    "index": true,
  2299    "tokenizer": [
  2300      "exact"
  2301    ]
  2302  }
  2303  ```
  2304  
  2305  #### Count index
  2306  
  2307  For predicates with the `@count` directive, Dgraph indexes the number of edges out of each node.  This enables fast queries of the form:
  2308  ```
  2309  {
  2310    q(func: gt(count(pred), threshold)) {
  2311      ...
  2312    }
  2313  }
  2314  ```
  2315  
  2316  ### List Type
  2317  
  2318  Predicates with scalar types can also store a list of values if specified in the schema. The scalar
  2319  type needs to be enclosed within `[]` to indicate that it's a list type. These lists behave like an
  2320  unordered set.
  2321  
  2322  ```
  2323  occupations: [string] .
  2324  score: [int] .
  2325  ```
  2326  
  2327  * A set operation adds to the list of values. The order of the stored values is non-deterministic.
  2328  * A delete operation deletes the value from the list.
  2329  * Querying for these predicates would return the list in an array.
  2330  * Indexes can be applied on predicates which have a list type and you can use [Functions]({{<ref
  2331    "#functions">}}) on them.
  2332  * Sorting is not allowed using these predicates.
  2333  
  2334  
  2335  ### Reverse Edges
  2336  
  2337  A graph edge is unidirectional. For node-node edges, sometimes modeling requires reverse edges.  If only some subject-predicate-object triples have a reverse, these must be manually added.  But if a predicate always has a reverse, Dgraph computes the reverse edges if `@reverse` is specified in the schema.
  2338  
  2339  The reverse edge of `anEdge` is `~anEdge`.
  2340  
  2341  For existing data, Dgraph computes all reverse edges.  For data added after the schema mutation, Dgraph computes and stores the reverse edge for each added triple.
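
What "computing the reverse edge" means can be shown with a few lines of Python. This is a conceptual sketch only. The node names and the `boss_of` predicate are hypothetical, and Dgraph maintains these edges internally rather than through client code.

```python
from collections import defaultdict

# Hypothetical node-node triples for a predicate declared with @reverse.
triples = [
    ("alice", "boss_of", "bob"),
    ("alice", "boss_of", "carol"),
]

# For each triple `s p o`, record the reverse edge `o ~p s`.
reverse_edges = defaultdict(list)
for subject, predicate, obj in triples:
    reverse_edges[(obj, "~" + predicate)].append(subject)

print(dict(reverse_edges))
# {('bob', '~boss_of'): ['alice'], ('carol', '~boss_of'): ['alice']}
```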
  2342  
  2343  ### Querying Schema
  2344  
  2345  A schema query queries for the whole schema:
  2346  
  2347  ```
  2348  schema {}
  2349  ```
  2350  
  2351  {{% notice "note" %}} Unlike regular queries, the schema query is not surrounded
  2352  by curly braces. Also, schema queries and regular queries cannot be combined.
  2353  {{% /notice %}}
  2354  
  2355  You can query for particular schema fields in the query body.
  2356  
  2357  ```
  2358  schema {
  2359    type
  2360    index
  2361    reverse
  2362    tokenizer
  2363    list
  2364    count
  2365    upsert
  2366    lang
  2367  }
  2368  ```
  2369  
  2370  You can also query for particular predicates:
  2371  
  2372  ```
  2373  schema(pred: [name, friend]) {
  2374    type
  2375    index
  2376    reverse
  2377    tokenizer
  2378    list
  2379    count
  2380    upsert
  2381    lang
  2382  }
  2383  ```
  2384  
  2385  Types can also be queried. Below are some example queries.
  2386  
  2387  ```
  2388  schema(type: Movie) {}
  2389  schema(type: [Person, Animal]) {}
  2390  ```
  2391  
  2392  Note that type queries do not contain anything between the curly braces. The
  2393  output will be the entire definition of the requested types.
  2394  
  2395  ## Type System
  2396  
  2397  Dgraph supports a type system that can be used to categorize nodes and query
  2398  them based on their type. The type system is also used during expand queries.
  2399  
  2400  ### Type definition
  2401  
  2402  Types are defined using a GraphQL-like syntax. For example:
  2403  
  2404  ```
  2405  type Student {
  2406    name: string
  2407    dob: datetime
  2408    home_address: string
  2409    year: int
  2410    friends: [uid]
  2411  }
  2412  ```
  2413  
  2414  Types are declared along with the schema using the Alter endpoint. In order to
  2415  properly support the above type, a predicate for each of the attributes
  2416  in the type is also needed, such as:
  2417  
  2418  ```
  2419  name: string @index(term) .
  2420  dob: datetime .
  2421  home_address: string .
  2422  year: int .
  2423  friends: [uid] .
  2424  ```
  2425  
  2426  If a `uid` predicate has a reverse index, both the predicate and the
  2427  reverse predicate are part of any type definition that contains that predicate.
  2428  Expand queries will follow that convention.
  2429  
  2430  Edges can be used in multiple types: for example, `name` might be used for both
  2431  a person and a pet. Sometimes, however, it's required to use a different
  2432  predicate for each type to represent a similar concept. For example, if student
  2433  names and book names required different indexes, then the predicates must be
  2434  different.
  2435  
  2436  ```
  2437  type Student {
  2438    student_name: string
  2439  }
  2440  
  2441  type Textbook {
  2442    textbook_name: string
  2443  }
  2444  
  2445  student_name: string @index(exact) .
  2446  textbook_name: string @lang @index(fulltext) .
  2447  ```
  2448  
  2449  Types also support lists like `friends: [uid]` or `tags: [string]`.
  2450  
Altering the schema for a type that already exists overwrites the existing
definition.
  2453  
  2454  ### Setting the type of a node
  2455  
  2456  Scalar nodes cannot have types since they only have one attribute and its type
  2457  is the type of the node. UID nodes can have a type. The type is set by setting
  2458  the value of the `dgraph.type` predicate for that node. A node can have multiple
  2459  types. Here's an example of how to set the types of a node:
  2460  
  2461  ```
  2462  {
  2463    set {
  2464      _:a <name> "Garfield" .
  2465      _:a <dgraph.type> "Pet" .
  2466      _:a <dgraph.type> "Animal" .
  2467    }
  2468  }
  2469  ```
  2470  
  2471  `dgraph.type` is a reserved predicate and cannot be removed or modified.
  2472  
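As a minimal sketch of the mutation above, the following Go program builds the `dgraph.type` N-Quads for a node. The helper name `typeNQuads` is hypothetical and not part of any Dgraph client; it only formats strings and sends nothing to a server.

```go
package main

import (
	"fmt"
	"strings"
)

// typeNQuads returns one N-Quad line per type, each assigning a value of the
// dgraph.type predicate to the given blank node (hypothetical helper).
func typeNQuads(blankNode string, types ...string) []string {
	lines := make([]string, 0, len(types))
	for _, t := range types {
		lines = append(lines, fmt.Sprintf("_:%s <dgraph.type> %q .", blankNode, t))
	}
	return lines
}

func main() {
	// Wrap the generated lines in a set block, as in the example above.
	body := "{\n  set {\n    " +
		strings.Join(typeNQuads("a", "Pet", "Animal"), "\n    ") +
		"\n  }\n}"
	fmt.Println(body)
}
```

Because a node can have several types, the helper takes a variadic list and emits one line per type.
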
  2473  ### Using types during queries
  2474  
  2475  Types can be used as a top level function in the query language. For example:
  2476  
  2477  ```
  2478  {
  2479    q(func: type(Animal)) {
  2480      uid
  2481      name
  2482    }
  2483  }
  2484  ```
  2485  
  2486  This query will only return nodes whose type is set to `Animal`.
  2487  
  2488  Types can also be used to filter results inside a query. For example:
  2489  
  2490  ```
  2491  {
  2492    q(func: has(parent)) {
  2493      uid
  2494      parent @filter(type(Person)) {
  2495        uid
  2496        name
  2497      }
  2498    }
  2499  }
  2500  ```
  2501  
This query returns the nodes that have a `parent` predicate and, for each of
them, only the `parent` nodes of type `Person`.
  2504  
  2505  ### Deleting a type
  2506  
Type definitions can be deleted using the Alter endpoint. Send an operation
object with the field `DropOp` (or `drop_op`, depending on the client) set to
the enum value `TYPE` and the field `DropValue` (or `drop_value`) set to the
name of the type to delete.
  2511  
  2512  Below is an example deleting the type `Person` using the Go client:
  2513  ```go
err := c.Alter(context.Background(), &api.Operation{
	DropOp:    api.Operation_TYPE,
	DropValue: "Person",
})
  2517  ```
  2518  
  2519  ### Expand queries and types
  2520  
  2521  Queries using [expand]({{< relref "#expand-predicates" >}}) (i.e.:
  2522  `expand(_all_)`, `expand(_reverse_)`, or `expand(_forward_)`) require that the
  2523  nodes to expand have types.
  2524  
  2525  ## Facets : Edge attributes
  2526  
  2527  Dgraph supports facets --- **key value pairs on edges** --- as an extension to RDF triples. That is, facets add properties to edges, rather than to nodes.
  2528  For example, a `friend` edge between two nodes may have a boolean property of `close` friendship.
  2529  Facets can also be used as `weights` for edges.
  2530  
Facets are tempting to use widely, but they should not be misused.  It would be incorrect modeling to give the `friend` edge a facet `date_of_birth`; that should be an edge on the friend node.  A facet like `start_of_friendship`, however, might be appropriate.  Note that facets are not first-class citizens in Dgraph in the way predicates are.
  2532  
  2533  Facet keys are strings and values can be `string`, `bool`, `int`, `float` and `dateTime`.
  2534  For `int` and `float`, only 32-bit signed integers and 64-bit floats are accepted.
  2535  
The following mutation is used throughout this section on facets.  It adds data for some people and, for example, uses a `since` facet on `mobile` and `car` to record when Alice started using the mobile number and when she bought the car.
  2537  
  2538  First we add some schema.
  2539  ```sh
  2540  curl localhost:8080/alter -XPOST -d $'
  2541      name: string @index(exact, term) .
  2542      rated: [uid] @reverse @count .
  2543  ' | python -m json.tool | less
  2544  
  2545  ```
  2546  
  2547  ```sh
  2548  curl -H "Content-Type: application/rdf" localhost:8080/mutate?commitNow=true -XPOST -d $'
  2549  {
  2550    set {
  2551  
  2552      # -- Facets on scalar predicates
  2553      _:alice <name> "Alice" .
  2554      _:alice <mobile> "040123456" (since=2006-01-02T15:04:05) .
  2555      _:alice <car> "MA0123" (since=2006-02-02T13:01:09, first=true) .
  2556  
  2557      _:bob <name> "Bob" .
  2558      _:bob <car> "MA0134" (since=2006-02-02T13:01:09) .
  2559  
  2560      _:charlie <name> "Charlie" .
  2561      _:dave <name> "Dave" .
  2562  
  2563  
  2564      # -- Facets on UID predicates
  2565      _:alice <friend> _:bob (close=true, relative=false) .
  2566      _:alice <friend> _:charlie (close=false, relative=true) .
  2567      _:alice <friend> _:dave (close=true, relative=true) .
  2568  
  2569  
  2570      # -- Facets for variable propagation
  2571      _:movie1 <name> "Movie 1" .
  2572      _:movie2 <name> "Movie 2" .
  2573      _:movie3 <name> "Movie 3" .
  2574  
  2575      _:alice <rated> _:movie1 (rating=3) .
  2576      _:alice <rated> _:movie2 (rating=2) .
  2577      _:alice <rated> _:movie3 (rating=5) .
  2578  
  2579      _:bob <rated> _:movie1 (rating=5) .
  2580      _:bob <rated> _:movie2 (rating=5) .
  2581      _:bob <rated> _:movie3 (rating=5) .
  2582  
  2583      _:charlie <rated> _:movie1 (rating=2) .
  2584      _:charlie <rated> _:movie2 (rating=5) .
  2585      _:charlie <rated> _:movie3 (rating=1) .
  2586    }
  2587  }' | python -m json.tool | less
  2588  ```
  2589  
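The facet syntax used in the mutation above can also be generated programmatically. The sketch below is one way to render a facet map as `(key=value, ...)`. `formatFacets` is a hypothetical helper, and the value formatting is inferred from the examples in this section rather than from a formal grammar.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"time"
)

// formatFacets renders a facet map in the parenthesised key=value form used
// in the mutation above. Keys are sorted so the output is deterministic.
func formatFacets(facets map[string]interface{}) string {
	keys := make([]string, 0, len(facets))
	for k := range facets {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		switch v := facets[k].(type) {
		case string:
			parts = append(parts, fmt.Sprintf("%s=%q", k, v)) // strings are quoted
		case time.Time:
			parts = append(parts, k+"="+v.Format("2006-01-02T15:04:05"))
		default:
			// bool and numeric values are written bare. Note that a float64
			// such as 1.0 prints as 1 with this formatting.
			parts = append(parts, fmt.Sprintf("%s=%v", k, v))
		}
	}
	return "(" + strings.Join(parts, ", ") + ")"
}

func main() {
	since := time.Date(2006, 2, 2, 13, 1, 9, 0, time.UTC)
	fmt.Printf("_:alice <car> \"MA0123\" %s .\n",
		formatFacets(map[string]interface{}{"since": since, "first": true}))
}
```

Sorting the keys is a design choice for reproducible output only; the order of facets inside the parentheses does not matter in the examples shown here.
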
  2590  ### Facets on scalar predicates
  2591  
  2592  
  2593  Querying `name`, `mobile` and `car` of Alice gives the same result as without facets.
  2594  
  2595  {{< runnable >}}
  2596  {
  2597    data(func: eq(name, "Alice")) {
  2598       name
  2599       mobile
  2600       car
  2601    }
  2602  }
  2603  {{</ runnable >}}
  2604  
  2605  
The syntax `@facets(facet-name)` is used to query facet data. For Alice, the `since` facets on `mobile` and `car` are queried as follows.
  2607  
  2608  {{< runnable >}}
  2609  {
  2610    data(func: eq(name, "Alice")) {
  2611       name
  2612       mobile @facets(since)
  2613       car @facets(since)
  2614    }
  2615  }
  2616  {{</ runnable >}}
  2617  
  2618  
Facets are returned at the same level as the corresponding edge and have keys of the form `edge|facet`.
  2620  
  2621  All facets on an edge are queried with `@facets`.
  2622  
  2623  {{< runnable >}}
  2624  {
  2625    data(func: eq(name, "Alice")) {
  2626       name
  2627       mobile @facets
  2628       car @facets
  2629    }
  2630  }
  2631  {{</ runnable >}}
  2632  
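Because facets come back as ordinary JSON keys of the form `edge|facet`, a client can decode them with plain struct tags. The following Go sketch assumes a response shaped as described above; the `sample` payload is hand-written for illustration and is not captured server output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// person mirrors one object of the result list. Facets arrive as ordinary
// JSON keys of the form edge|facet, so they can be mapped with struct tags.
type person struct {
	Name        string `json:"name"`
	Mobile      string `json:"mobile"`
	MobileSince string `json:"mobile|since"`
	Car         string `json:"car"`
	CarSince    string `json:"car|since"`
}

// decodePeople parses a body of the shape {"data": {"data": [...]}}, where
// the inner "data" key is the name of the query block.
func decodePeople(body []byte) ([]person, error) {
	var resp struct {
		Data struct {
			Data []person `json:"data"`
		} `json:"data"`
	}
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	return resp.Data.Data, nil
}

// sample is a hand-written response used only for this illustration.
const sample = `{"data": {"data": [{
  "name": "Alice",
  "mobile": "040123456", "mobile|since": "2006-01-02T15:04:05Z",
  "car": "MA0123", "car|since": "2006-02-02T13:01:09Z"
}]}}`

func main() {
	people, err := decodePeople([]byte(sample))
	if err != nil {
		panic(err)
	}
	for _, p := range people {
		fmt.Printf("%s: mobile since %s, car since %s\n", p.Name, p.MobileSince, p.CarSince)
	}
}
```
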
  2633  ### Facets i18n
  2634  
Facet keys and values can use language-specific characters directly when mutating, but facet keys must be enclosed in angle brackets `<>` when querying. This is similar to predicates. See [Predicates i18n](#predicates-i18n) for more info.
  2636  
  2637  {{% notice "note" %}}Dgraph supports [Internationalized Resource Identifiers](https://en.wikipedia.org/wiki/Internationalized_Resource_Identifier) (IRIs) for facet keys when querying.{{% /notice  %}}
  2638  
  2639  Example:
  2640  ```
  2641  {
  2642    set {
  2643      _:person1 <name> "Daniel" (वंश="स्पेनी", ancestry="Español") .
  2644      _:person2 <name> "Raj" (वंश="हिंदी", ancestry="हिंदी") .
  2645      _:person3 <name> "Zhang Wei" (वंश="चीनी", ancestry="中文") .
  2646    }
  2647  }
  2648  ```
Query (note the `<>` around the facet key):
  2650  ```
  2651  {
  2652    q(func: has(name)) {
  2653      name @facets(<वंश>)
  2654    }
  2655  }
  2656  ```
  2657  
  2658  ### Alias with facets
  2659  
An alias can be specified when requesting specific facets. The syntax is the same as for
aliasing other predicates. `orderasc` and `orderdesc` cannot be used as aliases because they have a
special meaning; any other name is allowed.

Here we set the aliases `car_since` and `close_friend` for the `since` and `close` facets respectively.
  2665  {{< runnable >}}
  2666  {
  2667     data(func: eq(name, "Alice")) {
  2668       name
  2669       mobile
  2670       car @facets(car_since: since)
  2671       friend @facets(close_friend: close) {
  2672         name
  2673       }
  2674     }
  2675  }
  2676  {{</ runnable >}}
  2677  
  2678  
  2679  
  2680  ### Facets on UID predicates
  2681  
  2682  Facets on UID edges work similarly to facets on value edges.
  2683  
  2684  For example, `friend` is an edge with facet `close`.
  2685  It was set to true for friendship between Alice and Bob
  2686  and false for friendship between Alice and Charlie.
  2687  
  2688  A query for friends of Alice.
  2689  
  2690  {{< runnable >}}
  2691  {
  2692    data(func: eq(name, "Alice")) {
  2693      name
  2694      friend {
  2695        name
  2696      }
  2697    }
  2698  }
  2699  {{</ runnable >}}
  2700  
  2701  A query for friends and the facet `close` with `@facets(close)`.
  2702  
  2703  {{< runnable >}}
  2704  {
  2705     data(func: eq(name, "Alice")) {
  2706       name
  2707       friend @facets(close) {
  2708         name
  2709       }
  2710     }
  2711  }
  2712  {{</ runnable >}}
  2713  
  2714  
  2715  For uid edges like `friend`, facets go to the corresponding child under the key edge|facet. In the above
  2716  example you can see that the `close` facet on the edge between Alice and Bob appears with the key `friend|close`
  2717  along with Bob's results.
  2718  
  2719  {{< runnable >}}
  2720  {
  2721    data(func: eq(name, "Alice")) {
  2722      name
  2723      friend @facets {
  2724        name
  2725        car @facets
  2726      }
  2727    }
  2728  }
  2729  {{</ runnable >}}
  2730  
Bob has a `car` and it has a facet `since`, which, in the results, is part of the same object as Bob
under the key `car|since`.
Also, the `close` relationship between Bob and Alice is part of Bob's output object.
Charlie has no `car` edge, so only the facets of the `friend` edge appear in his output.
  2735  
  2736  ### Filtering on facets
  2737  
  2738  Dgraph supports filtering edges based on facets.
  2739  Filtering works similarly to how it works on edges without facets and has the same available functions.
  2740  
  2741  
Find Alice's close friends:
  2743  {{< runnable >}}
  2744  {
  2745    data(func: eq(name, "Alice")) {
  2746      friend @facets(eq(close, true)) {
  2747        name
  2748      }
  2749    }
  2750  }
  2751  {{</ runnable >}}
  2752  
  2753  
  2754  To return facets as well as filter, add another `@facets(<facetname>)` to the query.
  2755  
  2756  {{< runnable >}}
  2757  {
  2758    data(func: eq(name, "Alice")) {
  2759      friend @facets(eq(close, true)) @facets(relative) { # filter close friends and give relative status
  2760        name
  2761      }
  2762    }
  2763  }
  2764  {{</ runnable >}}
  2765  
  2766  
  2767  Facet queries can be composed with `AND`, `OR` and `NOT`.
  2768  
  2769  {{< runnable >}}
  2770  {
  2771    data(func: eq(name, "Alice")) {
  2772      friend @facets(eq(close, true) AND eq(relative, true)) @facets(relative) { # filter close friends in my relation
  2773        name
  2774      }
  2775    }
  2776  }
  2777  {{</ runnable >}}
  2778  
  2779  
  2780  ### Sorting using facets
  2781  
  2782  Sorting is possible for a facet on a uid edge. Here we sort the movies rated by Alice, Bob and
  2783  Charlie by their `rating` which is a facet.
  2784  
  2785  {{< runnable >}}
  2786  {
  2787    me(func: anyofterms(name, "Alice Bob Charlie")) {
  2788      name
  2789      rated @facets(orderdesc: rating) {
  2790        name
  2791      }
  2792    }
  2793  }
  2794  {{</ runnable >}}
  2795  
  2796  
  2797  
  2798  ### Assigning Facet values to a variable
  2799  
  2800  Facets on UID edges can be stored in [value variables]({{< relref "#value-variables" >}}).  The variable is a map from the edge target to the facet value.
  2801  
Alice's friends, reported via variables for `close` and `relative`:
  2803  {{< runnable >}}
  2804  {
  2805    var(func: eq(name, "Alice")) {
  2806      friend @facets(a as close, b as relative)
  2807    }
  2808  
  2809    friend(func: uid(a)) {
  2810      name
  2811      val(a)
  2812    }
  2813  
  2814    relative(func: uid(b)) {
  2815      name
  2816      val(b)
  2817    }
  2818  }
  2819  {{</ runnable >}}
  2820  
  2821  
  2822  ### Facets and Variable Propagation
  2823  
  2824  Facet values of `int` and `float` can be assigned to variables and thus the [values propagate]({{< relref "#variable-propagation" >}}).
  2825  
  2826  
  2827  Alice, Bob and Charlie each rated every movie.  A value variable on facet `rating` maps movies to ratings.  A query that reaches a movie through multiple paths sums the ratings on each path.  The following sums Alice, Bob and Charlie's ratings for the three movies.
  2828  
  2829  {{<runnable >}}
  2830  {
  2831    var(func: anyofterms(name, "Alice Bob Charlie")) {
  2832      num_raters as math(1)
  2833      rated @facets(r as rating) {
  2834        total_rating as math(r) # sum of the 3 ratings
  2835        average_rating as math(total_rating / num_raters)
  2836      }
  2837    }
  2838    data(func: uid(total_rating)) {
  2839      name
  2840      val(total_rating)
  2841      val(average_rating)
  2842    }
  2843  
  2844  }
  2845  {{</ runnable >}}
  2846  
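To make the propagation concrete, here is a small Go simulation of the query above, using the ratings from the mutation at the start of this section. It is only an arithmetic model (the function `propagate` is hypothetical), not how Dgraph evaluates the query.

```go
package main

import "fmt"

// rating is one <rated> edge with its rating facet, taken from the mutation
// at the start of this section.
type rating struct {
	user, movie string
	value       float64
}

var ratings = []rating{
	{"Alice", "Movie 1", 3}, {"Alice", "Movie 2", 2}, {"Alice", "Movie 3", 5},
	{"Bob", "Movie 1", 5}, {"Bob", "Movie 2", 5}, {"Bob", "Movie 3", 5},
	{"Charlie", "Movie 1", 2}, {"Charlie", "Movie 2", 5}, {"Charlie", "Movie 3", 1},
}

// propagate models the query: every path into a movie adds its rating facet
// to total and adds the parent's num_raters value (1) to raters.
func propagate(rs []rating) (total, raters map[string]float64) {
	total, raters = map[string]float64{}, map[string]float64{}
	for _, r := range rs {
		total[r.movie] += r.value
		raters[r.movie]++
	}
	return total, raters
}

func main() {
	total, raters := propagate(ratings)
	for _, m := range []string{"Movie 1", "Movie 2", "Movie 3"} {
		fmt.Printf("%s: total=%v average=%.2f\n", m, total[m], total[m]/raters[m])
	}
}
```

With this data the totals are 10, 12 and 11, and each movie is reached by three paths, so the averages are those totals divided by 3.
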
  2847  
  2848  
  2849  ### Facets and Aggregation
  2850  
  2851  Facet values assigned to value variables can be aggregated.
  2852  
  2853  {{< runnable >}}
  2854  {
  2855    data(func: eq(name, "Alice")) {
  2856      name
  2857      rated @facets(r as rating) {
  2858        name
  2859      }
  2860      avg(val(r))
  2861    }
  2862  }
  2863  {{</ runnable >}}
  2864  
  2865  
  2866  Note though that `r` is a map from movies to the sum of ratings on edges in the query reaching the movie.  Hence, the following does not correctly calculate the average ratings for Alice and Bob individually --- it calculates 2 times the average of both Alice and Bob's ratings.
  2867  
  2868  {{< runnable >}}
  2869  
  2870  {
  2871    data(func: anyofterms(name, "Alice Bob")) {
  2872      name
  2873      rated @facets(r as rating) {
  2874        name
  2875      }
  2876      avg(val(r))
  2877    }
  2878  }
  2879  {{</ runnable >}}
  2880  
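The arithmetic behind this pitfall can be checked with a short Go sketch using the ratings from the mutation at the start of this section. The helper names are hypothetical; the code only models the summing behaviour described above.

```go
package main

import "fmt"

// Rating facets from the mutation, indexed by movie (Movie 1 to Movie 3).
var (
	alice = []float64{3, 2, 5}
	bob   = []float64{5, 5, 5}
)

// mean returns the arithmetic mean of xs.
func mean(xs []float64) float64 {
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// summedPerMovie models the variable r when several users are at the root:
// each movie maps to the sum of the ratings on all edges reaching it.
func summedPerMovie(users ...[]float64) []float64 {
	out := make([]float64, len(users[0]))
	for _, u := range users {
		for i, v := range u {
			out[i] += v
		}
	}
	return out
}

func main() {
	r := summedPerMovie(alice, bob) // one summed value per movie
	both := append(append([]float64{}, alice...), bob...)
	fmt.Printf("avg(val(r))      = %.3f\n", mean(r))
	fmt.Printf("Alice's average  = %.3f\n", mean(alice))
	fmt.Printf("Bob's average    = %.3f\n", mean(bob))
	fmt.Printf("combined average = %.3f\n", mean(both))
}
```

The value computed from `r` is twice the combined average because three summed values are averaged instead of six individual ratings.
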
  2881  Calculating the average ratings of users requires a variable that maps users to the sum of their ratings.
  2882  
  2883  {{< runnable >}}
  2884  
  2885  {
  2886    var(func: has(rated)) {
  2887      num_rated as math(1)
  2888      rated @facets(r as rating) {
  2889        avg_rating as math(r / num_rated)
  2890      }
  2891    }
  2892  
  2893    data(func: uid(avg_rating)) {
  2894      name
  2895      val(avg_rating)
  2896    }
  2897  }
  2898  {{</ runnable >}}
  2899  
  2900  ## K-Shortest Path Queries
  2901  
  2902  The shortest path between a source (`from`) node and destination (`to`) node can be found using the keyword `shortest` for the query block name. It requires the source node UID, destination node UID and the predicates (at least one) that have to be considered for traversal. A `shortest` query block returns the shortest path under `_path_` in the query response. The path can also be stored in a variable which is used in other query blocks.
  2903  
  2904  By default the shortest path is returned. With `numpaths: k`, the k-shortest paths are returned. With `depth: n`, the shortest paths up to `n` hops away are returned.
  2905  
  2906  {{% notice "note" %}}
  2907  - If no predicates are specified in the `shortest` block, no path can be fetched as no edge is traversed.
  2908  - If you're seeing queries take a long time, you can set a [gRPC deadline](https://grpc.io/blog/deadlines) to stop the query after a certain amount of time.
  2909  {{% /notice %}}
  2910  
  2911  For example:
  2912  
  2913  ```sh
  2914  curl localhost:8080/alter -XPOST -d $'
  2915      name: string @index(exact) .
  2916  ' | python -m json.tool | less
  2917  ```
  2918  
  2919  ```sh
  2920  curl -H "Content-Type: application/rdf" localhost:8080/mutate?commitNow=true -XPOST -d $'
  2921  {
  2922    set {
  2923      _:a <friend> _:b (weight=0.1) .
  2924      _:b <friend> _:c (weight=0.2) .
  2925      _:c <friend> _:d (weight=0.3) .
  2926      _:a <friend> _:d (weight=1) .
  2927      _:a <name> "Alice" .
  2928      _:b <name> "Bob" .
  2929      _:c <name> "Tom" .
  2930      _:d <name> "Mallory" .
  2931    }
  2932  }' | python -m json.tool | less
  2933  ```
  2934  
  2935  The shortest path between Alice and Mallory (assuming UIDs 0x2 and 0x5 respectively) can be found with query:
  2936  
  2937  ```sh
  2938  curl -H "Content-Type: application/graphql+-" localhost:8080/query -XPOST -d $'{
  2939   path as shortest(from: 0x2, to: 0x5) {
  2940    friend
  2941   }
  2942   path(func: uid(path)) {
  2943     name
  2944   }
  2945  }' | python -m json.tool | less
  2946  ```
  2947  
This returns the following result. (Without the `weight` facet, each edge's weight is taken to be 1.)
  2949  
  2950  ```
  2951  {
  2952    "data": {
  2953      "path": [
  2954        {
  2955          "name": "Alice"
  2956        },
  2957        {
  2958          "name": "Mallory"
  2959        }
  2960      ],
  2961      "_path_": [
  2962        {
  2963          "uid": "0x2",
  2964          "friend": [
  2965            {
  2966              "uid": "0x5"
  2967            }
  2968          ]
  2969        }
  2970      ]
  2971    }
  2972  }
  2973  ```
  2974  
  2975  We can return more paths by specifying `numpaths`. Setting `numpaths: 2` returns the shortest two paths:
  2976  
  2977  ```sh
  2978  curl -H "Content-Type: application/graphql+-" localhost:8080/query -XPOST -d $'{
  2979  
  2980   A as var(func: eq(name, "Alice"))
  2981   M as var(func: eq(name, "Mallory"))
  2982  
  2983   path as shortest(from: uid(A), to: uid(M), numpaths: 2) {
  2984    friend
  2985   }
  2986   path(func: uid(path)) {
  2987     name
  2988   }
  2989  }' | python -m json.tool | less
  2990  ```
  2991  
  2992  {{% notice "note" %}}In the query above, instead of using UID literals, we query both people using var blocks and the `uid()` function. You can also combine it with [GraphQL Variables]({{< relref "#graphql-variables" >}}).{{% /notice %}}
  2993  
Edge weights are included by using facets on the edges, as follows.
  2995  
  2996  {{% notice "note" %}}Only one facet per predicate is allowed in the shortest query block.{{% /notice %}}
  2997  
  2998  ```sh
  2999  curl -H "Content-Type: application/graphql+-" localhost:8080/query -XPOST -d $'{
  3000   path as shortest(from: 0x2, to: 0x5) {
  3001    friend @facets(weight)
  3002   }
  3003  
  3004   path(func: uid(path)) {
  3005    name
  3006   }
  3007  }' | python -m json.tool | less
  3008  ```
  3009  
  3010  ```
  3011  {
  3012    "data": {
  3013      "path": [
  3014        {
  3015          "name": "Alice"
  3016        },
  3017        {
  3018          "name": "Bob"
  3019        },
  3020        {
  3021          "name": "Tom"
  3022        },
  3023        {
  3024          "name": "Mallory"
  3025        }
  3026      ],
  3027      "_path_": [
  3028        {
  3029          "uid": "0x2",
  3030          "friend": [
  3031            {
  3032              "uid": "0x3",
  3033              "friend|weight": 0.1,
  3034              "friend": [
  3035                {
  3036                  "uid": "0x4",
  3037                  "friend|weight": 0.2,
  3038                  "friend": [
  3039                    {
  3040                      "uid": "0x5",
  3041                      "friend|weight": 0.3
  3042                    }
  3043                  ]
  3044                }
  3045              ]
  3046            }
  3047          ]
  3048        }
  3049      ]
  3050    }
  3051  }
  3052  ```
  3053  
  3054  Constraints can be applied to the intermediate nodes as follows.
  3055  
  3056  ```sh
  3057  curl -H "Content-Type: application/graphql+-" localhost:8080/query -XPOST -d $'{
  3058    path as shortest(from: 0x2, to: 0x5) {
  3059      friend @filter(not eq(name, "Bob")) @facets(weight)
  3060      relative @facets(liking)
  3061    }
  3062  
  3063    relationship(func: uid(path)) {
  3064      name
  3065    }
  3066  }' | python -m json.tool | less
  3067  ```
  3068  
The k-shortest path algorithm (used when `numpaths` > 1) also accepts the arguments `minweight` and `maxweight`, which take a float as their value. When they are passed, only paths whose total weight lies in the range `[minweight, maxweight]` are considered valid. Because each edge has weight 1 when no facet is used, this can be used, for example, to query the shortest paths that traverse between 2 and 4 edges.
  3070  
  3071  ```sh
  3072  curl -H "Content-Type: application/graphql+-" localhost:8080/query -XPOST -d $'{
  3073   path as shortest(from: 0x2, to: 0x5, numpaths: 2, minweight: 2, maxweight: 4) {
  3074    friend
  3075   }
  3076   path(func: uid(path)) {
  3077     name
  3078   }
  3079  }' | python -m json.tool | less
  3080  ```
  3081  
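As a rough model of how a weight range narrows the result, the Go sketch below enumerates simple paths in the example graph by brute force, keeps those whose weight falls in `[minweight, maxweight]`, and returns the `k` lightest. This is an illustration only: it is not Dgraph's k-shortest path algorithm, `pathsInRange` is a hypothetical name, and every edge is assumed to weigh 1 because no facet is used.

```go
package main

import (
	"fmt"
	"sort"
)

// friend lists the friend edges from the example mutation, without weights.
var friend = map[string][]string{
	"Alice": {"Bob", "Mallory"},
	"Bob":   {"Tom"},
	"Tom":   {"Mallory"},
}

// contains reports whether x occurs in xs.
func contains(xs []string, x string) bool {
	for _, v := range xs {
		if v == x {
			return true
		}
	}
	return false
}

// pathsInRange enumerates simple paths from src to dst, keeps those whose
// weight (one per edge here) lies in [minW, maxW], and returns the k lightest.
func pathsInRange(g map[string][]string, src, dst string, k int, minW, maxW float64) [][]string {
	var found [][]string
	var walk func(path []string)
	walk = func(path []string) {
		cur := path[len(path)-1]
		if cur == dst {
			if w := float64(len(path) - 1); w >= minW && w <= maxW {
				found = append(found, append([]string(nil), path...)) // store a copy
			}
			return
		}
		for _, next := range g[cur] {
			if !contains(path, next) { // keep paths simple (no repeated node)
				walk(append(path, next))
			}
		}
	}
	walk([]string{src})
	sort.SliceStable(found, func(i, j int) bool { return len(found[i]) < len(found[j]) })
	if len(found) > k {
		found = found[:k]
	}
	return found
}

func main() {
	for _, p := range pathsInRange(friend, "Alice", "Mallory", 2, 2, 4) {
		fmt.Println(p)
	}
}
```

In this model the direct Alice to Mallory edge has weight 1 and falls outside the range, so only the three-edge path through Bob and Tom remains.
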
  3082  Some points to keep in mind for shortest path queries:
  3083  
  3084  - Weights must be non-negative. Dijkstra's algorithm is used to calculate the shortest paths.
  3085  - Only one facet per predicate in the shortest query block is allowed.
  3086  - Only one `shortest` path block is allowed per query. Only one `_path_` is returned in the result.
  3087  - For k-shortest paths (when `numpaths` > 1), the result of the shortest path query variable will only return a single path. All k paths are returned in `_path_`.
  3088  
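For readers unfamiliar with Dijkstra's algorithm, the following Go sketch runs it over the `friend` edges from the example mutation, once with every edge weighing 1 and once with the `weight` facet. It is a textbook implementation for illustration, not Dgraph's code; all type and function names are hypothetical.

```go
package main

import (
	"container/heap"
	"fmt"
)

type edge struct {
	to     string
	weight float64
}

// item is a queue entry: a node and the cost of the best known path to it.
type item struct {
	node string
	cost float64
}

type minHeap []item

func (h minHeap) Len() int            { return len(h) }
func (h minHeap) Less(i, j int) bool  { return h[i].cost < h[j].cost }
func (h minHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x interface{}) { *h = append(*h, x.(item)) }
func (h *minHeap) Pop() interface{} {
	old := *h
	it := old[len(old)-1]
	*h = old[:len(old)-1]
	return it
}

// exampleGraph builds the friend edges from the mutation above. When
// useFacet is false every edge costs 1, as in a query without @facets(weight).
func exampleGraph(useFacet bool) map[string][]edge {
	g := map[string][]edge{}
	add := func(from, to string, w float64) {
		if !useFacet {
			w = 1
		}
		g[from] = append(g[from], edge{to, w})
	}
	add("Alice", "Bob", 0.1)
	add("Bob", "Tom", 0.2)
	add("Tom", "Mallory", 0.3)
	add("Alice", "Mallory", 1)
	return g
}

// shortest returns the cheapest path from src to dst and its total weight.
// It returns a nil path when dst is unreachable. Weights must be non-negative.
func shortest(g map[string][]edge, src, dst string) ([]string, float64) {
	dist := map[string]float64{src: 0}
	prev := map[string]string{}
	done := map[string]bool{}
	h := &minHeap{{src, 0}}
	for h.Len() > 0 {
		cur := heap.Pop(h).(item)
		if done[cur.node] {
			continue // stale queue entry
		}
		done[cur.node] = true
		if cur.node == dst {
			break
		}
		for _, e := range g[cur.node] {
			next := cur.cost + e.weight
			if d, ok := dist[e.to]; !ok || next < d {
				dist[e.to] = next
				prev[e.to] = cur.node
				heap.Push(h, item{e.to, next})
			}
		}
	}
	if _, ok := dist[dst]; !ok {
		return nil, 0
	}
	path := []string{dst}
	for n := dst; n != src; n = prev[n] {
		path = append([]string{prev[n]}, path...)
	}
	return path, dist[dst]
}

func main() {
	for _, useFacet := range []bool{false, true} {
		path, cost := shortest(exampleGraph(useFacet), "Alice", "Mallory")
		fmt.Printf("useFacet=%v path=%v cost=%.1f\n", useFacet, path, cost)
	}
}
```

The two runs mirror the two query results above: without weights the direct edge wins, and with the `weight` facet the route through Bob and Tom is cheaper.
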
  3089  ## Recurse Query
  3090  
`Recurse` queries let you traverse a set of predicates (with filters, facets, etc.) until all leaf nodes are reached or the maximum depth, specified by the `depth` parameter, is reached.

To get 10 movies from a genre that has more than 30000 films and then get two actors for those movies, we could use the following query:
  3094  {{< runnable >}}
  3095  {
  3096  	me(func: gt(count(~genre), 30000), first: 1) @recurse(depth: 5, loop: true) {
  3097  		name@en
  3098  		~genre (first:10) @filter(gt(count(starring), 2))
  3099  		starring (first: 2)
  3100  		performance.actor
  3101  	}
  3102  }
  3103  {{< /runnable >}}
  3104  Some points to keep in mind while using recurse queries are:
  3105  
  3106  - You can specify only one level of predicates after root. These would be traversed recursively. Both scalar and entity-nodes are treated similarly.
  3107  - Only one recurse block is advised per query.
  3108  - Be careful as the result size could explode quickly and an error would be returned if the result set gets too large. In such cases use more filters, limit results using pagination, or provide a depth parameter at root as shown in the example above.
  3109  - The `loop` parameter can be set to false, in which case paths which lead to a loop would be ignored
  3110    while traversing.
  3111  - If not specified, the value of the `loop` parameter defaults to false.
  3112  - If the value of the `loop` parameter is false and depth is not specified, `depth` will default to `math.MaxUint64`, which means that the entire graph might be traversed until all the leaf nodes are reached.
  3113  
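The interaction of `depth` and `loop` can be modelled with a level-by-level traversal. The Go sketch below is a simplified model under stated assumptions, not Dgraph's implementation: `maxHops` counts edge hops from the root (which may not match how Dgraph counts `depth`), and with `loop` set to false an edge that was already followed is skipped.

```go
package main

import (
	"errors"
	"fmt"
)

// recurse returns the nodes reached at each hop, starting from root.
// maxHops bounds the number of hops; maxHops <= 0 means unbounded, which this
// sketch only permits when loop is false (otherwise a cycle would never end).
func recurse(g map[string][]string, root string, maxHops int, loop bool) ([][]string, error) {
	if loop && maxHops <= 0 {
		return nil, errors.New("loop=true needs a hop limit")
	}
	type edge struct{ from, to string }
	followed := map[edge]bool{}
	levels := [][]string{{root}}
	frontier := []string{root}
	for hop := 1; len(frontier) > 0 && (maxHops <= 0 || hop <= maxHops); hop++ {
		var next []string
		for _, from := range frontier {
			for _, to := range g[from] {
				e := edge{from, to}
				if !loop && followed[e] {
					continue // following this edge again would repeat a loop
				}
				followed[e] = true
				next = append(next, to)
			}
		}
		if len(next) > 0 {
			levels = append(levels, next)
		}
		frontier = next
	}
	return levels, nil
}

func main() {
	// A three-node cycle: a -> b -> c -> a.
	g := map[string][]string{"a": {"b"}, "b": {"c"}, "c": {"a"}}
	levels, err := recurse(g, "a", 0, false)
	fmt.Println(levels, err)
}
```

In this model, a cyclic graph terminates without a hop limit only because already-followed edges are skipped, which is why an unbounded traversal is paired with `loop` set to false.
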
  3114  
  3115  ## Fragments
  3116  
The `fragment` keyword lets you define fragments that can be referenced in a query, as per the [GraphQL specification](https://facebook.github.io/graphql/#sec-Language.Fragments). If multiple parts of a query request the same set of fields, you can define a fragment once and refer to it multiple times. Fragments can be nested inside fragments, but cycles are not allowed. Here is a contrived example.
  3118  
  3119  ```sh
  3120  curl -H "Content-Type: application/graphql+-" localhost:8080/query -XPOST -d $'
  3121  query {
  3122    debug(func: uid(1)) {
  3123      name@en
  3124      ...TestFrag
  3125    }
  3126  }
  3127  fragment TestFrag {
  3128    initial_release_date
  3129    ...TestFragB
  3130  }
  3131  fragment TestFragB {
  3132    country
  3133  }' | python -m json.tool | less
  3134  ```
  3135  
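One way to picture nested fragments and the no-cycle rule is as a recursive expansion of field lists. The Go sketch below is a toy model with hypothetical names (`expand`, and fragments stored as string slices); it is not Dgraph's parser.

```go
package main

import (
	"fmt"
	"strings"
)

// expand replaces every "...Name" entry with the fields of fragment Name,
// recursively. It reports an error for unknown fragments and for cycles.
// active holds the fragments currently being expanded.
func expand(fields []string, frags map[string][]string, active map[string]bool) ([]string, error) {
	var out []string
	for _, f := range fields {
		if !strings.HasPrefix(f, "...") {
			out = append(out, f) // a plain field
			continue
		}
		name := strings.TrimPrefix(f, "...")
		body, ok := frags[name]
		if !ok {
			return nil, fmt.Errorf("unknown fragment %q", name)
		}
		if active[name] {
			return nil, fmt.Errorf("fragment cycle through %q", name)
		}
		active[name] = true
		sub, err := expand(body, frags, active)
		if err != nil {
			return nil, err
		}
		delete(active, name)
		out = append(out, sub...)
	}
	return out, nil
}

func main() {
	// The fragments from the example above.
	frags := map[string][]string{
		"TestFrag":  {"initial_release_date", "...TestFragB"},
		"TestFragB": {"country"},
	}
	fields, err := expand([]string{"name@en", "...TestFrag"}, frags, map[string]bool{})
	fmt.Println(fields, err)
}
```

For the example query, the expansion yields the fields `name@en`, `initial_release_date` and `country`.
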
  3136  ## GraphQL Variables
  3137  
Variables can be defined and used in queries. Passing a separate variable map helps with query reuse and avoids costly string building in clients at runtime. A variable starts with a `$` symbol.
For **HTTP requests** with GraphQL Variables, use the `Content-Type: application/json` header and pass a JSON object containing `query` and `variables`.
  3140  
  3141  ```sh
  3142  curl -H "Content-Type: application/json" localhost:8080/query -XPOST -d $'{
  3143    "query": "query test($a: string) { test(func: eq(name, $a)) { \n uid \n name \n } }",
  3144    "variables": { "$a": "Alice" }
  3145  }' | python -m json.tool | less
  3146  ```
  3147  
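The same request body can be built in Go with `encoding/json`. This is a minimal sketch: `queryRequest` and `buildQueryBody` are hypothetical names, and the code only produces the JSON; sending it (for example with `net/http` and the `Content-Type: application/json` header) is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// queryRequest is the JSON body described above: the query text plus a map
// from variable name (including the leading $) to its value as a string.
type queryRequest struct {
	Query     string            `json:"query"`
	Variables map[string]string `json:"variables"`
}

// buildQueryBody serialises a query and its variable map into one JSON object.
func buildQueryBody(query string, vars map[string]string) ([]byte, error) {
	return json.Marshal(queryRequest{Query: query, Variables: vars})
}

func main() {
	body, err := buildQueryBody(
		"query test($a: string) { test(func: eq(name, $a)) { uid name } }",
		map[string]string{"$a": "Alice"},
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Letting the JSON encoder handle quoting avoids the manual escaping that the shell example needs for newlines and quotes.
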
  3148  {{< runnable vars="{\"$a\": \"5\", \"$b\": \"10\", \"$name\": \"Steven Spielberg\"}" >}}
  3149  query test($a: int, $b: int, $name: string) {
  3150    me(func: allofterms(name@en, $name)) {
  3151      name@en
  3152      director.film (first: $a, offset: $b) {
      name@en
  3154        genre(first: $a) {
  3155          name@en
  3156        }
  3157      }
  3158    }
  3159  }
  3160  {{< /runnable >}}
  3161  
  3162  * Variables can have default values. In the example below, `$a` has a default value of `2`. Since the value for `$a` isn't provided in the variable map, `$a` takes on the default value.
  3163  * Variables whose type is suffixed with a `!` can't have a default value but must have a value as part of the variables map.
* The value of a variable must be parsable as the given type; otherwise, an error is returned.
  3165  * The variable types that are supported as of now are: `int`, `float`, `bool` and `string`.
  3166  * Any variable that is being used must be declared in the named query clause in the beginning.
  3167  
  3168  {{< runnable vars="{\"$b\": \"10\", \"$name\": \"Steven Spielberg\"}" >}}
  3169  query test($a: int = 2, $b: int!, $name: string) {
  3170    me(func: allofterms(name@en, $name)) {
  3171      director.film (first: $a, offset: $b) {
  3172        genre(first: $a) {
  3173          name@en
  3174        }
  3175      }
  3176    }
  3177  }
  3178  {{< /runnable >}}
  3179  
You can also use arrays with GraphQL Variables.
  3181  
  3182  {{< runnable vars="{\"$b\": \"10\", \"$aName\": \"Steven Spielberg\", \"$bName\": \"Quentin Tarantino\"}" >}}
  3183  query test($a: int = 2, $b: int!, $aName: string, $bName: string) {
  3184    me(func: eq(name@en, [$aName, $bName])) {
  3185      director.film (first: $a, offset: $b) {
  3186        genre(first: $a) {
  3187          name@en
  3188        }
  3189      }
  3190    }
  3191  }
  3192  {{< /runnable >}}
  3193  
  3194  
  3195  {{% notice "note" %}}
  3196  If you want to input a list of uids as a GraphQL variable value, you can have the variable as string type and
  3197  have the value surrounded by square brackets like `["13", "14"]`.
  3198  {{% /notice %}}
  3199  
  3200  ## Indexing with Custom Tokenizers
  3201  
Dgraph comes with a large toolkit of built-in indexes, but for niche use cases
they are sometimes not enough.

To fill the gaps, Dgraph lets you implement custom tokenizers via a plugin
system.
  3207  
  3208  ### Caveats
  3209  
  3210  The plugin system uses Go's [`pkg/plugin`](https://golang.org/pkg/plugin/).
  3211  This brings some restrictions to how plugins can be used.
  3212  
  3213  - Plugins must be written in Go.
  3214  
  3215  - As of Go 1.9, `pkg/plugin` only works on Linux. Therefore, plugins will only
  3216    work on Dgraph instances deployed in a Linux environment.
  3217  
- The version of Go used to compile the plugin should be the same as the version
  of Go used to compile Dgraph itself. Dgraph always uses the latest version of
  Go (and so should you!).
  3221  
  3222  ### Implementing a plugin
  3223  
  3224  {{% notice "note" %}}
  3225  You should consider Go's [plugin](https://golang.org/pkg/plugin/) documentation
  3226  to be supplementary to the documentation provided here.
  3227  {{% /notice %}}
  3228  
  3229  Plugins are implemented as their own main package. They must export a
  3230  particular symbol that allows Dgraph to hook into the custom logic the plugin
  3231  provides.
  3232  
  3233  The plugin must export a symbol named `Tokenizer`. The type of the symbol must
  3234  be `func() interface{}`. When the function is called the result returned should
  3235  be a value that implements the following interface:
  3236  
  3237  ```
  3238  type PluginTokenizer interface {
  3239      // Name is the name of the tokenizer. It should be unique among all
  3240      // builtin tokenizers and other custom tokenizers. It identifies the
  3241      // tokenizer when an index is set in the schema and when search/filter
  3242      // is used in queries.
  3243      Name() string
  3244  
    // Identifier is a byte that uniquely identifies the tokenizer.
  3246      // Bytes in the range 0x80 to 0xff (inclusive) are reserved for
  3247      // custom tokenizers.
  3248      Identifier() byte
  3249  
  3250      // Type is a string representing the type of data that is to be
  3251      // tokenized. This must match the schema type of the predicate
  3252      // being indexed. Allowable values are shown in the table below.
  3253      Type() string
  3254  
  3255      // Tokens should implement the tokenization logic. The input is
  3256      // the value to be tokenized, and will always have a concrete type
  3257      // corresponding to Type(). The return value should be a list of
  3258      // the tokens generated.
  3259      Tokens(interface{}) ([]string, error)
  3260  }
  3261  ```
  3262  
  3263  The return value of `Type()` corresponds to the concrete input type of
  3264  `Tokens(interface{})` in the following way:
  3265  
  3266   `Type()` return value | `Tokens(interface{})` input type
  3267  -----------------------|----------------------------------
  3268   `"int"`               | `int64`
  3269   `"float"`             | `float64`
  3270   `"string"`            | `string`
  3271   `"bool"`              | `bool`
  3272   `"datetime"`          | `time.Time`
  3273  
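Putting the interface and the table together, a minimal tokenizer might look like the sketch below. The `lowerwords` tokenizer, its identifier `0xf0`, and the type name are hypothetical; the interface is redeclared locally and a `main` function is included only so the file also runs as an ordinary program.

```go
package main

import (
	"fmt"
	"strings"
)

// PluginTokenizer repeats the interface quoted above so that this sketch
// compiles on its own.
type PluginTokenizer interface {
	Name() string
	Identifier() byte
	Type() string
	Tokens(interface{}) ([]string, error)
}

// wordTokenizer is a hypothetical tokenizer that indexes lower-cased words.
type wordTokenizer struct{}

func (wordTokenizer) Name() string     { return "lowerwords" }
func (wordTokenizer) Identifier() byte { return 0xf0 } // within 0x80 to 0xff
func (wordTokenizer) Type() string     { return "string" }

// Tokens receives a string because Type() returns "string" (see the table).
func (wordTokenizer) Tokens(v interface{}) ([]string, error) {
	s, ok := v.(string)
	if !ok {
		return nil, fmt.Errorf("expected string, got %T", v)
	}
	return strings.Fields(strings.ToLower(s)), nil
}

// Tokenizer is the exported symbol; its type is func() interface{}.
func Tokenizer() interface{} { return wordTokenizer{} }

func main() {
	tok := Tokenizer().(PluginTokenizer)
	tokens, err := tok.Tokens("Hello Dgraph")
	fmt.Println(tokens, err)
}
```

The type assertion in `Tokens` is defensive: the input should already have the concrete type that corresponds to `Type()`.
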
  3274  ### Building the plugin
  3275  
  3276  The plugin has to be built using the `plugin` build mode so that an `.so` file
  3277  is produced instead of a regular executable. For example:
  3278  
  3279  ```sh
  3280  go build -buildmode=plugin -o myplugin.so ~/go/src/myplugin/main.go
  3281  ```
  3282  
  3283  ### Running Dgraph with plugins
  3284  
When starting Dgraph, use the `--custom_tokenizers` flag to tell Dgraph which
tokenizers to load. It accepts a comma-separated list of plugins, for example:
  3287  
  3288  ```sh
  3289  dgraph ...other-args... --custom_tokenizers=plugin1.so,plugin2.so
  3290  ```
  3291  
  3292  {{% notice "note" %}}
  3293  Plugin validation is performed on startup. If a problem is detected, Dgraph
  3294  will refuse to initialise.
  3295  {{% /notice %}}
  3296  
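The exact checks that Dgraph performs are not listed here. As a rough sketch only, a loader could validate the value returned by a plugin's `Tokenizer()` function along the following lines. The `validate` helper, the `PluginTokenizer` name, and the particular set of checks are assumptions made for illustration, based on the interface and the type table shown earlier.

```go
package main

import (
	"errors"
	"fmt"
)

// PluginTokenizer mirrors the methods a tokenizer plugin is expected to have.
type PluginTokenizer interface {
	Name() string
	Identifier() byte
	Type() string
	Tokens(interface{}) ([]string, error)
}

// validate is a hypothetical helper; Dgraph's real startup checks may differ.
func validate(v interface{}) error {
	tok, ok := v.(PluginTokenizer)
	if !ok {
		return errors.New("value does not implement the tokenizer interface")
	}
	// Identifiers 0x80 to 0xff are reserved for custom tokenizers.
	if tok.Identifier() < 0x80 {
		return fmt.Errorf("identifier %#x is outside the custom range 0x80-0xff", tok.Identifier())
	}
	switch tok.Type() {
	case "int", "float", "string", "bool", "datetime":
		return nil
	}
	return fmt.Errorf("unsupported type %q", tok.Type())
}

// goodTok is a well-formed example tokenizer.
type goodTok struct{}

func (goodTok) Name() string                         { return "good" }
func (goodTok) Identifier() byte                     { return 0xa0 }
func (goodTok) Type() string                         { return "string" }
func (goodTok) Tokens(interface{}) ([]string, error) { return nil, nil }

// badTok reuses goodTok but reports an identifier outside the custom range.
type badTok struct{ goodTok }

func (badTok) Identifier() byte { return 0x01 }

func main() {
	fmt.Println(validate(goodTok{})) // no error
	fmt.Println(validate(badTok{}))  // identifier out of range
	fmt.Println(validate(42))        // not a tokenizer at all
}
```
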
  3297  ### Adding the index to the schema
  3298  
  3299  To use a tokenization plugin, an index has to be created in the schema.
  3300  
  3301  The syntax is the same as for adding any built-in index. To add a custom index
  3302  using a tokenizer plugin named `foo` to a `string` predicate named
  3303  `my_predicate`, use the following in the schema:
  3304  
  3305  ```
  3306  my_predicate: string @index(foo) .
  3307  ```
  3308  
  3309  ### Using the index in queries
  3310  
  3311  There are two functions that can use custom indexes:
  3312  
  3313   Function | Behaviour
  3314  ----------|-----------
  3315   `anyof` | Returns nodes that match on *any* of the tokens generated
  3316   `allof` | Returns nodes that match on *all* of the tokens generated
  3317  
  3318  The functions can be used either at the query root or in filters.
  3319  
  3320  Their behaviour is analogous to `anyofterms`/`allofterms` and
  3321  `anyoftext`/`alloftext`.
  3322  
  3323  ### Examples
  3324  
  3325  The following examples should make the process of writing a tokenization plugin
  3326  more concrete.
  3327  
  3328  #### Unicode Characters
  3329  
  3330  This example shows a type of tokenization similar to that used for term
  3331  search and full-text search. Instead of being broken down into terms or
  3332  stem words, the text is broken down into its constituent Unicode
  3333  codepoints (in Go terminology these are called *runes*).
  3334  
  3335  {{% notice "note" %}}
  3336  This tokenizer would create a very large index that would be expensive to
  3337  manage and store. That's one of the reasons that text indexing usually occurs
  3338  at a higher level; stem words for full-text search or terms for term search.
  3339  {{% /notice %}}
  3340  
  3341  The implementation of the plugin looks like this:
  3342  
  3343  ```go
  3344  package main
  3345  
  3346  import "encoding/binary"
  3347  
  3348  func Tokenizer() interface{} { return RuneTokenizer{} }
  3349  
  3350  type RuneTokenizer struct{}
  3351  
  3352  func (RuneTokenizer) Name() string     { return "rune" }
  3353  func (RuneTokenizer) Type() string     { return "string" }
  3354  func (RuneTokenizer) Identifier() byte { return 0xfd }
  3355  
  3356  func (t RuneTokenizer) Tokens(value interface{}) ([]string, error) {
  3357  	var toks []string
  3358  	for _, r := range value.(string) {
  3359  		var buf [binary.MaxVarintLen32]byte
  3360  		n := binary.PutVarint(buf[:], int64(r))
  3361  		tok := string(buf[:n])
  3362  		toks = append(toks, tok)
  3363  	}
  3364  	return toks, nil
  3365  }
  3366  ```
  3367  
  3368  **Hints and tips:**
  3369  
  3370  - Inside `Tokens`, you can assume that `value` will have the concrete type
  3371    corresponding to that specified by `Type()`. It's safe to do a type
  3372    assertion.
  3373  
  3374  - Even though the return value is `[]string`, you can always store non-unicode
  3375    data inside the string. See [this blogpost](https://blog.golang.org/strings)
  3376    for some interesting background on how strings are implemented in Go and why
  3377    they can be used to store non-textual data. By storing arbitrary data in the
  3378    string, you can make the index more compact. In this case, varints are stored
  3379    in the return values.
  3380  
  3381  Setting up the indexing and adding data:
  3382  ```
  3383  name: string @index(rune) .
  3384  ```
  3385  
  3386  
  3387  ```
  3388  {
  3389    set{
  3390      _:ad <name> "Adam" .
  3391      _:aa <name> "Aaron" .
  3392      _:am <name> "Amy" .
  3393      _:ro <name> "Ronald" .
  3394    }
  3395  }
  3396  ```
  3397  Now queries can be performed.
  3398  
  3399  The only person who has both of the runes `A` and `n` in their `name` is Aaron:
  3400  ```
  3401  {
  3402    q(func: allof(name, rune, "An")) {
  3403      name
  3404    }
  3405  }
  3406  =>
  3407  {
  3408    "data": {
  3409      "q": [
  3410        { "name": "Aaron" }
  3411      ]
  3412    }
  3413  }
  3414  ```
  3415  But there are multiple people who have both of the runes `A` and `m`:
  3416  ```
  3417  {
  3418    q(func: allof(name, rune, "Am")) {
  3419      name
  3420    }
  3421  }
  3422  =>
  3423  {
  3424    "data": {
  3425      "q": [
  3426        { "name": "Amy" },
  3427        { "name": "Adam" }
  3428      ]
  3429    }
  3430  }
  3431  ```
  3432  Case is taken into account, so if you search for all names containing `"ron"`,
  3433  you would find `"Aaron"`, but not `"Ronald"`. But if you were to search for
  3434  `"no"`, you would match both `"Aaron"` and `"Ronald"`. The order of the runes in
  3435  the strings doesn't matter.
  3436  
  3437  It's possible to search for people that have *any* of the supplied runes in
  3438  their names (rather than *all* of the supplied runes). To do this, use `anyof`
  3439  instead of `allof`:
  3440  ```
  3441  {
  3442    q(func: anyof(name, rune, "mr")) {
  3443      name
  3444    }
  3445  }
  3446  =>
  3447  {
  3448    "data": {
  3449      "q": [
  3450        { "name": "Adam" },
  3451        { "name": "Aaron" },
  3452        { "name": "Amy" }
  3453      ]
  3454    }
  3455  }
  3456  ```
  3457  `"Ronald"` doesn't contain `m` or `r`, so isn't found by the search.
  3458  
  3459  {{% notice "note" %}}
  3460  Understanding what's going on under the hood can help you intuitively
  3461  understand how the `Tokens` method should be implemented.
  3462  
  3463  When Dgraph sees new edges that are to be indexed by your tokenizer, it
  3464  will tokenize the value. The resultant tokens are used as keys for posting
  3465  lists. The edge subject is then added to the posting list for each token.
  3466  
  3467  When a query root search occurs, the search value is tokenized. The result of
  3468  the search is all of the nodes in the union or intersection of the corresponding
  3469  posting lists (depending on whether `anyof` or `allof` was used).
  3470  {{% /notice %}}
  3471  
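The description above can be sketched in a few lines of Go. This is a simplified model, not Dgraph's implementation: the `index` type and its methods are hypothetical, nodes are identified by their name rather than by UID, and tokens are kept as plain one-rune strings instead of varints so that they are easy to read.

```go
package main

import (
	"fmt"
	"sort"
)

// tokens splits a string into one token per rune.
func tokens(s string) []string {
	var toks []string
	for _, r := range s {
		toks = append(toks, string(r))
	}
	return toks
}

// index maps each token to a posting list: the set of nodes whose value
// produced that token.
type index map[string]map[string]bool

// add tokenizes a value and records the node under every token.
func (ix index) add(node string) {
	for _, t := range tokens(node) {
		if ix[t] == nil {
			ix[t] = map[string]bool{}
		}
		ix[t][node] = true
	}
}

// search tokenizes the query and returns the sorted intersection (all == true)
// or union (all == false) of the matching posting lists.
func (ix index) search(query string, all bool) []string {
	seen := map[string]bool{}  // distinct query tokens
	counts := map[string]int{} // number of query tokens each node matches
	for _, t := range tokens(query) {
		if seen[t] {
			continue
		}
		seen[t] = true
		for node := range ix[t] {
			counts[node]++
		}
	}
	var out []string
	for node, c := range counts {
		if !all || c == len(seen) {
			out = append(out, node)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	ix := index{}
	for _, name := range []string{"Adam", "Aaron", "Amy", "Ronald"} {
		ix.add(name)
	}
	fmt.Println(ix.search("An", true))  // like allof: [Aaron]
	fmt.Println(ix.search("Am", true))  // like allof: [Adam Amy]
	fmt.Println(ix.search("mr", false)) // like anyof: [Aaron Adam Amy]
}
```

The three searches reproduce the results of the `allof` and `anyof` queries in this example, apart from the ordering, which is sorted here only to make the output deterministic.
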
  3472  #### CIDR Range
  3473  
  3474  Tokenizers don't always have to be about splitting text up into its constituent
  3475  parts. This example indexes [IP addresses into their CIDR
  3476  ranges](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing). This
  3477  allows you to search for all IP addresses that fall into a particular CIDR
  3478  range.
  3479  
  3480  The plugin code is more complicated than the rune example. The input is an IP
  3481  address stored as a string, e.g. `"100.55.22.11/32"`. The outputs are the CIDR
  3482  ranges that the IP address could possibly fall into. There could be up to 32
  3483  different outputs (`"100.55.22.11/32"` does indeed have 32 possible ranges, one
  3484  for each mask size).
  3485  
  3486  ```go
  3487  package main
  3488  
  3489  import "net"
  3490  
  3491  func Tokenizer() interface{} { return CIDRTokenizer{} }
  3492  
  3493  type CIDRTokenizer struct{}
  3494  
  3495  func (CIDRTokenizer) Name() string     { return "cidr" }
  3496  func (CIDRTokenizer) Type() string     { return "string" }
  3497  func (CIDRTokenizer) Identifier() byte { return 0xff }
  3498  
  3499  func (t CIDRTokenizer) Tokens(value interface{}) ([]string, error) {
  3500  	_, ipnet, err := net.ParseCIDR(value.(string))
  3501  	if err != nil {
  3502  		return nil, err
  3503  	}
  3504  	ones, bits := ipnet.Mask.Size()
  3505  	var toks []string
  3506  	for i := ones; i >= 1; i-- {
  3507  		m := net.CIDRMask(i, bits)
  3508  		tok := net.IPNet{
  3509  			IP:   ipnet.IP.Mask(m),
  3510  			Mask: m,
  3511  		}
  3512  		toks = append(toks, tok.String())
  3513  	}
  3514  	return toks, nil
  3515  }
  3516  ```
  3517  An example of using the tokenizer:
  3518  
  3519  Setting up the indexing and adding data:
  3520  ```
  3521  ip: string @index(cidr) .
  3522  
  3523  ```
  3524  
  3525  ```
  3526  {
  3527    set{
  3528      _:a <ip> "100.55.22.11/32" .
  3529      _:b <ip> "100.33.81.19/32" .
  3530      _:c <ip> "100.49.21.25/32" .
  3531      _:d <ip> "101.0.0.5/32" .
  3532      _:e <ip> "100.176.2.1/32" .
  3533    }
  3534  }
  3535  ```
  3536  ```
  3537  {
  3538    q(func: allof(ip, cidr, "100.48.0.0/12")) {
  3539      ip
  3540    }
  3541  }
  3542  =>
  3543  {
  3544    "data": {
  3545      "q": [
  3546        { "ip": "100.55.22.11/32" },
  3547        { "ip": "100.49.21.25/32" }
  3548      ]
  3549    }
  3550  }
  3551  ```
  3552  The 12-bit CIDR range of both `100.55.22.11/32` and `100.49.21.25/32` is
  3553  `100.48.0.0/12`.  The other IP addresses in the database aren't included in the
  3554  search result, since they have different CIDR ranges for 12-bit masks
  3555  (`100.32.0.0/12`, `101.0.0.0/12`, `100.176.0.0/12` for `100.33.81.19/32`,
  3556  `101.0.0.5/32`, and `100.176.2.1/32` respectively).
  3557  
  3558  Note that we're using `allof` instead of `anyof`. Only `allof` will work
  3559  correctly with this index. Remember that the tokenizer generates all possible
  3560  CIDR ranges for an IP address. If we were to use `anyof` then the search result
  3561  would include all IP addresses under the 1-bit mask (in this case, `0.0.0.0/1`,
  3562  which would match all IPs in this dataset).
  3563  
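To see why `anyof` over-matches, it can help to compare the tokens of a stored address with the tokens of the query. The sketch below repeats the tokenization logic of the plugin in a standalone function; `cidrTokens` and `sharedCount` are helper names introduced here for illustration only.

```go
package main

import (
	"fmt"
	"net"
)

// cidrTokens returns every CIDR range containing the given network, from its
// own mask size down to a 1-bit mask (same logic as the plugin's Tokens).
func cidrTokens(s string) ([]string, error) {
	_, ipnet, err := net.ParseCIDR(s)
	if err != nil {
		return nil, err
	}
	ones, bits := ipnet.Mask.Size()
	var toks []string
	for i := ones; i >= 1; i-- {
		m := net.CIDRMask(i, bits)
		tok := net.IPNet{IP: ipnet.IP.Mask(m), Mask: m}
		toks = append(toks, tok.String())
	}
	return toks, nil
}

// sharedCount reports how many tokens of b also appear in a.
func sharedCount(a, b []string) int {
	set := map[string]bool{}
	for _, t := range a {
		set[t] = true
	}
	n := 0
	for _, t := range b {
		if set[t] {
			n++
		}
	}
	return n
}

func main() {
	query, _ := cidrTokens("100.48.0.0/12")
	inside, _ := cidrTokens("100.55.22.11/32")
	outside, _ := cidrTokens("101.0.0.5/32")

	fmt.Println(len(inside), len(query))     // 32 12
	fmt.Println(sharedCount(inside, query))  // 12: every query token matches
	fmt.Println(sharedCount(outside, query)) // 7: only the widest ranges match
}
```

`101.0.0.5/32` shares the seven widest ranges (`/7` down to `/1`) with the query, so `anyof` would return it even though it lies outside `100.48.0.0/12`. `allof` requires all twelve query tokens to match, which only addresses inside the range can satisfy.
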
  3564  #### Anagram
  3565  
  3566  Tokenizers don't always have to return multiple tokens. If you just want to
  3567  index data into groups, have the tokenizer just return an identifying member of
  3568  that group.
  3569  
  3570  In this example, we want to find groups of words that are
  3571  [anagrams](https://en.wikipedia.org/wiki/Anagram) of each
  3572  other.
  3573  
  3574  The token for a group of anagrams can simply be the letters of the word in
  3575  sorted order, as implemented below:
  3576  
  3577  ```go
  3578  package main
  3579  
  3580  import "sort"
  3581  
  3582  func Tokenizer() interface{} { return AnagramTokenizer{} }
  3583  
  3584  type AnagramTokenizer struct{}
  3585  
  3586  func (AnagramTokenizer) Name() string     { return "anagram" }
  3587  func (AnagramTokenizer) Type() string     { return "string" }
  3588  func (AnagramTokenizer) Identifier() byte { return 0xfc }
  3589  
  3590  func (t AnagramTokenizer) Tokens(value interface{}) ([]string, error) {
  3591  	b := []byte(value.(string))
  3592  	sort.Slice(b, func(i, j int) bool { return b[i] < b[j] })
  3593  	return []string{string(b)}, nil
  3594  }
  3595  ```
  3596  In action:
  3597  
  3598  Setting up the indexing and adding data:
  3599  ```
  3600  word: string @index(anagram) .
  3601  ```
  3602  
  3603  ```
  3604  {
  3605    set{
  3606      _:1 <word> "airmen" .
  3607      _:2 <word> "marine" .
  3608      _:3 <word> "beat" .
  3609      _:4 <word> "beta" .
  3610      _:5 <word> "race" .
  3611      _:6 <word> "care" .
  3612    }
  3613  }
  3614  ```
  3615  ```
  3616  {
  3617    q(func: allof(word, anagram, "remain")) {
  3618      word
  3619    }
  3620  }
  3621  =>
  3622  {
  3623    "data": {
  3624      "q": [
  3625        { "word": "airmen" },
  3626        { "word": "marine" }
  3627      ]
  3628    }
  3629  }
  3630  ```
  3631  
  3632  Since only a single token is ever generated, it doesn't matter if `anyof` or
  3633  `allof` is used. The result will always be the same.
  3634  
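As a quick check of the idea, the following sketch applies the same byte-wise sort outside of a plugin. `anagramKey` is a helper name used only in this illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// anagramKey sorts the bytes of a word, mirroring the tokenizer's logic.
func anagramKey(word string) string {
	b := []byte(word)
	sort.Slice(b, func(i, j int) bool { return b[i] < b[j] })
	return string(b)
}

func main() {
	for _, w := range []string{"airmen", "marine", "remain", "beat"} {
		fmt.Printf("%s -> %s\n", w, anagramKey(w))
	}
}
```

The first three words all map to `aeimnr`, which is why the query value `"remain"` finds both `"airmen"` and `"marine"`, while `"beat"` maps to `abet` and lands in a different group.
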
  3635  #### Integer prime factors
  3636  
  3637  All of the custom tokenizers shown previously have worked with strings.
  3638  However, other data types can be used as well. This example is contrived, but
  3639  nonetheless shows some advanced usages of custom tokenizers.
  3640  
  3641  The tokenizer creates a token for each prime factor in the input.
  3642  
  3643  ```go
  3644  package main
  3645  
  3646  import (
  3647      "encoding/binary"
  3648      "fmt"
  3649  )
  3650  
  3651  func Tokenizer() interface{} { return FactorTokenizer{} }
  3652  
  3653  type FactorTokenizer struct{}
  3654  
  3655  func (FactorTokenizer) Name() string     { return "factor" }
  3656  func (FactorTokenizer) Type() string     { return "int" }
  3657  func (FactorTokenizer) Identifier() byte { return 0xfe }
  3658  
  3659  func (FactorTokenizer) Tokens(value interface{}) ([]string, error) {
  3660      x := value.(int64)
  3661      if x <= 1 {
  3662          return nil, fmt.Errorf("cannot factor int <= 1: %d", x)
  3663      }
  3664      var toks []string
  3665      for p := int64(2); x > 1; p++ {
  3666          if x%p == 0 {
  3667              toks = append(toks, encodeInt(p))
  3668              for x%p == 0 {
  3669                  x /= p
  3670              }
  3671          }
  3672      }
  3673      return toks, nil
  3674  
  3675  }
  3676  
  3677  func encodeInt(x int64) string {
  3678      var buf [binary.MaxVarintLen64]byte
  3679      n := binary.PutVarint(buf[:], x)
  3680      return string(buf[:n])
  3681  }
  3682  ```
  3683  {{% notice "note" %}}
  3684  Notice that the return of `Type()` is `"int"`, corresponding to the concrete
  3685  type of the input to `Tokens` (which is `int64`).
  3686  {{% /notice %}}
  3687  
  3688  This allows you to do things like search for all numbers that share prime
  3689  factors with a particular number.
  3690  
  3691  In particular, we search for numbers that contain any of the prime factors of
  3692  15, i.e. any numbers that are divisible by either 3 or 5.
  3693  
  3694  Setting up the indexing and adding data:
  3695  ```
  3696  num: int @index(factor) .
  3697  ```
  3698  
  3699  ```
  3700  {
  3701    set{
  3702      _:2 <num> "2"^^<xs:int> .
  3703      _:3 <num> "3"^^<xs:int> .
  3704      _:4 <num> "4"^^<xs:int> .
  3705      _:5 <num> "5"^^<xs:int> .
  3706      _:6 <num> "6"^^<xs:int> .
  3707      _:7 <num> "7"^^<xs:int> .
  3708      _:8 <num> "8"^^<xs:int> .
  3709      _:9 <num> "9"^^<xs:int> .
  3710      _:10 <num> "10"^^<xs:int> .
  3711      _:11 <num> "11"^^<xs:int> .
  3712      _:12 <num> "12"^^<xs:int> .
  3713      _:13 <num> "13"^^<xs:int> .
  3714      _:14 <num> "14"^^<xs:int> .
  3715      _:15 <num> "15"^^<xs:int> .
  3716      _:16 <num> "16"^^<xs:int> .
  3717      _:17 <num> "17"^^<xs:int> .
  3718      _:18 <num> "18"^^<xs:int> .
  3719      _:19 <num> "19"^^<xs:int> .
  3720      _:20 <num> "20"^^<xs:int> .
  3721      _:21 <num> "21"^^<xs:int> .
  3722      _:22 <num> "22"^^<xs:int> .
  3723      _:23 <num> "23"^^<xs:int> .
  3724      _:24 <num> "24"^^<xs:int> .
  3725      _:25 <num> "25"^^<xs:int> .
  3726      _:26 <num> "26"^^<xs:int> .
  3727      _:27 <num> "27"^^<xs:int> .
  3728      _:28 <num> "28"^^<xs:int> .
  3729      _:29 <num> "29"^^<xs:int> .
  3730      _:30 <num> "30"^^<xs:int> .
  3731    }
  3732  }
  3733  ```
  3734  ```
  3735  {
  3736    q(func: anyof(num, factor, 15)) {
  3737      num
  3738    }
  3739  }
  3740  =>
  3741  {
  3742    "data": {
  3743      "q": [
  3744        { "num": 3 },
  3745        { "num": 5 },
  3746        { "num": 6 },
  3747        { "num": 9 },
  3748        { "num": 10 },
  3749        { "num": 12 },
  3750        { "num": 15 },
  3751        { "num": 18 },
  3752        { "num": 20 },
  3753        { "num": 21 },
  3754        { "num": 25 },
  3755        { "num": 24 },
  3756        { "num": 27 },
  3757        { "num": 30 }
  3758      ]
  3759    }
  3760  }
  3761  ```
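
The result set can be reproduced without Dgraph. The sketch below uses the same trial division as the tokenizer but keeps the factors as plain integers rather than varint-encoded strings; `primeFactors` and `sharesFactor` are names introduced here for illustration.

```go
package main

import "fmt"

// primeFactors returns the distinct prime factors of x, which must be > 1.
func primeFactors(x int64) []int64 {
	var fs []int64
	for p := int64(2); x > 1; p++ {
		if x%p == 0 {
			fs = append(fs, p)
			// Divide out p completely so each prime is recorded once.
			for x%p == 0 {
				x /= p
			}
		}
	}
	return fs
}

// sharesFactor reports whether a and b have at least one prime factor in
// common, which is the condition an anyof search on this index tests.
func sharesFactor(a, b int64) bool {
	set := map[int64]bool{}
	for _, f := range primeFactors(a) {
		set[f] = true
	}
	for _, f := range primeFactors(b) {
		if set[f] {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(primeFactors(15)) // [3 5]
	var matches []int64
	for n := int64(2); n <= 30; n++ {
		if sharesFactor(n, 15) {
			matches = append(matches, n)
		}
	}
	fmt.Println(matches) // [3 5 6 9 10 12 15 18 20 21 24 25 27 30]
}
```

The fourteen numbers printed are the same ones returned by the `anyof(num, factor, 15)` query above, listed here in ascending order.
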