+++
title = "Query Language"
+++

Dgraph's GraphQL+- is based on Facebook's [GraphQL](https://facebook.github.io/graphql/).  GraphQL wasn't developed for graph databases, but its graph-like query syntax, schema validation and subgraph-shaped response make it a great language choice.  We've modified the language to better support graph operations, adding and removing features to get the best fit for graph databases.  We're calling this simplified, feature-rich language *GraphQL+-*.

GraphQL+- is a work in progress. We're adding more features and we might further simplify existing ones.

## Take a Tour - https://tour.dgraph.io

This document is the Dgraph query reference material.  It is not a tutorial.  It's designed as a reference for users who already know how to write queries in GraphQL+- but need to check syntax, indices, functions, etc.

{{% notice "note" %}}If you are new to Dgraph and want to learn how to use Dgraph and GraphQL+-, take the tour - https://tour.dgraph.io{{% /notice %}}


### Running examples

The examples in this reference use a database of 21 million triples about movies and actors.  The example queries run and return results.  The queries are executed by an instance of Dgraph running at https://play.dgraph.io/.  To run the queries locally or experiment a bit more, see the [Getting Started]({{< relref "get-started/index.md" >}}) guide, which also shows how to load the datasets used in the examples here.

## GraphQL+- Fundamentals

A GraphQL+- query finds nodes based on search criteria, matches patterns in a graph and returns a graph as a result.

A query is composed of nested blocks, starting with a query root.  The root finds the initial set of nodes against which the following graph matching and filtering is applied.

{{% notice "note" %}}See more about Queries in [Queries design concept]({{< relref "design-concepts/index.md#queries" >}}) {{% /notice %}}

### Returning Values

Each query has a name, specified at the query root, and the same name identifies the results.

If an edge is of a value type, the value can be returned by giving the edge name.

Query Example: In the example dataset, as well as edges that link movies to directors and actors, movies have a name, release date and identifiers for a number of well-known movie databases.  This query, with name `bladerunner`, and root matching a movie name, returns those values for the early 80's sci-fi classic "Blade Runner".

{{< runnable >}}
{
  bladerunner(func: eq(name@en, "Blade Runner")) {
    uid
    name@en
    initial_release_date
    netflix_id
  }
}
{{< /runnable >}}

The query first searches the graph, using indexes to make the search efficient, for all nodes with a `name` edge equaling "Blade Runner".  For each node found, the query then returns the listed outgoing edges.

Every node has a unique 64-bit identifier.  The `uid` edge in the query above returns that identifier.  If the UID of the required node is already known, then the function `uid` finds the node.

Query Example: "Blade Runner" movie data found by UID.

{{< runnable >}}
{
  bladerunner(func: uid(0x394c)) {
    uid
    name@en
    initial_release_date
    netflix_id
  }
}
{{< /runnable >}}

A query can match many nodes and return the values for each.

Query Example: All nodes that have either "Blade" or "Runner" in the name.

{{< runnable >}}
{
  bladerunner(func: anyofterms(name@en, "Blade Runner")) {
    uid
    name@en
    initial_release_date
    netflix_id
  }
}
{{< /runnable >}}

Multiple IDs can be specified in a list to the `uid` function.

Query Example:
{{< runnable >}}
{
  movies(func: uid(0xb5849, 0x394c)) {
    uid
    name@en
    initial_release_date
    netflix_id
  }
}
{{< /runnable >}}


{{% notice "note" %}} If your predicate has special characters, then you should wrap it in angle
brackets when asking for it in the query. E.g. `<first:name>`{{% /notice %}}

### Expanding Graph Edges

A query expands edges from node to node by nesting query blocks with `{ }`.

Query Example: The actors and characters played in "Blade Runner".  The query first finds the node with name "Blade Runner", then follows outgoing `starring` edges to nodes representing an actor's performance as a character.  From there the `performance.actor` and `performance.character` edges are expanded to find the actor names and roles for every actor in the movie.
{{< runnable >}}
{
  brCharacters(func: eq(name@en, "Blade Runner")) {
    name@en
    initial_release_date
    starring {
      performance.actor {
        name@en  # actor name
      }
      performance.character {
        name@en  # character name
      }
    }
  }
}
{{< /runnable >}}


### Comments

Anything on a line following a `#` is a comment.

### Applying Filters

The query root finds an initial set of nodes and the query proceeds by returning values and following edges to further nodes. Any node reached in the query is found by traversal after the search at root.  The nodes found can be filtered by applying `@filter`, either after the root or at any edge.

Query Example: "Blade Runner" director Ridley Scott's movies released before the year 2000.
{{< runnable >}}
{
  scott(func: eq(name@en, "Ridley Scott")) {
    name@en
    initial_release_date
    director.film @filter(le(initial_release_date, "2000")) {
      name@en
      initial_release_date
    }
  }
}
{{< /runnable >}}

Query Example: Movies with either "Blade" or "Runner" in the title and released before the year 2000.

{{< runnable >}}
{
  bladerunner(func: anyofterms(name@en, "Blade Runner")) @filter(le(initial_release_date, "2000")) {
    uid
    name@en
    initial_release_date
    netflix_id
  }
}
{{< /runnable >}}

### Language Support

{{% notice "note" %}}A `@lang` directive must be specified in the schema to query or mutate
predicates with language tags.{{% /notice %}}

Dgraph supports UTF-8 strings.

In a query, for a string valued edge `edge`, the syntax
```
edge@lang1:...:langN
```
specifies the preference order for returned languages, with the following rules.

* At most one result will be returned (except in the case where the language list is set to `*`).
* The preference list is considered left to right: if a value in a given language is not found, the next language from the list is considered.
* If there are no values in any of the specified languages, no value is returned.
* A final `.` means that a value without a specified language is returned or, if there is no value without language, a value in *some* language is returned.
* Setting the language list value to `*` will return all the values for that predicate along with their language. Values without a language tag are also returned.

For example:

- `name`   => Look for an untagged string; return nothing if no untagged value exists.
- `name@.` => Look for an untagged string, then any language.
- `name@en` => Look for `en` tagged string; return nothing if no `en` tagged string exists.
- `name@en:.` => Look for `en`, then untagged, then any language.
- `name@en:pl` => Look for `en`, then `pl`, otherwise nothing.
- `name@en:pl:.` => Look for `en`, then `pl`, then untagged, then any language.
- `name@*` => Look for all the values of this predicate and return them along with their language. For example, if there are two values with languages `en` and `hi`, this query will return two keys named "name@en" and "name@hi".


{{% notice "note" %}}In functions, language lists (including the `@*` notation) are not allowed. Untagged predicates, single language tags, and `.` notation work as described above.

---

In [full-text search functions]({{< relref "#full-text-search" >}}) (`alloftext`, `anyoftext`), when no language is specified (untagged or `@.`), the default (English) full-text tokenizer is used.{{% /notice %}}


Query Example: Some of Bollywood director and actor Farhan Akhtar's movies have a name stored in Russian as well as Hindi and English, others do not.

{{< runnable >}}
{
  q(func: allofterms(name@en, "Farhan Akhtar")) {
    name@hi
    name@en

    director.film {
      name@ru:hi:en
      name@en
      name@hi
      name@ru
    }
  }
}
{{< /runnable >}}




## Functions

{{% notice "note" %}}Functions can only be applied to [indexed]({{< relref "#indexing">}}) predicates.{{% /notice %}}

Functions allow filtering based on properties of nodes or variables.  Functions can be applied in the query root or in filters.

For functions on string valued predicates, if no language preference is given, the function is applied to all languages and strings without a language tag; if a language preference is given, the function is applied only to strings of the given language.
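
The language preference rules listed under Language Support (for returned values such as `name@en:pl:.`) can be modeled as a small lookup. This is only an illustrative Python sketch, not Dgraph's implementation: the `resolve` function, the `UNTAGGED` key, the `pred` parameter and the choice of *which* language a trailing `.` falls back to are all assumptions made for the example.

```python
from typing import Dict, Optional, Union

UNTAGGED = ""  # hypothetical key for a value stored without a language tag


def resolve(
    values: Dict[str, str], spec: Optional[str], pred: str = "name"
) -> Union[None, str, Dict[str, str]]:
    """Pick what `pred`, `pred@en:pl:.` or `pred@*` would return.

    `values` maps language tag -> string for one node; `spec` is the text
    after `@`, or None when the predicate is queried without a language list.
    """
    if spec is None:
        # `name`: only an untagged value qualifies.
        return values.get(UNTAGGED)
    if spec == "*":
        # `name@*`: every value, keyed by predicate and language.
        return {(f"{pred}@{lang}" if lang else pred): v for lang, v in values.items()}
    for lang in spec.split(":"):  # preference list, considered left to right
        if lang == ".":
            if UNTAGGED in values:  # untagged value first ...
                return values[UNTAGGED]
            tagged = sorted(values)  # ... then a value in some language
            return values[tagged[0]] if tagged else None  # (choice is an assumption)
        if lang in values:
            return values[lang]
    return None  # nothing in any of the listed languages


film = {"en": "Don", "hi": "Don (hi)"}
print(resolve(film, "ru:hi:en"))  # falls through `ru` to the `hi` value
print(resolve(film, "ru"))        # no Russian value, so nothing is returned
```

At most one value comes back for every form except `*`, which mirrors the first rule in the list.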


### Term matching


#### allofterms

Syntax Example: `allofterms(predicate, "space-separated term list")`

Schema Types: `string`

Index Required: `term`


Matches strings that have all specified terms in any order; case insensitive.

##### Usage at root

Query Example: All nodes that have `name` containing terms `indiana` and `jones`, returning the English name and genre in English.

{{< runnable >}}
{
  me(func: allofterms(name@en, "jones indiana")) {
    name@en
    genre {
      name@en
    }
  }
}
{{< /runnable >}}

##### Usage as Filter

Query Example: All Steven Spielberg films that contain the words `indiana` and `jones`.  The `@filter(has(director.film))` removes nodes with name Steven Spielberg that aren't the director; the data also contains a character in a film called Steven Spielberg.

{{< runnable >}}
{
  me(func: eq(name@en, "Steven Spielberg")) @filter(has(director.film)) {
    name@en
    director.film @filter(allofterms(name@en, "jones indiana"))  {
      name@en
    }
  }
}
{{< /runnable >}}


#### anyofterms


Syntax Example: `anyofterms(predicate, "space-separated term list")`

Schema Types: `string`

Index Required: `term`


Matches strings that have any of the specified terms in any order; case insensitive.

##### Usage at root

Query Example: All nodes that have a `name` containing either `poison` or `peacock`.  Many of the returned nodes are movies, but people like Joan Peacock also meet the search terms because without a [cascade directive]({{< relref "#cascade-directive">}}) the query doesn't require a genre.

{{< runnable >}}
{
  me(func:anyofterms(name@en, "poison peacock")) {
    name@en
    genre {
      name@en
    }
  }
}
{{< /runnable >}}


##### Usage as filter

Query Example: All Steven Spielberg movies that contain `war` or `spies`.  The `@filter(has(director.film))` removes nodes with name Steven Spielberg that aren't the director; the data also contains a character in a film called Steven Spielberg.

{{< runnable >}}
{
  me(func: eq(name@en, "Steven Spielberg")) @filter(has(director.film)) {
    name@en
    director.film @filter(anyofterms(name@en, "war spies"))  {
      name@en
    }
  }
}
{{< /runnable >}}


### Regular Expressions


Syntax Examples: `regexp(predicate, /regular-expression/)` or case insensitive `regexp(predicate, /regular-expression/i)`

Schema Types: `string`

Index Required: `trigram`


Matches strings by regular expression.  The regular expression language is that of [Go regular expressions](https://golang.org/pkg/regexp/syntax/).

Query Example: At root, match nodes with `Steven Sp` at the start of `name`, followed by any characters.  For each such matched uid, match the films containing `ryan`.  Note the difference from `allofterms`, which would match only the whole term `ryan`, whereas regular expression search also matches within terms, such as `bryan`.

{{< runnable >}}
{
  directors(func: regexp(name@en, /^Steven Sp.*$/)) {
    name@en
    director.film @filter(regexp(name@en, /ryan/i)) {
      name@en
    }
  }
}
{{< /runnable >}}


#### Technical details

A trigram is a substring of three consecutive runes. For example, `Dgraph` has trigrams `Dgr`, `gra`, `rap`, `aph`.

To ensure efficiency of regular expression matching, Dgraph uses [trigram indexing](https://swtch.com/~rsc/regexp/regexp4.html).
That is, Dgraph converts the regular expression to a trigram query, uses the trigram index and trigram query to find possible matches, and applies the full regular expression search only to those candidates.

#### Writing Efficient Regular Expressions and Limitations

Keep the following in mind when designing regular expression queries.

- At least one trigram must be matched by the regular expression (patterns shorter than 3 runes are not supported).  That is, Dgraph requires regular expressions that can be converted to a trigram query.
- The number of alternative trigrams matched by the regular expression should be as small as possible (`[a-zA-Z][a-zA-Z][0-9]` is not a good idea).  Many possible matches means the full regular expression is checked against many strings; whereas, if the expression forces more trigrams to match, Dgraph can make better use of the index and check the full regular expression against a smaller set of possible matches.
- Thus, the regular expression should be as precise as possible.  Matching longer strings means more required trigrams, which helps to use the index effectively.
- If repeat specifications (`*`, `+`, `?`, `{n,m}`) are used, the entire regular expression must not match the _empty_ string or _any_ string: for example, `*` may be used like `[Aa]bcd*` but not like `(abcd)*` or `(abcd)|((defg)*)`.
- Repeat specifications after bracket expressions (e.g. `[fgh]{7}`, `[0-9]+` or `[a-z]{3,5}`) are often considered as matching any string because they match too many trigrams.
- If the partial result (for a subset of trigrams) exceeds 1000000 uids during index scan, the query is stopped to prevent expensive queries.
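
The two-phase idea described under Technical details (a cheap trigram pre-filter, then the full regular expression on the survivors) can be illustrated with a short Python sketch. It is a deliberately simplified model, not Dgraph's code: the function names are hypothetical, there is no real index (each candidate is scanned directly), and only literal patterns are handled, whereas converting an arbitrary regular expression into a trigram query is considerably more involved.

```python
import re
from typing import Iterable, List


def trigrams(text: str) -> List[str]:
    """Return every substring of three consecutive runes (code points)."""
    return [text[i:i + 3] for i in range(len(text) - 2)]


def may_match(required: List[str], candidate: str) -> bool:
    """Pre-filter: a candidate can only match if it contains every required trigram."""
    present = set(trigrams(candidate))
    return all(t in present for t in required)


def literal_search(literal: str, names: Iterable[str]) -> List[str]:
    """Trigram pre-filter followed by the full pattern check (literal patterns only)."""
    required = trigrams(literal)
    if not required:
        # Mirrors the limitation above: patterns shorter than 3 runes yield no trigram.
        raise ValueError("pattern must produce at least one trigram")
    candidates = [n for n in names if may_match(required, n)]
    pattern = re.compile(re.escape(literal))
    return [n for n in candidates if pattern.search(n)]


print(trigrams("Dgraph"))
print(literal_search("ryan", ["bryan", "arya yang", "blade runner"]))
```

In the second call, `arya yang` survives the pre-filter because it contains both `rya` and `yan`, and is only rejected by the full pattern check. This is why a pattern that requires few or very common trigrams leaves most of the work to the expensive second phase.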


### Fuzzy matching


Syntax: `match(predicate, string, distance)`

Schema Types: `string`

Index Required: `trigram`

Matches predicate values by calculating the [Levenshtein distance](https://en.wikipedia.org/wiki/Levenshtein_distance) to the string,
also known as _fuzzy matching_. The distance parameter must be greater than zero (0). Using a greater distance value can yield more but less accurate results.

Query Example: At root, fuzzy match nodes similar to `Stephen`, with a distance value of less than or equal to 8.

{{< runnable >}}
{
  directors(func: match(name@en, Stephen, 8)) {
    name@en
  }
}
{{< /runnable >}}

Same query with a Levenshtein distance of 3.

{{< runnable >}}
{
  directors(func: match(name@en, Stephen, 3)) {
    name@en
  }
}
{{< /runnable >}}


### Full-Text Search

Syntax Examples: `alloftext(predicate, "space-separated text")` and `anyoftext(predicate, "space-separated text")`

Schema Types: `string`

Index Required: `fulltext`


Apply full-text search with stemming and stop words to find strings matching all or any of the given text.

The following steps are applied during index generation and to process full-text search arguments:

1. Tokenization (according to Unicode word boundaries).
1. Conversion to lowercase.
1. Unicode-normalization (to [Normalization Form KC](http://unicode.org/reports/tr15/#Norm_Forms)).
1. Stemming using language-specific stemmer (if supported by language).
1. Stop words removal (if supported by language).

Dgraph uses [bleve](https://github.com/blevesearch/bleve) for its full-text search indexing. See also the bleve language specific [stop word lists](https://github.com/blevesearch/bleve/tree/master/analysis/lang).
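
The five processing steps listed above can be sketched in Python to show why index and query text end up comparable. This is a toy model under stated assumptions: the stop word set and the suffix-stripping "stemmer" are made up for the example and are far cruder than what bleve provides, `\w+` only approximates Unicode word-boundary tokenization, and the helper names are hypothetical.

```python
import re
import unicodedata
from typing import List

STOP_WORDS = {"the", "which", "a", "an", "of"}  # toy list, not bleve's
SUFFIXES = ("ing", "s")                          # toy stemmer, not a real algorithm


def stem(token: str) -> str:
    """Strip one known suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def fulltext_tokens(text: str) -> List[str]:
    tokens = re.findall(r"\w+", text)                            # 1. tokenization (approximate)
    tokens = [t.lower() for t in tokens]                         # 2. conversion to lowercase
    tokens = [unicodedata.normalize("NFKC", t) for t in tokens]  # 3. Normalization Form KC
    tokens = [stem(t) for t in tokens]                           # 4. stemming
    return [t for t in tokens if t not in STOP_WORDS]            # 5. stop words removal


def alloftext(value: str, query: str) -> bool:
    """True when every processed query token occurs in the processed value."""
    return set(fulltext_tokens(query)) <= set(fulltext_tokens(value))


def anyoftext(value: str, query: str) -> bool:
    """True when at least one processed query token occurs in the processed value."""
    return bool(set(fulltext_tokens(query)) & set(fulltext_tokens(value)))


print(fulltext_tokens("the dog which barks"))
print(alloftext("Dogs Barking", "the dog which barks"))
```

Because the same pipeline runs on both the stored strings and the search argument, `barks` and `Barking` reduce to the same token in this sketch and therefore match.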

The following table lists all supported languages, the corresponding language codes, and whether stemming and stop word filtering are supported.

|  Language  | Language Code | Stemming | Stop words |
| :--------: | :-----------: | :------: | :--------: |
|  Arabic    |      ar       |    ✓     |     ✓      |
|  Armenian  |      hy       |          |     ✓      |
|  Basque    |      eu       |          |     ✓      |
|  Bulgarian |      bg       |          |     ✓      |
|  Catalan   |      ca       |          |     ✓      |
|  Chinese   |      zh       |    ✓     |     ✓      |
|  Czech     |      cs       |          |     ✓      |
|  Danish    |      da       |    ✓     |     ✓      |
|  Dutch     |      nl       |    ✓     |     ✓      |
|  English   |      en       |    ✓     |     ✓      |
|  Finnish   |      fi       |    ✓     |     ✓      |
|  French    |      fr       |    ✓     |     ✓      |
|  Gaelic    |      ga       |          |     ✓      |
|  Galician  |      gl       |          |     ✓      |
|  German    |      de       |    ✓     |     ✓      |
|  Greek     |      el       |          |     ✓      |
|  Hindi     |      hi       |    ✓     |     ✓      |
|  Hungarian |      hu       |    ✓     |     ✓      |
| Indonesian |      id       |          |     ✓      |
|  Italian   |      it       |    ✓     |     ✓      |
|  Japanese  |      ja       |    ✓     |     ✓      |
|  Korean    |      ko       |    ✓     |     ✓      |
|  Norwegian |      no       |    ✓     |     ✓      |
|  Persian   |      fa       |          |     ✓      |
| Portuguese |      pt       |    ✓     |     ✓      |
|  Romanian  |      ro       |    ✓     |     ✓      |
|  Russian   |      ru       |    ✓     |     ✓      |
|  Spanish   |      es       |    ✓     |     ✓      |
|  Swedish   |      sv       |    ✓     |     ✓      |
|  Turkish   |      tr       |    ✓     |     ✓      |


Query Example: All names that have `dog`, `dogs`, `bark`, `barks`, `barking`, etc.  Stop word removal eliminates `the` and `which`.

{{< runnable >}}
{
  movie(func:alloftext(name@en, "the dog which barks")) {
    name@en
  }
}
{{< /runnable >}}


### Inequality

#### equal to

Syntax Examples:

* `eq(predicate, value)`
* `eq(val(varName), value)`
* `eq(predicate, val(varName))`
* `eq(count(predicate), value)`
* `eq(predicate, [val1, val2, ..., valN])`
* `eq(predicate, [$var1, "value", ..., $varN])`

Schema Types: `int`, `float`, `bool`, `string`, `dateTime`

Index Required: An index is required for the `eq(predicate, ...)` forms (see table below).  For `count(predicate)` at the query root, the `@count` index is required.
For variables the values have been calculated as part of the query, so no index is required.

| Type       | Index Options |
|:-----------|:--------------|
| `int`      | `int`         |
| `float`    | `float`       |
| `bool`     | `bool`        |
| `string`   | `exact`, `hash` |
| `dateTime` | `dateTime`    |

Tests whether a predicate or variable equals a value, or equals any value in a list.

The boolean constants are `true` and `false`, so with `eq` this becomes, for example, `eq(boolPred, true)`.

Query Example: Movies with exactly thirteen genres.

{{< runnable >}}
{
  me(func: eq(count(genre), 13)) {
    name@en
    genre {
      name@en
    }
  }
}
{{< /runnable >}}


Query Example: Directors called Steven who have directed 1, 2 or 3 movies.

{{< runnable >}}
{
  steve as var(func: allofterms(name@en, "Steven")) {
    films as count(director.film)
  }

  stevens(func: uid(steve)) @filter(eq(val(films), [1,2,3])) {
    name@en
    numFilms : val(films)
  }
}
{{< /runnable >}}


#### less than, less than or equal to, greater than and greater than or equal to

Syntax Examples: for inequality `IE`

* `IE(predicate, value)`
* `IE(val(varName), value)`
* `IE(predicate, val(varName))`
* `IE(count(predicate), value)`

With `IE` replaced by

* `le` less than or equal to
* `lt` less than
* `ge` greater than or equal to
* `gt` greater than

Schema Types: `int`, `float`, `string`, `dateTime`

Index required: An index is required for the `IE(predicate, ...)` forms (see table below).  For `count(predicate)` at the query root, the `@count` index is required. For variables the values have been calculated as part of the query, so no index is required.

| Type       | Index Options |
|:-----------|:--------------|
| `int`      | `int`         |
| `float`    | `float`       |
| `string`   | `exact`       |
| `dateTime` | `dateTime`    |


Query Example: Ridley Scott movies released before 1980.

{{< runnable >}}
{
  me(func: eq(name@en, "Ridley Scott")) {
    name@en
    director.film @filter(lt(initial_release_date, "1980-01-01"))  {
      initial_release_date
      name@en
    }
  }
}
{{< /runnable >}}


Query Example: Directors with `Steven` in `name` who have directed more than `100` actors.

{{< runnable >}}
{
  ID as var(func: allofterms(name@en, "Steven")) {
    director.film {
      num_actors as count(starring)
    }
    total as sum(val(num_actors))
  }

  dirs(func: uid(ID)) @filter(gt(val(total), 100)) {
    name@en
    total_actors : val(total)
  }
}
{{< /runnable >}}



Query Example: A movie in each genre that has over 30000 movies.  Because there is no order specified on genres, the order will be by UID.  The [count index]({{< relref "#count-index">}}) records the number of edges out of nodes and makes such queries more efficient.

{{< runnable >}}
{
  genre(func: gt(count(~genre), 30000)){
    name@en
    ~genre (first:1) {
      name@en
    }
  }
}
{{< /runnable >}}

Query Example: Steven Spielberg's movies with an `initial_release_date` on or after
that of the movie Minority Report.

{{< runnable >}}
{
  var(func: eq(name@en,"Minority Report")) {
    d as initial_release_date
  }

  me(func: eq(name@en, "Steven Spielberg")) {
    name@en
    director.film @filter(ge(initial_release_date, val(d))) {
      initial_release_date
      name@en
    }
  }
}
{{< /runnable >}}


### uid

Syntax Examples:

* `q(func: uid(<uid>))`
* `predicate @filter(uid(<uid1>, ..., <uidn>))`
* `predicate @filter(uid(a))` for variable `a`
* `q(func: uid(a,b))` for variables `a` and `b`


Filters nodes at the current query level to only nodes in the given set of UIDs.

For query variable `a`, `uid(a)` represents the set of UIDs stored in `a`.  For value variable `b`, `uid(b)` represents the UIDs from the UID to value map.  With two or more variables, `uid(a,b,...)` represents the union of all the variables.

`uid(<uid>)`, like an identity function, will return the requested UID even if the node does not have any edges.

Query Example: If the UID of a node is known, values for the node can be read directly.  The films of Priyanka Chopra by known UID.

{{< runnable >}}
{
  films(func: uid(0x2c964)) {
    name@hi
    actor.film {
      performance.film {
        name@hi
      }
    }
  }
}
{{< /runnable >}}



Query Example: The films of Taraji Henson by genre.
{{< runnable >}}
{
  var(func: allofterms(name@en, "Taraji Henson")) {
    actor.film {
      F as performance.film {
        G as genre
      }
    }
  }

  Taraji_films_by_genre(func: uid(G)) {
    genre_name : name@en
    films : ~genre @filter(uid(F)) {
      film_name : name@en
    }
  }
}
{{< /runnable >}}



Query Example: Taraji Henson films ordered by number of genres, with genres listed in order of how many films Taraji has made in each genre.
{{< runnable >}}
{
  var(func: allofterms(name@en, "Taraji Henson")) {
    actor.film {
      F as performance.film {
        G as count(genre)
        genre {
          C as count(~genre @filter(uid(F)))
        }
      }
    }
  }

  Taraji_films_by_genre_count(func: uid(G), orderdesc: val(G)) {
    film_name : name@en
    genres : genre (orderdesc: val(C)) {
      genre_name : name@en
    }
  }
}
{{< /runnable >}}


### uid_in


Syntax Examples:

* `q(func: ...) @filter(uid_in(predicate, <uid>))`
* `predicate1 @filter(uid_in(predicate2, <uid>))`

Schema Types: UID

Index Required: none

While the `uid` function filters nodes at the current level based on UID, function `uid_in` allows looking ahead along an edge to check that it leads to a particular UID.  This can often save an extra query block and avoids returning the edge.

`uid_in` cannot be used at root; it accepts one UID constant as its argument (not a variable).


Query Example: The collaborations of Marc Caro and Jean-Pierre Jeunet (UID 0x99706).  If the UID of Jean-Pierre Jeunet is known, querying this way removes the need to have a block extracting his UID into a variable and the extra edge traversal and filter for `~director.film`.
{{< runnable >}}
{
  caro(func: eq(name@en, "Marc Caro")) {
    name@en
    director.film @filter(uid_in(~director.film, 0x99706)) {
      name@en
    }
  }
}
{{< /runnable >}}


### has

Syntax Examples: `has(predicate)`

Schema Types: all

Determines if a node has a particular predicate.

Query Example: First five directors and all their movies that have a release date recorded.  Directors have directed at least one film, which is equivalent to `gt(count(director.film), 0)`.
{{< runnable >}}
{
  me(func: has(director.film), first: 5) {
    name@en
    director.film @filter(has(initial_release_date))  {
      initial_release_date
      name@en
    }
  }
}
{{< /runnable >}}

### Geolocation

{{% notice "note" %}} As of now we only support indexing Point, Polygon and MultiPolygon [geometry types](https://github.com/twpayne/go-geom#geometry-types). However, Dgraph can store other types of geolocation data. {{% /notice %}}

Note that for geo queries, any polygon with holes is replaced with the outer loop, ignoring holes.  Also, as of version 0.7.7 polygon containment checks are approximate.

#### Mutations

To make use of the geo functions you need an index on your predicate.
```
loc: geo @index(geo) .
```

Here is how you would add a `Point`.

```
{
  set {
    <_:0xeb1dde9c> <loc> "{'type':'Point','coordinates':[-122.4220186,37.772318]}"^^<geo:geojson> .
    <_:0xeb1dde9c> <name> "Hamon Tower" .
    <_:0xeb1dde9c> <dgraph.type> "Location" .
  }
}
```

Here is how you would associate a `Polygon` with a node. Adding a `MultiPolygon` is similar.

```
{
  set {
    <_:0xf76c276b> <loc> "{'type':'Polygon','coordinates':[[[-122.409869,37.7785442],[-122.4097444,37.7786443],[-122.4097544,37.7786521],[-122.4096334,37.7787494],[-122.4096233,37.7787416],[-122.4094004,37.7789207],[-122.4095818,37.7790617],[-122.4097883,37.7792189],[-122.4102599,37.7788413],[-122.409869,37.7785442]],[[-122.4097357,37.7787848],[-122.4098499,37.778693],[-122.4099025,37.7787339],[-122.4097882,37.7788257],[-122.4097357,37.7787848]]]}"^^<geo:geojson> .
    <_:0xf76c276b> <name> "Best Western Americana Hotel" .
    <_:0xf76c276b> <dgraph.type> "Location" .
  }
}
```

The above examples have been picked from our [SF Tourism](https://github.com/dgraph-io/benchmarks/blob/master/data/sf.tourism.gz?raw=true) dataset.
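
When geo mutations are generated from application data, the main pitfalls are quoting and coordinate order. The following Python sketch builds N-Quad lines in the same shape as the examples above. It only formats strings and does not talk to Dgraph; the helper names `geojson_literal`, `point_nquad` and `polygon_nquad` are hypothetical and chosen for this illustration.

```python
import json
from typing import List, Sequence


def geojson_literal(geometry: dict) -> str:
    """Serialise a GeoJSON geometry as compact JSON with single quotes,
    so it can sit inside a double-quoted RDF literal as in the examples."""
    return json.dumps(geometry, separators=(",", ":")).replace('"', "'")


def point_nquad(blank_node: str, predicate: str, lon: float, lat: float) -> str:
    """Build one N-Quad for a Point; GeoJSON order is longitude first, then latitude."""
    geometry = {"type": "Point", "coordinates": [lon, lat]}
    return f'<_:{blank_node}> <{predicate}> "{geojson_literal(geometry)}"^^<geo:geojson> .'


def polygon_nquad(blank_node: str, predicate: str,
                  rings: Sequence[Sequence[Sequence[float]]]) -> str:
    """Build one N-Quad for a Polygon, closing each ring if the caller did not."""
    closed: List[List[List[float]]] = []
    for ring in rings:
        points = [list(p) for p in ring]
        if points[0] != points[-1]:
            points.append(points[0])  # GeoJSON rings end on their first position
        closed.append(points)
    geometry = {"type": "Polygon", "coordinates": closed}
    return f'<_:{blank_node}> <{predicate}> "{geojson_literal(geometry)}"^^<geo:geojson> .'


print(point_nquad("0xeb1dde9c", "loc", -122.4220186, 37.772318))
```

The printed line matches the `Point` triple for "Hamon Tower" shown earlier; the remaining triples (`name`, `dgraph.type`) are plain strings and need no special handling.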

#### Query

##### near

Syntax Example: `near(predicate, [long, lat], distance)`

Schema Types: `geo`

Index Required: `geo`

Matches all entities where the location given by `predicate` is within `distance` meters of geojson coordinate `[long, lat]`.

Query Example: Tourist destinations within 1000 meters (1 kilometer) of a point in Golden Gate Park in San Francisco.

{{< runnable >}}
{
  tourist(func: near(loc, [-122.469829, 37.771935], 1000) ) {
    name
  }
}
{{< /runnable >}}


##### within

Syntax Example: `within(predicate, [[[long1, lat1], ..., [longN, latN]]])`

Schema Types: `geo`

Index Required: `geo`

Matches all entities where the location given by `predicate` lies within the polygon specified by the geojson coordinate array.

Query Example: Tourist destinations within the specified area of Golden Gate Park, San Francisco.

{{< runnable >}}
{
  tourist(func: within(loc, [[[-122.47266769409178, 37.769018558337926 ], [ -122.47266769409178, 37.773699921075135 ], [ -122.4651575088501, 37.773699921075135 ], [ -122.4651575088501, 37.769018558337926 ], [ -122.47266769409178, 37.769018558337926]]] )) {
    name
  }
}
{{< /runnable >}}


##### contains

Syntax Examples: `contains(predicate, [long, lat])` or `contains(predicate, [[long1, lat1], ..., [longN, latN]])`

Schema Types: `geo`

Index Required: `geo`

Matches all entities where the polygon describing the location given by `predicate` contains the geojson coordinate `[long, lat]` or the given geojson polygon.

Query Example: All entities that contain a point in the flamingo enclosure of San Francisco Zoo.
{{< runnable >}}
{
  tourist(func: contains(loc, [ -122.50326097011566, 37.73353615592843 ] )) {
    name
  }
}
{{< /runnable >}}


##### intersects

Syntax Example: `intersects(predicate, [[[long1, lat1], ..., [longN, latN]]])`

Schema Types: `geo`

Index Required: `geo`

Matches all entities where the polygon describing the location given by `predicate` intersects the given geojson polygon.


{{< runnable >}}
{
  tourist(func: intersects(loc, [[[-122.503325343132, 37.73345766902749 ], [ -122.503325343132, 37.733903134117966 ], [ -122.50271648168564, 37.733903134117966 ], [ -122.50271648168564, 37.73345766902749 ], [ -122.503325343132, 37.73345766902749]]] )) {
    name
  }
}
{{< /runnable >}}



## Connecting Filters

Within `@filter` multiple functions can be used with boolean connectives.

### AND, OR and NOT

Connectives `AND`, `OR` and `NOT` join filters and can be built into arbitrarily complex filters, such as `(NOT A OR B) AND (C AND NOT (D OR E))`.  Note that `NOT` binds more tightly than `AND`, which binds more tightly than `OR`.

Query Example: All Steven Spielberg movies that contain either both "indiana" and "jones" OR both "jurassic" and "park".

{{< runnable >}}
{
  me(func: eq(name@en, "Steven Spielberg")) @filter(has(director.film)) {
    name@en
    director.film @filter(allofterms(name@en, "jones indiana") OR allofterms(name@en, "jurassic park"))  {
      uid
      name@en
    }
  }
}
{{< /runnable >}}


## Alias

Syntax Examples:

* `aliasName : predicate`
* `aliasName : predicate { ... }`
* `aliasName : varName as ...`
* `aliasName : count(predicate)`
* `aliasName : max(val(varName))`

An alias provides an alternate name in results.  Predicates, variables and aggregates can be aliased by prefixing with the alias name and `:`.
Aliases do not have to be different from the original predicate name, but, within a block, an alias must be distinct from predicate names and other aliases returned in the same block.  Aliases can be used to return the same predicate multiple times within a block.



Query Example: Directors with `name` matching term `Steven`, their UID, English name, average number of actors per movie, total number of films, and the name of each film in English and French.
{{< runnable >}}
{
  ID as var(func: allofterms(name@en, "Steven")) @filter(has(director.film)) {
    director.film {
      num_actors as count(starring)
    }
    average as avg(val(num_actors))
  }

  films(func: uid(ID)) {
    director_id : uid
    english_name : name@en
    average_actors : val(average)
    num_films : count(director.film)

    films : director.film {
      name : name@en
      english_name : name@en
      french_name : name@fr
    }
  }
}
{{< /runnable >}}


## Pagination

Pagination allows returning only a portion of the result set rather than the whole.  This can be useful for top-k style queries, to reduce the size of the result set for client side processing, or to allow paged access to results.

Pagination is often used with [sorting]({{< relref "#sorting">}}).

{{% notice "note" %}}Without a sort order specified, the results are sorted by `uid`, which is assigned randomly. So the ordering, while deterministic, might not be what you expected.{{% /notice %}}

### First

Syntax Examples:

* `q(func: ..., first: N)`
* `predicate (first: N) { ... }`
* `predicate @filter(...) (first: N) { ... }`

For positive `N`, `first: N` retrieves the first `N` results, by sorted or UID order.

For negative `N`, `first: N` retrieves the last `N` results, by sorted or UID order.  Currently, negative is only supported when no order is applied.
To achieve the effect of a negative `N` with a sort, reverse the order of the sort and use a positive `N`.


Query Example: Last two films, by UID order, directed by Steven Spielberg and the first three genres of those movies, sorted alphabetically by English name.

{{< runnable >}}
{
  me(func: allofterms(name@en, "Steven Spielberg")) {
    director.film (first: -2) {
      name@en
      initial_release_date
      genre (orderasc: name@en) (first: 3) {
        name@en
      }
    }
  }
}
{{< /runnable >}}



Query Example: The three directors named Steven who have directed the most actors of all directors named Steven.

{{< runnable >}}
{
  ID as var(func: allofterms(name@en, "Steven")) @filter(has(director.film)) {
    director.film {
      stars as count(starring)
    }
    totalActors as sum(val(stars))
  }

  mostStars(func: uid(ID), orderdesc: val(totalActors), first: 3) {
    name@en
    stars : val(totalActors)

    director.film {
      name@en
    }
  }
}
{{< /runnable >}}

### Offset

Syntax Examples:

* `q(func: ..., offset: N)`
* `predicate (offset: N) { ... }`
* `predicate (first: M, offset: N) { ... }`
* `predicate @filter(...) (offset: N) { ... }`

With `offset: N` the first `N` results are not returned. Used in combination with `first`, `first: M, offset: N` skips over `N` results and returns the following `M`.

Query Example: Order Hark Tsui's films by English title, skip over the first 4 and return the following 6.

{{< runnable >}}
{
  me(func: allofterms(name@en, "Hark Tsui")) {
    name@zh
    name@en
    director.film (orderasc: name@en) (first:6, offset:4) {
      genre {
        name@en
      }
      name@zh
      name@en
      initial_release_date
    }
  }
}
{{< /runnable >}}

### After

Syntax Examples:

* `q(func: ..., after: UID)`
* `predicate (first: N, after: UID) { ... }`
* `predicate @filter(...) (first: N, after: UID) { ... }`

Another way to get results after skipping over some results is to use the default UID ordering and skip directly past a node specified by UID. For example, a first query could be of the form `predicate (after: 0x0, first: N)`, or just `predicate (first: N)`, with subsequent queries of the form `predicate(after: <uid of last entity in last result>, first: N)`.


Query Example: The first five of Baz Luhrmann's films, sorted by UID order.

{{< runnable >}}
{
  me(func: allofterms(name@en, "Baz Luhrmann")) {
    name@en
    director.film (first:5) {
      uid
      name@en
    }
  }
}
{{< /runnable >}}

The fifth movie is the Australian movie classic Strictly Ballroom. It has UID `0x99e44`. The results after Strictly Ballroom can now be obtained with `after`.

{{< runnable >}}
{
  me(func: allofterms(name@en, "Baz Luhrmann")) {
    name@en
    director.film (first:5, after: 0x99e44) {
      uid
      name@en
    }
  }
}
{{< /runnable >}}


## Count

Syntax Examples:

* `count(predicate)`
* `count(uid)`

The form `count(predicate)` counts how many `predicate` edges lead out of a node.

The form `count(uid)` counts the number of UIDs matched in the enclosing block.

Query Example: The number of films acted in by each actor with `Orlando` in their name.

{{< runnable >}}
{
  me(func: allofterms(name@en, "Orlando")) @filter(has(actor.film)) {
    name@en
    count(actor.film)
  }
}
{{< /runnable >}}

Count can be used at root and [aliased]({{< relref "#alias">}}).

Query Example: Count of directors who have directed more than five films. When used at the query root, the [count index]({{< relref "#count-index">}}) is required.

{{< runnable >}}
{
  directors(func: gt(count(director.film), 5)) {
    totalDirectors : count(uid)
  }
}
{{< /runnable >}}


Count can be assigned to a [value variable]({{< relref "#value-variables">}}).

Query Example: The actors of Ang Lee's "Eat Drink Man Woman" ordered by the number of movies acted in.

{{< runnable >}}
{
  var(func: allofterms(name@en, "eat drink man woman")) {
    starring {
      actors as performance.actor {
        totalRoles as count(actor.film)
      }
    }
  }

  edmw(func: uid(actors), orderdesc: val(totalRoles)) {
    name@en
    name@zh
    totalRoles : val(totalRoles)
  }
}
{{< /runnable >}}


## Sorting

Syntax Examples:

* `q(func: ..., orderasc: predicate)`
* `q(func: ..., orderdesc: val(varName))`
* `predicate (orderdesc: predicate) { ... }`
* `predicate @filter(...) (orderasc: predicate) { ... }`
* `q(func: ..., orderasc: predicate1, orderdesc: predicate2)`

Sortable Types: `int`, `float`, `string`, `dateTime`, `default`

Results can be sorted in ascending order (`orderasc`) or descending order (`orderdesc`) by a predicate or variable.

For predicates with [sortable indices]({{< relref "#sortable-indices">}}), Dgraph runs two sorts in parallel, one on the values and one using the index, and returns whichever result is computed first.

Sorted queries retrieve up to 1000 results by default. This can be changed with [first]({{< relref "#first">}}).


Query Example: French director Jean-Pierre Jeunet's movies sorted by release date.

{{< runnable >}}
{
  me(func: allofterms(name@en, "Jean-Pierre Jeunet")) {
    name@fr
    director.film(orderasc: initial_release_date) {
      name@fr
      name@en
      initial_release_date
    }
  }
}
{{< /runnable >}}

Sorting can be performed at root and on value variables.

Query Example: All genres sorted alphabetically and the five movies in each genre with the most genres.

{{< runnable >}}
{
  genres as var(func: has(~genre)) {
    ~genre {
      numGenres as count(genre)
    }
  }

  genres(func: uid(genres), orderasc: name@en) {
    name@en
    ~genre (orderdesc: val(numGenres), first: 5) {
      name@en
      genres : val(numGenres)
    }
  }
}
{{< /runnable >}}

Sorting can also be performed by multiple predicates as shown below. If the values are equal for the
first predicate, then they are sorted by the second predicate and so on.

Query Example: Find all nodes which have type `Person`, sort them by their `first_name` and among those
that have the same `first_name` sort them by `last_name` in descending order.

```
{
  me(func: type("Person"), orderasc: first_name, orderdesc: last_name) {
    first_name
    last_name
  }
}
```

## Multiple Query Blocks

Inside a single query, multiple query blocks are allowed. The result is all blocks with corresponding block names.

Multiple query blocks are executed in parallel.

The blocks need not be related in any way.

Query Example: All of Angelina Jolie's films, with genres, and Peter Jackson's films since 2008.

{{< runnable >}}
{
  AngelinaInfo(func:allofterms(name@en, "angelina jolie")) {
    name@en
    actor.film {
      performance.film {
        genre {
          name@en
        }
      }
    }
  }

  DirectorInfo(func: eq(name@en, "Peter Jackson")) {
    name@en
    director.film @filter(ge(initial_release_date, "2008")) {
      Release_date: initial_release_date
      Name: name@en
    }
  }
}
{{< /runnable >}}


If queries contain some overlap in answers, the result sets are still independent.

Query Example: The movies Mackenzie Crook has acted in and the movies Jack Davenport has acted in. The result sets overlap because both have acted in the Pirates of the Caribbean movies, but the results are independent and both contain the full answer sets.

{{< runnable >}}
{
  Mackenzie(func:allofterms(name@en, "Mackenzie Crook")) {
    name@en
    actor.film {
      performance.film {
        uid
        name@en
      }
      performance.character {
        name@en
      }
    }
  }

  Jack(func:allofterms(name@en, "Jack Davenport")) {
    name@en
    actor.film {
      performance.film {
        uid
        name@en
      }
      performance.character {
        name@en
      }
    }
  }
}
{{< /runnable >}}


### Var Blocks

Var blocks start with the keyword `var` and are not returned in the query results.

Query Example: Angelina Jolie's movies ordered by genre.

{{< runnable >}}
{
  var(func:allofterms(name@en, "angelina jolie")) {
    name@en
    actor.film {
      A AS performance.film {
        B AS genre
      }
    }
  }

  films(func: uid(B), orderasc: name@en) {
    name@en
    ~genre @filter(uid(A)) {
      name@en
    }
  }
}
{{< /runnable >}}


## Query Variables

Syntax Examples:

* `varName as q(func: ...) { ... }`
* `varName as var(func: ...) { ... }`
* `varName as predicate { ... }`
* `varName as predicate @filter(...) { ... }`

Types: `uid`

Nodes (UIDs) matched at one place in a query can be stored in a variable and used elsewhere. Query variables can be used in other query blocks or in a child node of the defining block.

Query variables do not affect the semantics of the query at the point of definition. Query variables are evaluated to all nodes matched by the defining block.

In general, query blocks are executed in parallel, but variables impose an evaluation order on some blocks. Cycles induced by variable dependence are not permitted.

If a variable is defined, it must be used elsewhere in the query.

A query variable is used by extracting the UIDs in it with `uid(var-name)`.

The syntax `func: uid(A,B)` or `@filter(uid(A,B))` means the union of UIDs for variables `A` and `B`.

Query Example: The movies of Angelina Jolie and Brad Pitt where both have acted in movies in the same genre. Note that `B` and `D` match all genres for all movies, not genres per movie.
{{< runnable >}}
{
  var(func:allofterms(name@en, "angelina jolie")) {
    actor.film {
      A AS performance.film {  # All films acted in by Angelina Jolie
        B as genre  # Genres of all the films acted in by Angelina Jolie
      }
    }
  }

  var(func:allofterms(name@en, "brad pitt")) {
    actor.film {
      C AS performance.film {  # All films acted in by Brad Pitt
        D as genre  # Genres of all the films acted in by Brad Pitt
      }
    }
  }

  films(func: uid(D)) @filter(uid(B)) {   # Genres from both Angelina and Brad
    name@en
    ~genre @filter(uid(A, C)) {  # Movies in either A or C.
      name@en
    }
  }
}
{{< /runnable >}}


## Value Variables

Syntax Examples:

* `varName as scalarPredicate`
* `varName as count(predicate)`
* `varName as avg(...)`
* `varName as math(...)`

Types: `int`, `float`, `string`, `dateTime`, `default`, `geo`, `bool`

Value variables store scalar values. Value variables are a map from the UIDs of the enclosing block to the corresponding values.

It therefore only makes sense to use the values from a value variable in a context that matches the same UIDs - if used in a block matching different UIDs the value variable is undefined.
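
One way to picture this mapping, as a rough mental model rather than a description of Dgraph's internals, is a dictionary keyed by UID. The Python sketch below uses invented UIDs and counts; the names `roles` and `val` are hypothetical stand-ins for a value variable and the `val(...)` lookup.

```python
# Hypothetical model: a value variable as a UID -> value mapping.
# Pretend `roles as count(actor.film)` was evaluated in a block that
# matched three actor nodes (the UIDs and counts are invented).
roles = {0x1: 12, 0x2: 7, 0x3: 30}


def val(var, uid):
    """Look up the variable for one UID; None models 'undefined'."""
    return var.get(uid)


# A block matching the same UIDs can read every value.
same_block = {uid: val(roles, uid) for uid in (0x1, 0x2, 0x3)}

# A block matching a different UID finds nothing to read.
other_block = val(roles, 0x99)
```

In this model, `same_block` carries a value for every UID, while `other_block` is `None` because UID `0x99` was never matched by the defining block.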

It is an error to define a value variable but not use it elsewhere in the query.

Value variables are used by extracting the values with `val(var-name)`, or by extracting the UIDs with `uid(var-name)`.

[Facet]({{< relref "#facets-edge-attributes">}}) values can be stored in value variables.

Query Example: The number of movie roles played by the actors of the 80's classic "The Princess Bride". Query variable `pbActors` matches the UIDs of all actors from the movie. Value variable `roles` is thus a map from actor UID to number of roles. Value variable `roles` can be used in the `totalRoles` query block because that query block also matches the `pbActors` UIDs, so the actor to number of roles map is available.

{{< runnable >}}
{
  var(func:allofterms(name@en, "The Princess Bride")) {
    starring {
      pbActors as performance.actor {
        roles as count(actor.film)
      }
    }
  }
  totalRoles(func: uid(pbActors), orderasc: val(roles)) {
    name@en
    numRoles : val(roles)
  }
}
{{< /runnable >}}


Value variables can be used in place of UID variables by extracting the UID list from the map.

Query Example: The same query as the previous example, but using value variable `roles` for matching UIDs in the `totalRoles` query block.

{{< runnable >}}
{
  var(func:allofterms(name@en, "The Princess Bride")) {
    starring {
      performance.actor {
        roles as count(actor.film)
      }
    }
  }
  totalRoles(func: uid(roles), orderasc: val(roles)) {
    name@en
    numRoles : val(roles)
  }
}
{{< /runnable >}}


### Variable Propagation

Like query variables, value variables can be used in other query blocks and in blocks nested within the defining block.
When used in a block nested within the block that defines the variable, the value is computed as a sum of the variable for parent nodes along all paths to the point of use. This is called variable propagation.

For example:
```
{
  q(func: uid(0x01)) {
    myscore as math(1)          # A
    friends {                   # B
      friends {                 # C
        ...myscore...
      }
    }
  }
}
```
At line A, the value variable `myscore` is defined, mapping the node with UID `0x01` to the value 1. At B, the value for each friend is still 1: there is only one path to each friend. Traversing the friend edge twice reaches the friends of friends. The variable `myscore` is propagated such that each friend of a friend receives the sum of its parents' values: if a friend of a friend is reachable from only one friend, the value is still 1; if they are reachable from two friends, the value is 2; and so on. That is, the value of `myscore` for each friend of friends inside the block marked C will be the number of paths to them.

**The value that a node receives for a propagated variable is the sum of the values of all its parent nodes.**

This propagation is useful, for example, in normalizing a sum across users, finding the number of paths between nodes and accumulating a sum through a graph.



Query Example: For each Harry Potter movie, the number of roles played by actor Warwick Davis.
{{< runnable >}}
{
  num_roles(func: eq(name@en, "Warwick Davis")) @cascade @normalize {

    paths as math(1)  # records number of paths to each character

    actor : name@en

    actor.film {
      performance.film @filter(allofterms(name@en, "Harry Potter")) {
        film_name : name@en
        characters : math(paths)  # how many paths (i.e. characters) reach this film
      }
    }
  }
}
{{< /runnable >}}


Query Example: Each actor who has been in a Peter Jackson movie and the fraction of Peter Jackson movies they have appeared in.
{{< runnable >}}
{
  movie_fraction(func:eq(name@en, "Peter Jackson")) @normalize {

    paths as math(1)
    total_films : num_films as count(director.film)
    director : name@en

    director.film {
      starring {
        performance.actor {
          # num_films is propagated along every path too, so
          # num_films/paths recovers the director's film count
          fraction : math(paths / (num_films/paths))
          actor : name@en
        }
      }
    }
  }
}
{{< /runnable >}}

More examples can be found in two Dgraph blog posts about using variable propagation for recommendation engines ([post 1](https://open.dgraph.io/post/recommendation/), [post 2](https://open.dgraph.io/post/recommendation2/)).

## Aggregation

Syntax Example: `AG(val(varName))`

For `AG` replaced with

* `min` : select the minimum value in the value variable `varName`
* `max` : select the maximum value
* `sum` : sum all values in value variable `varName`
* `avg` : calculate the average of values in `varName`

Schema Types:

| Aggregation | Schema Types |
|:-----------|:--------------|
| `min` / `max` | `int`, `float`, `string`, `dateTime`, `default` |
| `sum` / `avg` | `int`, `float` |

Aggregation can only be applied to [value variables]({{< relref "#value-variables">}}). An index is not required (the values have already been found and stored in the value variable mapping).

An aggregation is applied at the query block enclosing the variable definition. As opposed to query variables and value variables, which are global, aggregation is computed locally. For example:
```
A as predicateA {
  ...
  B as predicateB {
    x as ...some value...
  }
  min(val(x))
}
```
Here, `A` and `B` are the lists of all UIDs that match these blocks. Value variable `x` is a mapping from UIDs in `B` to values. The aggregation `min(val(x))`, however, is computed for each UID in `A`. That is, for each UID in `A`, take the slice of `x` that corresponds to that UID's outgoing `predicateB` edges and compute the aggregation over those values.

Aggregations can themselves be assigned to value variables, making a UID to aggregation map.


### Min

#### Usage at Root

Query Example: Get the min initial release date for any Harry Potter movie.

The release date is assigned to a variable, then it is aggregated and fetched in an empty block.
{{< runnable >}}
{
  var(func: allofterms(name@en, "Harry Potter")) {
    d as initial_release_date
  }
  me() {
    min(val(d))
  }
}
{{< /runnable >}}

#### Usage at other levels

Query Example: Directors called Steven and the date of release of their first movie, in ascending order of first movie.

{{< runnable >}}
{
  stevens as var(func: allofterms(name@en, "steven")) {
    director.film {
      ird as initial_release_date
      # ird is a value variable mapping a film UID to its release date
    }
    minIRD as min(val(ird))
    # minIRD is a value variable mapping a director UID to their first release date
  }

  byIRD(func: uid(stevens), orderasc: val(minIRD)) {
    name@en
    firstRelease: val(minIRD)
  }
}
{{< /runnable >}}

### Max

#### Usage at Root

Query Example: Get the max initial release date for any Harry Potter movie.

The release date is assigned to a variable, then it is aggregated and fetched in an empty block.
{{< runnable >}}
{
  var(func: allofterms(name@en, "Harry Potter")) {
    d as initial_release_date
  }
  me() {
    max(val(d))
  }
}
{{< /runnable >}}

#### Usage at other levels

Query Example: Quentin Tarantino's movies and date of release of the most recent movie.

{{< runnable >}}
{
  director(func: allofterms(name@en, "Quentin Tarantino")) {
    director.film {
      name@en
      x as initial_release_date
    }
    max(val(x))
  }
}
{{< /runnable >}}

### Sum and Avg

#### Usage at Root

Query Example: Get the sum and average of the number of movies directed by people who have
Steven or Tom in their name.

{{< runnable >}}
{
  var(func: anyofterms(name@en, "Steven Tom")) {
    a as count(director.film)
  }

  me() {
    avg(val(a))
    sum(val(a))
  }
}
{{< /runnable >}}

#### Usage at other levels

Query Example: Steven Spielberg's movies, with the number of recorded genres per movie, and the total number of genres and average genres per movie.

{{< runnable >}}
{
  director(func: eq(name@en, "Steven Spielberg")) {
    name@en
    director.film {
      name@en
      numGenres : g as count(genre)
    }
    totalGenres : sum(val(g))
    genresPerMovie : avg(val(g))
  }
}
{{< /runnable >}}


### Aggregating Aggregates

Aggregations can be assigned to value variables, and so these variables can in turn be aggregated.

Query Example: For each actor in a Peter Jackson film, find the number of roles played in any movie. Sum these to find the total number of roles ever played by all actors in the movie. Then sum the lot to find the total number of roles ever played by actors who have appeared in Peter Jackson movies.
This demonstrates how to aggregate aggregates; the answer is not quite precise, though, because actors who have appeared in multiple Peter Jackson movies are counted more than once.

{{< runnable >}}
{
  PJ as var(func:allofterms(name@en, "Peter Jackson")) {
    director.film {
      starring {  # starring an actor
        performance.actor {
          movies as count(actor.film)
          # number of roles for this actor
        }
        perf_total as sum(val(movies))
      }
      movie_total as sum(val(perf_total))
      # total roles for all actors in this movie
    }
    gt as sum(val(movie_total))
  }

  PJmovies(func: uid(PJ)) {
    name@en
    director.film (orderdesc: val(movie_total), first: 5) {
      name@en
      totalRoles : val(movie_total)
    }
    grandTotal : val(gt)
  }
}
{{< /runnable >}}


## Math on value variables

Value variables can be combined using mathematical functions. For example, this can be used to compute a score that is then used for ordering or other operations, as when building news feeds or simple recommendation systems.

Math statements must be enclosed within `math( <exp> )` and must be stored to a value variable.
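
As a rough mental model (an illustration only, not Dgraph's implementation), a `math(...)` expression over value variables can be read as a per-UID computation: for each UID, the operands are looked up and the expression is evaluated once. The Python sketch below models a statement of the form `score as math(p + q + r)` using invented UIDs and counts, and assumes every operand is defined for the same UIDs.

```python
# Hypothetical per-film counts keyed by film UID (all values invented).
p = {0xA: 10, 0xB: 4}  # models: p as count(starring)
q = {0xA: 3, 0xB: 2}   # models: q as count(genre)
r = {0xA: 1, 0xB: 1}   # models: r as count(country)

# models: score as math(p + q + r), evaluated once per UID.
# Assumption of this sketch: every operand is defined for the same UIDs.
score = {uid: p[uid] + q[uid] + r[uid] for uid in p}

# models: orderdesc: val(score)
ranked = sorted(score, key=score.get, reverse=True)
```

The result is itself a UID to value mapping, which is why it can be stored in a value variable and then used for ordering.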

The supported operators are as follows:

| Operators | Types accepted | What it does |
| :------------: | :--------------: | :------------------------: |
| `+` `-` `*` `/` `%` | `int`, `float` | performs the corresponding operation |
| `min` `max` | All types except `geo`, `bool` (binary functions) | selects the smaller/larger of the two values |
| `<` `>` `<=` `>=` `==` `!=` | All types except `geo`, `bool` | Returns true or false based on the values |
| `floor` `ceil` `ln` `exp` `sqrt` | `int`, `float` (unary function) | performs the corresponding operation |
| `since` | `dateTime` | Returns the number of seconds (as a float) since the specified time |
| `pow(a, b)` | `int`, `float` | Returns `a` to the power `b` |
| `logbase(a,b)` | `int`, `float` | Returns `log(a)` to the base `b` |
| `cond(a, b, c)` | first operand must be a boolean | selects `b` if `a` is true else `c` |


Query Example: Form a score for each of Steven Spielberg's movies as the sum of number of actors, number of genres and number of countries. List the top five such movies in order of decreasing score.

{{< runnable >}}
{
  var(func:allofterms(name@en, "steven spielberg")) {
    films as director.film {
      p as count(starring)
      q as count(genre)
      r as count(country)
      score as math(p + q + r)
    }
  }

  TopMovies(func: uid(films), orderdesc: val(score), first: 5){
    name@en
    val(score)
  }
}
{{< /runnable >}}

Value variables and aggregations of them can be used in filters.

Query Example: Calculate a score for each Steven Spielberg movie with a condition on release date to penalize movies that are more than 10 years old, filtering on the resulting score.

{{< runnable >}}
{
  var(func:allofterms(name@en, "steven spielberg")) {
    films as director.film {
      p as count(starring)
      q as count(genre)
      date as initial_release_date
      years as math(since(date)/(365*24*60*60))
      score as math(cond(years > 10, 0, ln(p)+q-ln(years)))
    }
  }

  TopMovies(func: uid(films), orderdesc: val(score)) @filter(gt(val(score), 2)){
    name@en
    val(score)
    val(date)
  }
}
{{< /runnable >}}


Values calculated with math operations are stored to value variables and so can be aggregated.

Query Example: Compute a score for each Steven Spielberg movie and then aggregate the score.

{{< runnable >}}
{
  steven as var(func:eq(name@en, "Steven Spielberg")) @filter(has(director.film)) {
    director.film {
      p as count(starring)
      q as count(genre)
      r as count(country)
      score as math(p + q + r)
    }
    directorScore as sum(val(score))
  }

  score(func: uid(steven)){
    name@en
    val(directorScore)
  }
}
{{< /runnable >}}


## GroupBy

Syntax Examples:

* `q(func: ...) @groupby(predicate) { min(...) }`
* `predicate @groupby(pred) { count(uid) }`


A `groupby` query aggregates query results given a set of properties on which to group elements. For example, a query containing the block `friend @groupby(age) { count(uid) }` finds all nodes reachable along the friend edge, partitions these into groups based on age, then counts how many nodes are in each group. The returned result is the grouped edges and the aggregations.

Inside a `groupby` block, only aggregations are allowed and `count` may only be applied to `uid`.
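
The partition-then-count behaviour described for `friend @groupby(age) { count(uid) }` can be sketched in a few lines of Python. This is only a model of the semantics, with invented UIDs and ages; `friend_age`, `count_by_age` and `groups` are hypothetical names, and the output shape is illustrative rather than Dgraph's exact response format.

```python
from collections import Counter

# Hypothetical nodes reached along the `friend` edge: UID -> age.
friend_age = {0x1: 17, 0x2: 19, 0x3: 17, 0x4: 19, 0x5: 19}

# models: friend @groupby(age) { count(uid) }
# Partition the nodes by age, then count the UIDs in each group.
count_by_age = Counter(friend_age.values())

# One entry per group: the grouped value plus the aggregation.
groups = [{"age": age, "count": n} for age, n in sorted(count_by_age.items())]
```

Each entry in `groups` pairs a grouped value with its aggregate, mirroring the statement that the result contains the grouped edges and the aggregations.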

If the `groupby` is applied to a `uid` predicate, the resulting aggregations can be saved in a variable (mapping the grouped UIDs to aggregate values) and used elsewhere in the query to extract information other than the grouped or aggregated edges.

Query Example: For Steven Spielberg movies, count the number of movies in each genre and for each of those genres return the genre name and the count. The name can't be extracted in the `groupby` because it is not an aggregate, but `uid(a)` can be used to extract the UIDs from the UID to value map and thus organize the `byGenre` query by genre UID.


{{< runnable >}}
{
  var(func:allofterms(name@en, "steven spielberg")) {
    director.film @groupby(genre) {
      a as count(uid)
      # a is a genre UID to count value variable
    }
  }

  byGenre(func: uid(a), orderdesc: val(a)) {
    name@en
    total_movies : val(a)
  }
}
{{< /runnable >}}

Query Example: Actors from Tim Burton movies and how many roles they have played in Tim Burton movies.
{{< runnable >}}
{
  var(func:allofterms(name@en, "Tim Burton")) {
    director.film {
      starring @groupby(performance.actor) {
        a as count(uid)
        # a is an actor UID to count value variable
      }
    }
  }

  byActor(func: uid(a), orderdesc: val(a)) {
    name@en
    val(a)
  }
}
{{< /runnable >}}



## Expand Predicates

The `expand()` function can be used to expand the predicates out of a node. To
use `expand()`, the [type system]({{< relref "#type-system" >}}) is required.
Refer to the section on the type system to check how to set the types of
nodes. The rest of this section assumes familiarity with that section.

There are two ways to use the `expand` function.

* Predicates can be stored in a variable and passed to `expand()` to expand all
  the predicates in the variable.
* If `_all_` is passed as an argument to `expand()`, the predicates to be
  expanded will be the union of fields in the types assigned to a given node.

The `_all_` keyword requires that the nodes have types. Dgraph will look for all
the types that have been assigned to a node, query the types to check which
attributes they have, and use those to compute the list of predicates to expand.

For example, consider a node that has types `Animal` and `Pet`, which have
the following definitions:

```
type Animal {
    name
    species
    dob
}

type Pet {
    owner
    veterinarian
}
```

When `expand(_all_)` is called on this node, Dgraph will first check which types
the node has (`Animal` and `Pet`). Then it will get the definitions of `Animal`
and `Pet` and build a list of predicates from their type definitions.

```
name
species
dob
owner
veterinarian
```

For `string` predicates, `expand` only returns values not tagged with a language
(see [language preference]({{< relref "#language-support" >}})). So it's often
required to add `name@fr` or `name@.` as well to an expand query.

### Filtering during expand

Expand queries support filters on the type of the outgoing edge. For example,
`expand(_all_) @filter(type(Person))` will expand on all the predicates but will
only include edges whose destination node is of type Person. Since only nodes of
type `uid` can have a type, this query will filter out any scalar values.

Please note that other types of filters and directives are not currently supported
with the expand function. The filter needs to use the `type` function for the
filter to be allowed. Logical `AND` and `OR` operations are allowed.
For example, `expand(_all_) @filter(type(Person) OR type(Animal))` will only expand
the edges that point to nodes of either type.

## Cascade Directive

With the `@cascade` directive, nodes that don't have all predicates specified in the query are removed. This can be useful in cases where some filter was applied or if nodes might not have all listed predicates.


Query Example: Harry Potter movies, with each actor and characters played. With `@cascade`, any character not played by an actor called Warwick is removed, as is any Harry Potter movie without any actors called Warwick. Without `@cascade`, every character is returned, but only those played by actors called Warwick also have the actor name.
{{< runnable >}}
{
  HP(func: allofterms(name@en, "Harry Potter")) @cascade {
    name@en
    starring{
      performance.character {
        name@en
      }
      performance.actor @filter(allofterms(name@en, "Warwick")){
        name@en
      }
    }
  }
}
{{< /runnable >}}

You can apply `@cascade` on inner query blocks as well.
{{< runnable >}}
{
  HP(func: allofterms(name@en, "Harry Potter")) {
    name@en
    genre {
      name@en
    }
    starring @cascade {
      performance.character {
        name@en
      }
      performance.actor @filter(allofterms(name@en, "Warwick")){
        name@en
      }
    }
  }
}
{{< /runnable >}}

## Normalize directive

With the `@normalize` directive, only aliased predicates are returned and the result is flattened to remove nesting.

Query Example: Film name, country and first two actors (by UID order) of every Steven Spielberg movie. The result omits `initial_release_date` because no alias is given, and is flattened by `@normalize`.
{{< runnable >}}
{
  director(func:allofterms(name@en, "steven spielberg")) @normalize {
    director: name@en
    director.film {
      film: name@en
      initial_release_date
      starring(first: 2) {
        performance.actor {
          actor: name@en
        }
        performance.character {
          character: name@en
        }
      }
      country {
        country: name@en
      }
    }
  }
}
{{< /runnable >}}

You can also apply `@normalize` on nested query blocks. It works similarly but only flattens the result of the nested query block where `@normalize` has been applied. `@normalize` returns a list irrespective of the type of attribute on which it is applied.
{{< runnable >}}
{
  director(func:allofterms(name@en, "steven spielberg")) {
    director: name@en
    director.film {
      film: name@en
      initial_release_date
      starring(first: 2) @normalize {
        performance.actor {
          actor: name@en
        }
        performance.character {
          character: name@en
        }
      }
      country {
        country: name@en
      }
    }
  }
}
{{< /runnable >}}


## Ignorereflex directive

The `@ignorereflex` directive forces the removal of child nodes that are reachable from themselves as a parent, through any path in the query result.

Query Example: All the co-actors of Rutger Hauer. Without `@ignorereflex`, the result would also include Rutger Hauer for every movie.

{{< runnable >}}
{
  coactors(func: eq(name@en, "Rutger Hauer")) @ignorereflex {
    actor.film {
      performance.film {
        starring {
          performance.actor {
            name@en
          }
        }
      }
    }
  }
}
{{< /runnable >}}

## Debug

For the purposes of debugging, you can attach a query parameter `debug=true` to a query. Attaching this parameter lets you retrieve the `uid` attribute for all the entities along with the `server_latency` and `start_ts` information under the `extensions` key of the response.

- `parsing_ns`: Latency in nanoseconds to parse the query.
- `processing_ns`: Latency in nanoseconds to process the query.
- `encoding_ns`: Latency in nanoseconds to encode the JSON response.
- `start_ts`: The logical start timestamp of the transaction.

Query with debug as a query parameter
```sh
curl -H "Content-Type: application/graphql+-" http://localhost:8080/query?debug=true -XPOST -d $'{
  tbl(func: allofterms(name@en, "The Big Lebowski")) {
    name@en
  }
}' | python -m json.tool | less
```

Returns `uid` and `server_latency`
```
{
  "data": {
    "tbl": [
      {
        "uid": "0x41434",
        "name@en": "The Big Lebowski"
      },
      {
        "uid": "0x145834",
        "name@en": "The Big Lebowski 2"
      },
      {
        "uid": "0x2c8a40",
        "name@en": "Jeffrey \"The Big\" Lebowski"
      },
      {
        "uid": "0x3454c4",
        "name@en": "The Big Lebowski"
      }
    ],
    "extensions": {
      "server_latency": {
        "parsing_ns": 18559,
        "processing_ns": 802990982,
        "encoding_ns": 1177565
      },
      "txn": {
        "start_ts": 40010
      }
    }
  }
}
```


## Schema

For each predicate, the schema specifies the target's type.  If a predicate `p` has type `T`, then for all subject-predicate-object triples `s p o` the object `o` is of schema type `T`.

* On mutations, scalar types are checked and an error thrown if the value cannot be converted to the schema type.

* On query, value results are returned according to the schema type of the predicate.

If a schema type isn't specified before a mutation adds triples for a predicate, then the type is inferred from the first mutation.  This type is either:

* type `uid`, if the first mutation for the predicate has nodes for the subject and object, or

* derived from the [RDF type]({{< relref "#rdf-types" >}}), if the object is a literal and an RDF type is present in the first mutation, or

* `default` type, otherwise.


### Schema Types

Dgraph supports scalar types and the UID type.

#### Scalar Types

For all triples with a predicate of scalar types the object is a literal.

| Dgraph Type | Go type |
| ------------|:--------|
|  `default`  | string  |
|  `int`      | int64   |
|  `float`    | float   |
|  `string`   | string  |
|  `bool`     | bool    |
|  `dateTime` | time.Time (RFC3339 format [Optional timezone] eg: 2006-01-02T15:04:05.999999999+10:00 or 2006-01-02T15:04:05.999999999) |
|  `geo`      | [go-geom](https://github.com/twpayne/go-geom) |
|  `password` | string (encrypted) |


{{% notice "note" %}}Dgraph supports date and time formats for `dateTime` scalar type only if they
are RFC 3339 compatible which is different from ISO 8601 (as defined in the RDF spec). You should
convert your values to RFC 3339 format before sending them to Dgraph.{{% /notice %}}

#### UID Type

The `uid` type denotes a node-node edge; internally each node is represented as a `uint64` id.

| Dgraph Type | Go type |
| ------------|:--------|
|  `uid`      | uint64  |


### Adding or Modifying Schema

Schema mutations add or modify schema.

Multiple scalar values can also be added for a `S P` by specifying the schema to be of
list type. Occupations in the example below can store a list of strings for each `S P`.

An index is specified with `@index`, with arguments to specify the tokenizer. When specifying an
index for a predicate it is mandatory to specify the type of the index. For example:

```
name: string @index(exact, fulltext) @count .
multiname: string @lang .
age: int @index(int) .
friend: [uid] @count .
dob: dateTime .
location: geo @index(geo) .
occupations: [string] @index(term) .
```

If no data has been stored for the predicates, a schema mutation sets up an empty schema ready to receive triples.

If data is already stored before the mutation, existing values are not checked to conform to the new schema.  On query, Dgraph tries to convert existing values to the new schema types, ignoring any that fail conversion.

If data exists and new indices are specified in a schema mutation, any index not in the updated list is dropped and a new index is created for every new tokenizer specified.

Reverse edges are also computed if specified by a schema mutation.


### Predicate name rules

Any alphanumeric combination of a predicate name is permitted.
Dgraph also supports [Internationalized Resource Identifiers](https://en.wikipedia.org/wiki/Internationalized_Resource_Identifier) (IRIs).
You can read more in [Predicates i18n](#predicates-i18n).

#### Allowed special characters

Single special characters are not accepted, which includes the special characters from IRIs.
They have to be prefixed/suffixed with alphanumeric characters.

```
][&*()_-+=!#$%
```

*Note: You are not restricted from using an `@` suffix, but the suffix character gets ignored.*

#### Forbidden special characters

The special characters below are not accepted.

```
^}|{`\~
```


### Predicates i18n

If your predicate is a URI or has language-specific characters, then enclose
it with angle brackets `<>` when executing the schema mutation.

{{% notice "note" %}}Dgraph supports [Internationalized Resource Identifiers](https://en.wikipedia.org/wiki/Internationalized_Resource_Identifier) (IRIs) for predicate names and values.{{% /notice %}}

Schema syntax:
```
<职业>: string @index(exact) .
<年龄>: int @index(int) .
<地点>: geo @index(geo) .
<公司>: string .
```

This syntax allows for internationalized predicate names, but full-text indexing still defaults to English.
To use the right tokenizer for your language, you need to use the `@lang` directive and enter values using your
language tag.

Schema:
```
<公司>: string @index(fulltext) @lang .
```
Mutation:
```
{
  set {
    _:a <公司> "Dgraph Labs Inc"@en .
    _:b <公司> "夏新科技有限责任公司"@zh .
    _:a <dgraph.type> "Company" .
  }
}
```
Query:
```
{
  q(func: alloftext(<公司>@zh, "夏新科技有限责任公司")) {
    uid
    <公司>@.
  }
}
```


### Upsert directive

To use [upsert operations]({{< relref "howto/index.md#upserts">}}) on a
predicate, specify the `@upsert` directive in the schema. When committing
transactions involving predicates with the `@upsert` directive, Dgraph checks
index keys for conflicts, helping to enforce uniqueness constraints when running
concurrent upserts.

This is how you specify the upsert directive for a predicate.
```
email: string @index(exact) @upsert .
```

### RDF Types

Dgraph supports a number of [RDF types in mutations]({{< relref "mutations/index.md#language-and-rdf-types" >}}).

As well as implying a schema type for a [first mutation]({{< relref "#schema" >}}), an RDF type can override a schema type for storage.

If a predicate has a schema type and a mutation has an RDF type with a different underlying Dgraph type, the convertibility to schema type is checked, and an error is thrown if they are incompatible, but the value is stored in the RDF type's corresponding Dgraph type.  Query results are always returned in schema type.

For example, suppose no schema is set for the `age` predicate.  Given the mutation
```
{
  set {
    _:a <age> "15"^^<xs:int> .
    _:b <age> "13" .
    _:c <age> "14"^^<xs:string> .
    _:d <age> "14.5"^^<xs:string> .
    _:e <age> "14.5" .
  }
}
```
Dgraph:

* sets the schema type to `int`, as implied by the first triple,
* converts `"13"` to `int` on storage,
* checks `"14"` can be converted to `int`, but stores as `string`,
* throws an error for the remaining two triples, because `"14.5"` can't be converted to `int`.

### Extended Types

The following types are also accepted.

#### Password type

A password for an entity is set by setting the schema for the attribute to be of type `password`.  Passwords cannot be queried directly, only checked for a match using the `checkpwd` function.
The passwords are encrypted using [bcrypt](https://en.wikipedia.org/wiki/Bcrypt).

For example: to set a password, first set schema, then the password:
```
pass: password .
```

```
{
  set {
    <0x123> <name> "Password Example" .
    <0x123> <pass> "ThePassword" .
  }
}
```

To check a password:
```
{
  check(func: uid(0x123)) {
    name
    checkpwd(pass, "ThePassword")
  }
}
```

Output:
```
{
  "data": {
    "check": [
      {
        "name": "Password Example",
        "checkpwd(pass)": true
      }
    ]
  }
}
```

You can also use an alias with the password type.

```
{
  check(func: uid(0x123)) {
    name
    secret: checkpwd(pass, "ThePassword")
  }
}
```

Output:
```
{
  "data": {
    "check": [
      {
        "name": "Password Example",
        "secret": true
      }
    ]
  }
}
```

### Indexing

{{% notice "note" %}}Filtering on a predicate by applying a [function]({{< relref "#functions" >}}) requires an index.{{% /notice %}}

When filtering by applying a function, Dgraph uses the index to make the search through a potentially large dataset efficient.

All scalar types can be indexed.

Types `int`, `float`, `bool` and `geo` have only a default index each: with tokenizers named `int`, `float`, `bool` and `geo`.

Types `string` and `dateTime` have a number of indices.

#### String Indices
The indices available for strings are as follows.

| Dgraph function            | Required index / tokenizer             | Notes |
| :-----------------------   | :------------                          | :---  |
| `eq`                       | `hash`, `exact`, `term`, or `fulltext` | The most performant index for `eq` is `hash`. Only use `term` or `fulltext` if you also require term or full-text search. If you're already using `term`, there is no need to use `hash` or `exact` as well. |
| `le`, `ge`, `lt`, `gt`     | `exact`                                | Allows faster sorting. |
| `allofterms`, `anyofterms` | `term`                                 | Allows searching by a term in a sentence. |
| `alloftext`, `anyoftext`   | `fulltext`                             | Matching with language specific stemming and stopwords. |
| `regexp`                   | `trigram`                              | Regular expression matching. Can also be used for equality checking. |

{{% notice "warning" %}}
Incorrect index choice can impose performance penalties and an increased
transaction conflict rate. Use only the minimum number of and simplest indexes
that your application needs.
{{% /notice %}}


#### DateTime Indices

The indices available for `dateTime` are as follows.

| Index name / Tokenizer | Part of date indexed               |
| :-----------           | :--------------------------------- |
| `year`                 | index on year (default)            |
| `month`                | index on year and month            |
| `day`                  | index on year, month and day       |
| `hour`                 | index on year, month, day and hour |

The choices of `dateTime` index allow selecting the precision of the index.  Applications, such as the movies examples in these docs, that require searching over dates but have relatively few nodes per year may prefer the `year` tokenizer; applications that are dependent on fine grained date searches, such as real-time sensor readings, may prefer the `hour` index.


All the `dateTime` indices are sortable.


#### Sortable Indices

Not all the indices establish a total order among the values that they index. Sortable indices allow inequality functions and sorting.

* Indexes `int` and `float` are sortable.
* `string` index `exact` is sortable.
* All `dateTime` indices are sortable.

For example, given an edge `name` of `string` type, to sort by `name` or perform inequality filtering on names, the `exact` index must have been specified.  In which case a schema query would return at least the following tokenizers.

```
{
  "predicate": "name",
  "type": "string",
  "index": true,
  "tokenizer": [
    "exact"
  ]
}
```

#### Count index

For predicates with the `@count` directive, Dgraph indexes the number of edges out of each node.  This enables fast queries of the form:
```
{
  q(func: gt(count(pred), threshold)) {
    ...
  }
}
```

### List Type

Predicates with scalar types can also store a list of values if specified in the schema. The scalar
type needs to be enclosed within `[]` to indicate that it's a list type. These lists are like an
unordered set.

```
occupations: [string] .
score: [int] .
```

* A set operation adds to the list of values. The order of the stored values is non-deterministic.
* A delete operation deletes the value from the list.
* Querying for these predicates would return the list in an array.
* Indexes can be applied on predicates which have a list type and you can use [Functions]({{<ref "#functions">}}) on them.
* Sorting is not allowed using these predicates.


### Reverse Edges

A graph edge is unidirectional. For node-node edges, sometimes modeling requires reverse edges.  If only some subject-predicate-object triples have a reverse, these must be manually added.  But if a predicate always has a reverse, Dgraph computes the reverse edges if `@reverse` is specified in the schema.

The reverse edge of `anEdge` is `~anEdge`.

For existing data, Dgraph computes all reverse edges.  For data added after the schema mutation, Dgraph computes and stores the reverse edge for each added triple.

### Querying Schema

A schema query queries for the whole schema:

```
schema {}
```

{{% notice "note" %}} Unlike regular queries, the schema query is not surrounded
by curly braces. Also, schema queries and regular queries cannot be combined.
{{% /notice %}}

You can query for particular schema fields in the query body.

```
schema {
  type
  index
  reverse
  tokenizer
  list
  count
  upsert
  lang
}
```

You can also query for particular predicates:

```
schema(pred: [name, friend]) {
  type
  index
  reverse
  tokenizer
  list
  count
  upsert
  lang
}
```

Types can also be queried. Below are some example queries.

```
schema(type: Movie) {}
schema(type: [Person, Animal]) {}
```

Note that type queries do not contain anything between the curly braces. The
output will be the entire definition of the requested types.

## Type System

Dgraph supports a type system that can be used to categorize nodes and query
them based on their type. The type system is also used during expand queries.

### Type definition

Types are defined using a GraphQL-like syntax. For example:

```
type Student {
  name
  dob
  home_address
  year
  friends
}
```

Types are declared along with the schema using the Alter endpoint. In order to
properly support the above type, a predicate for each of the attributes
in the type is also needed, such as:

```
name: string @index(term) .
dob: datetime .
home_address: string .
year: int .
friends: [uid] .
```

Reverse predicates can also be included inside a type definition. For example, the type above
could be expanded to include the parent of the student if there's a predicate `children` with
a reverse edge (the brackets around the predicate name are needed to properly understand the
special character `~`).

```
children: [uid] @reverse .

type Student {
  name
  dob
  home_address
  year
  friends
  <~children>
}
```

Edges can be used in multiple types: for example, `name` might be used for both
a person and a pet. Sometimes, however, it's required to use a different
predicate for each type to represent a similar concept. For example, if student
names and book names required different indexes, then the predicates must be
different.

```
type Student {
  student_name
}

type Textbook {
  textbook_name
}

student_name: string @index(exact) .
textbook_name: string @lang @index(fulltext) .
```

Altering the schema for a type that already exists overwrites the existing
definition.

### Setting the type of a node

Scalar nodes cannot have types since they only have one attribute and its type
is the type of the node. UID nodes can have a type. The type is set by setting
the value of the `dgraph.type` predicate for that node. A node can have multiple
types. Here's an example of how to set the types of a node:

```
{
  set {
    _:a <name> "Garfield" .
    _:a <dgraph.type> "Pet" .
    _:a <dgraph.type> "Animal" .
  }
}
```

`dgraph.type` is a reserved predicate and cannot be removed or modified.

### Using types during queries

Types can be used as a top level function in the query language. For example:

```
{
  q(func: type(Animal)) {
    uid
    name
  }
}
```

This query will only return nodes whose type is set to `Animal`.

Types can also be used to filter results inside a query.
For example:

```
{
  q(func: has(parent)) {
    uid
    parent @filter(type(Person)) {
      uid
      name
    }
  }
}
```

This query will return the nodes that have a `parent` predicate and, for each of them, only the
`parent` nodes of type `Person`.

### Deleting a type

Type definitions can be deleted using the Alter endpoint. All that is needed is
to send an operation object with the field `DropOp` (or `drop_op` depending on
the client) set to the enum value `TYPE` and the field `DropValue` (or `drop_value`)
set to the type that is meant to be deleted.

Below is an example deleting the type `Person` using the Go client:
```go
err := c.Alter(context.Background(), &api.Operation{
	DropOp:    api.Operation_TYPE,
	DropValue: "Person",
})
```

### Expand queries and types

Queries using [expand]({{< relref "#expand-predicates" >}}) (i.e.:
`expand(_all_)`) require that the nodes to be expanded have types.

## Facets : Edge attributes

Dgraph supports facets --- **key value pairs on edges** --- as an extension to RDF triples. That is, facets add properties to edges, rather than to nodes.
For example, a `friend` edge between two nodes may have a boolean property of `close` friendship.
Facets can also be used as `weights` for edges.

Though you may find yourself leaning towards facets many times, they should not be misused.  It wouldn't be correct modeling to give the `friend` edge a facet `date_of_birth`. That should be an edge for the friend.  However, a facet like `start_of_friendship` might be appropriate.  Unlike predicates, facets are not first-class citizens in Dgraph.

Facet keys are strings and values can be `string`, `bool`, `int`, `float` and `dateTime`.
For `int` and `float`, only 32-bit signed integers and 64-bit floats are accepted.

The following mutation is used throughout this section on facets.
The mutation adds data for some people and, for example, records a `since` facet in `mobile` and `car` to record when Alice bought the car and started using the mobile number.

First we add some schema.
```sh
curl localhost:8080/alter -XPOST -d $'
    name: string @index(exact, term) .
    rated: [uid] @reverse @count .
' | python -m json.tool | less

```

```sh
curl -H "Content-Type: application/rdf" localhost:8080/mutate?commitNow=true -XPOST -d $'
{
  set {

    # -- Facets on scalar predicates
    _:alice <name> "Alice" .
    _:alice <dgraph.type> "Person" .
    _:alice <mobile> "040123456" (since=2006-01-02T15:04:05) .
    _:alice <car> "MA0123" (since=2006-02-02T13:01:09, first=true) .

    _:bob <name> "Bob" .
    _:bob <dgraph.type> "Person" .
    _:bob <car> "MA0134" (since=2006-02-02T13:01:09) .

    _:charlie <name> "Charlie" .
    _:charlie <dgraph.type> "Person" .
    _:dave <name> "Dave" .
    _:dave <dgraph.type> "Person" .


    # -- Facets on UID predicates
    _:alice <friend> _:bob (close=true, relative=false) .
    _:alice <friend> _:charlie (close=false, relative=true) .
    _:alice <friend> _:dave (close=true, relative=true) .


    # -- Facets for variable propagation
    _:movie1 <name> "Movie 1" .
    _:movie1 <dgraph.type> "Movie" .
    _:movie2 <name> "Movie 2" .
    _:movie2 <dgraph.type> "Movie" .
    _:movie3 <name> "Movie 3" .
    _:movie3 <dgraph.type> "Movie" .

    _:alice <rated> _:movie1 (rating=3) .
    _:alice <rated> _:movie2 (rating=2) .
    _:alice <rated> _:movie3 (rating=5) .

    _:bob <rated> _:movie1 (rating=5) .
    _:bob <rated> _:movie2 (rating=5) .
    _:bob <rated> _:movie3 (rating=5) .

    _:charlie <rated> _:movie1 (rating=2) .
    _:charlie <rated> _:movie2 (rating=5) .
    _:charlie <rated> _:movie3 (rating=1) .
  }
}' | python -m json.tool | less
```

### Facets on scalar predicates


Querying `name`, `mobile` and `car` of Alice gives the same result as without facets.

{{< runnable >}}
{
  data(func: eq(name, "Alice")) {
     name
     mobile
     car
  }
}
{{</ runnable >}}


The syntax `@facets(facet-name)` is used to query facet data. For Alice, the `since` facets on `mobile` and `car` are queried as follows.

{{< runnable >}}
{
  data(func: eq(name, "Alice")) {
     name
     mobile @facets(since)
     car @facets(since)
  }
}
{{</ runnable >}}


Facets are returned at the same level as the corresponding edge and have keys like edge|facet.

All facets on an edge are queried with `@facets`.

{{< runnable >}}
{
  data(func: eq(name, "Alice")) {
     name
     mobile @facets
     car @facets
  }
}
{{</ runnable >}}

### Facets i18n

Facet keys and values can use language-specific characters directly when mutating. But facet keys need to be enclosed in angle brackets `<>` when querying. This is similar to predicates. See [Predicates i18n](#predicates-i18n) for more info.

{{% notice "note" %}}Dgraph supports [Internationalized Resource Identifiers](https://en.wikipedia.org/wiki/Internationalized_Resource_Identifier) (IRIs) for facet keys when querying.{{% /notice  %}}

Example:
```
{
  set {
    _:person1 <name> "Daniel" (वंश="स्पेनी", ancestry="Español") .
    _:person1 <dgraph.type> "Person" .
    _:person2 <name> "Raj" (वंश="हिंदी", ancestry="हिंदी") .
    _:person2 <dgraph.type> "Person" .
    _:person3 <name> "Zhang Wei" (वंश="चीनी", ancestry="中文") .
    _:person3 <dgraph.type> "Person" .
  }
}
```
Query, notice the `<>`'s:
```
{
  q(func: has(name)) {
    name @facets(<वंश>)
  }
}
```

### Alias with facets

An alias can be specified while requesting specific facets. The syntax is similar to how you would request an
alias for other predicates. `orderasc` and `orderdesc` are not allowed as aliases because they have special
meaning. Apart from that, anything else can be set as an alias.

Here we set the aliases `car_since` and `close_friend` for the `since` and `close` facets respectively.
{{< runnable >}}
{
   data(func: eq(name, "Alice")) {
     name
     mobile
     car @facets(car_since: since)
     friend @facets(close_friend: close) {
       name
     }
   }
}
{{</ runnable >}}



### Facets on UID predicates

Facets on UID edges work similarly to facets on value edges.

For example, `friend` is an edge with facet `close`.
It was set to true for friendship between Alice and Bob
and false for friendship between Alice and Charlie.

A query for friends of Alice.

{{< runnable >}}
{
  data(func: eq(name, "Alice")) {
    name
    friend {
      name
    }
  }
}
{{</ runnable >}}

A query for friends and the facet `close` with `@facets(close)`.

{{< runnable >}}
{
   data(func: eq(name, "Alice")) {
     name
     friend @facets(close) {
       name
     }
   }
}
{{</ runnable >}}


For uid edges like `friend`, facets go to the corresponding child under the key edge|facet. In the above
example you can see that the `close` facet on the edge between Alice and Bob appears with the key `friend|close`
along with Bob's results.
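
For client code, the edge|facet key convention means a facet can be read like any other JSON field. The Go sketch below is only an illustration under stated assumptions: the `Friend` and `Person` types, the `closeFriends` helper and the `sample` payload are made up for this example (the payload is hand-written in the shape described above, not captured server output), and the only library used is the standard `encoding/json` package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Friend mirrors one child object of the `friend` edge. The facet arrives
// as a sibling key named "edge|facet", so the struct tag spells it out.
type Friend struct {
	Name  string `json:"name"`
	Close *bool  `json:"friend|close"` // nil when the facet is absent
}

// Person is the node at the query root (hypothetical type for this sketch).
type Person struct {
	Name    string   `json:"name"`
	Friends []Friend `json:"friend"`
}

// sample is a hand-written payload for a query block named `data`;
// the values follow the mutation used throughout this section.
const sample = `{
  "data": [
    {
      "name": "Alice",
      "friend": [
        {"name": "Bob", "friend|close": true},
        {"name": "Charlie", "friend|close": false},
        {"name": "Dave", "friend|close": true}
      ]
    }
  ]
}`

// closeFriends returns the names of friends whose `close` facet is true.
func closeFriends(raw string) ([]string, error) {
	var resp struct {
		Data []Person `json:"data"`
	}
	if err := json.Unmarshal([]byte(raw), &resp); err != nil {
		return nil, err
	}
	var names []string
	for _, p := range resp.Data {
		for _, f := range p.Friends {
			if f.Close != nil && *f.Close {
				names = append(names, f.Name)
			}
		}
	}
	return names, nil
}

func main() {
	names, err := closeFriends(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```

Using a pointer for the facet field keeps "facet missing" distinct from "facet set to false", which matters for edges that carry no facets at all.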

{{< runnable >}}
{
  data(func: eq(name, "Alice")) {
    name
    friend @facets {
      name
      car @facets
    }
  }
}
{{</ runnable >}}

Bob has a `car` and it has a facet `since`, which, in the results, is part of the same object as Bob
under the key car|since.
Also, the `close` relationship between Bob and Alice is part of Bob's output object.
Charlie does not have a `car` edge and thus has only UID facets.

### Filtering on facets

Dgraph supports filtering edges based on facets.
Filtering works similarly to how it works on edges without facets and has the same available functions.


Find Alice's close friends
{{< runnable >}}
{
  data(func: eq(name, "Alice")) {
    friend @facets(eq(close, true)) {
      name
    }
  }
}
{{</ runnable >}}


To return facets as well as filter, add another `@facets(<facetname>)` to the query.

{{< runnable >}}
{
  data(func: eq(name, "Alice")) {
    friend @facets(eq(close, true)) @facets(relative) { # filter close friends and give relative status
      name
    }
  }
}
{{</ runnable >}}


Facet queries can be composed with `AND`, `OR` and `NOT`.

{{< runnable >}}
{
  data(func: eq(name, "Alice")) {
    friend @facets(eq(close, true) AND eq(relative, true)) @facets(relative) { # filter close friends in my relation
      name
    }
  }
}
{{</ runnable >}}


### Sorting using facets

Sorting is possible for a facet on a uid edge. Here we sort the movies rated by Alice, Bob and
Charlie by their `rating` which is a facet.

{{< runnable >}}
{
  me(func: anyofterms(name, "Alice Bob Charlie")) {
    name
    rated @facets(orderdesc: rating) {
      name
    }
  }
}
{{</ runnable >}}



### Assigning Facet values to a variable

Facets on UID edges can be stored in [value variables]({{< relref "#value-variables" >}}).  The variable is a map from the edge target to the facet value.

Alice's friends reported by variables for `close` and `relative`.
{{< runnable >}}
{
  var(func: eq(name, "Alice")) {
    friend @facets(a as close, b as relative)
  }

  friend(func: uid(a)) {
    name
    val(a)
  }

  relative(func: uid(b)) {
    name
    val(b)
  }
}
{{</ runnable >}}


### Facets and Variable Propagation

Facet values of `int` and `float` can be assigned to variables and thus the [values propagate]({{< relref "#variable-propagation" >}}).


Alice, Bob and Charlie each rated every movie.  A value variable on facet `rating` maps movies to ratings.  A query that reaches a movie through multiple paths sums the ratings on each path.  The following sums Alice, Bob and Charlie's ratings for the three movies.

{{<runnable >}}
{
  var(func: anyofterms(name, "Alice Bob Charlie")) {
    num_raters as math(1)
    rated @facets(r as rating) {
      total_rating as math(r) # sum of the 3 ratings
      average_rating as math(total_rating / num_raters)
    }
  }
  data(func: uid(total_rating)) {
    name
    val(total_rating)
    val(average_rating)
  }

}
{{</ runnable >}}



### Facets and Aggregation

Facet values assigned to value variables can be aggregated.
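
The numbers behind the propagation example above, and behind the aggregation queries in this section, can be checked by hand from the sample ratings. The Go sketch below is not Dgraph code: it assumes, following the description in this document, that a value variable on a facet maps each movie to the sum of the ratings on every edge reaching it. The `ratings` table copies the mutation data; `sumPerMovie` and `mean` are hypothetical helpers named for this illustration only.

```go
package main

import "fmt"

// ratings copies the `rating` facets from the mutation in this section:
// user -> movie -> rating.
var ratings = map[string]map[string]float64{
	"Alice":   {"Movie 1": 3, "Movie 2": 2, "Movie 3": 5},
	"Bob":     {"Movie 1": 5, "Movie 2": 5, "Movie 3": 5},
	"Charlie": {"Movie 1": 2, "Movie 2": 5, "Movie 3": 1},
}

// sumPerMovie models a value variable on the facet: for the given users it
// maps each movie to the sum of ratings over every edge that reaches it.
func sumPerMovie(users ...string) map[string]float64 {
	total := make(map[string]float64)
	for _, u := range users {
		for movie, r := range ratings[u] {
			total[movie] += r
		}
	}
	return total
}

// mean averages the values of the map, one entry per movie.
func mean(m map[string]float64) float64 {
	var sum float64
	for _, v := range m {
		sum += v
	}
	return sum / float64(len(m))
}

func main() {
	// Per-movie totals over all three raters: 10, 12 and 11.
	fmt.Println(sumPerMovie("Alice", "Bob", "Charlie"))
	// One user: the mean of Alice's own ratings, 10/3.
	fmt.Printf("%.2f\n", mean(sumPerMovie("Alice")))
	// Two users: the mean of per-movie sums, 25/3, not a per-user average.
	fmt.Printf("%.2f\n", mean(sumPerMovie("Alice", "Bob")))
}
```

With a single root node the mean is that user's average rating; with two root nodes the per-movie entries are already sums, which is why averaging them gives twice the combined average.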

{{< runnable >}}
{
  data(func: eq(name, "Alice")) {
    name
    rated @facets(r as rating) {
      name
    }
    avg(val(r))
  }
}
{{</ runnable >}}


Note though that `r` is a map from movies to the sum of ratings on edges in the query reaching the movie.  Hence, the following does not correctly calculate the average ratings for Alice and Bob individually --- it calculates 2 times the average of both Alice and Bob's ratings.

{{< runnable >}}

{
  data(func: anyofterms(name, "Alice Bob")) {
    name
    rated @facets(r as rating) {
      name
    }
    avg(val(r))
  }
}
{{</ runnable >}}

Calculating the average ratings of users requires a variable that maps users to the sum of their ratings.

{{< runnable >}}

{
  var(func: has(rated)) {
    num_rated as math(1)
    rated @facets(r as rating) {
      avg_rating as math(r / num_rated)
    }
  }

  data(func: uid(avg_rating)) {
    name
    val(avg_rating)
  }
}
{{</ runnable >}}

## Shortest Path Queries

The shortest path between a source (`from`) node and destination (`to`) node can be found using the keyword `shortest` for the query block name. It requires the source node UID, destination node UID and the predicates (at least one) that have to be considered for traversal. A `shortest` query block returns the shortest path under `_path_` in the query response. The path can also be stored in a variable which is used in other query blocks.

**K-Shortest Path queries:** By default the shortest path is returned. With `numpaths: k`, and `k > 1`, the k-shortest paths are returned. Cyclical paths are pruned out from the result of k-shortest path query. With `depth: n`, the paths up to `n` depth away are returned.
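
To see why edge weights can change which path counts as shortest, the following Go sketch runs a plain Dijkstra search over the four-person sample graph used later in this section. It is a teaching sketch, not Dgraph's implementation: the `edge` type, the `graph` map and the `shortest` function are hypothetical names, and the assumption that an edge without a facet costs 1 is taken from the note in this section.

```go
package main

import (
	"fmt"
	"math"
)

// edge is a directed `friend` edge together with its `weight` facet.
type edge struct {
	to     string
	weight float64
}

// graph mirrors the sample data used later in this section.
var graph = map[string][]edge{
	"Alice": {{"Bob", 0.1}, {"Mallory", 1}},
	"Bob":   {{"Tom", 0.2}},
	"Tom":   {{"Mallory", 0.3}},
}

// shortest runs a simple O(V^2) Dijkstra search from src to dst. When
// useWeights is false every edge costs 1, as in a query without a facet.
func shortest(src, dst string, useWeights bool) ([]string, float64) {
	dist := map[string]float64{src: 0}
	prev := map[string]string{}
	done := map[string]bool{}
	for {
		// Pick the closest node that has not been settled yet.
		cur, best := "", math.Inf(1)
		for n, d := range dist {
			if !done[n] && d < best {
				cur, best = n, d
			}
		}
		if cur == "" {
			return nil, math.Inf(1) // dst is unreachable
		}
		if cur == dst {
			break
		}
		done[cur] = true
		// Relax every outgoing edge of the settled node.
		for _, e := range graph[cur] {
			w := 1.0
			if useWeights {
				w = e.weight
			}
			if d, ok := dist[e.to]; !ok || best+w < d {
				dist[e.to] = best + w
				prev[e.to] = cur
			}
		}
	}
	// Walk the predecessors back from dst to rebuild the path.
	path := []string{dst}
	for n := dst; n != src; {
		n = prev[n]
		path = append([]string{n}, path...)
	}
	return path, dist[dst]
}

func main() {
	fmt.Println(shortest("Alice", "Mallory", false)) // direct edge, cost 1
	fmt.Println(shortest("Alice", "Mallory", true))  // via Bob and Tom, cost about 0.6
}
```

Without weights the direct Alice to Mallory edge wins with a single hop; with the `weight` facet the three-hop route through Bob and Tom is cheaper than the direct edge of weight 1.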

{{% notice "note" %}}
- If no predicates are specified in the `shortest` block, no path can be fetched as no edge is traversed.
- If you're seeing queries take a long time, you can set a [gRPC deadline](https://grpc.io/blog/deadlines) to stop the query after a certain amount of time.
{{% /notice %}}

For example:

```sh
curl localhost:8080/alter -XPOST -d $'
    name: string @index(exact) .
' | python -m json.tool | less
```

```sh
curl -H "Content-Type: application/rdf" localhost:8080/mutate?commitNow=true -XPOST -d $'
{
  set {
    _:a <friend> _:b (weight=0.1) .
    _:b <friend> _:c (weight=0.2) .
    _:c <friend> _:d (weight=0.3) .
    _:a <friend> _:d (weight=1) .
    _:a <name> "Alice" .
    _:a <dgraph.type> "Person" .
    _:b <name> "Bob" .
    _:b <dgraph.type> "Person" .
    _:c <name> "Tom" .
    _:c <dgraph.type> "Person" .
    _:d <name> "Mallory" .
    _:d <dgraph.type> "Person" .
  }
}' | python -m json.tool | less
```

The shortest path between Alice and Mallory (assuming UIDs 0x2 and 0x5 respectively) can be found with query:

```sh
curl -H "Content-Type: application/graphql+-" localhost:8080/query -XPOST -d $'{
 path as shortest(from: 0x2, to: 0x5) {
  friend
 }
 path(func: uid(path)) {
   name
 }
}' | python -m json.tool | less
```

Which returns the following results. (Note: without the `weight` facet, each edge's weight is considered to be 1.)

```
{
  "data": {
    "path": [
      {
        "name": "Alice"
      },
      {
        "name": "Mallory"
      }
    ],
    "_path_": [
      {
        "uid": "0x2",
        "friend": [
          {
            "uid": "0x5"
          }
        ]
      }
    ]
  }
}
```

We can return more paths by specifying `numpaths`.
Setting `numpaths: 2` returns the shortest two paths:

```sh
curl -H "Content-Type: application/graphql+-" localhost:8080/query -XPOST -d $'{

 A as var(func: eq(name, "Alice"))
 M as var(func: eq(name, "Mallory"))

 path as shortest(from: uid(A), to: uid(M), numpaths: 2) {
  friend
 }
 path(func: uid(path)) {
   name
 }
}' | python -m json.tool | less
```

{{% notice "note" %}}In the query above, instead of using UID literals, we query both people using var blocks and the `uid()` function. You can also combine it with [GraphQL Variables]({{< relref "#graphql-variables" >}}).{{% /notice %}}

Edge weights are included by using facets on the edges as follows.

{{% notice "note" %}}Only one facet per predicate is allowed in the shortest query block.{{% /notice %}}

```sh
curl -H "Content-Type: application/graphql+-" localhost:8080/query -XPOST -d $'{
 path as shortest(from: 0x2, to: 0x5) {
  friend @facets(weight)
 }

 path(func: uid(path)) {
  name
 }
}' | python -m json.tool | less
```

```
{
  "data": {
    "path": [
      {
        "name": "Alice"
      },
      {
        "name": "Bob"
      },
      {
        "name": "Tom"
      },
      {
        "name": "Mallory"
      }
    ],
    "_path_": [
      {
        "uid": "0x2",
        "friend": [
          {
            "uid": "0x3",
            "friend|weight": 0.1,
            "friend": [
              {
                "uid": "0x4",
                "friend|weight": 0.2,
                "friend": [
                  {
                    "uid": "0x5",
                    "friend|weight": 0.3
                  }
                ]
              }
            ]
          }
        ]
      }
    ]
  }
}
```

Constraints can be applied to the intermediate nodes as follows.

```sh
curl -H "Content-Type: application/graphql+-" localhost:8080/query -XPOST -d $'{
  path as shortest(from: 0x2, to: 0x5) {
    friend @filter(not eq(name, "Bob")) @facets(weight)
    relative @facets(liking)
  }

  relationship(func: uid(path)) {
    name
  }
}' | python -m json.tool | less
```

The k-shortest path algorithm (used when `numpaths` > 1) also accepts the arguments `minweight` and `maxweight`, which take a float as their value. When they are passed, only paths within the weight range `[minweight, maxweight]` will be considered as valid paths. This can be used, for example, to query the shortest paths that traverse between 2 and 4 nodes.

```sh
curl -H "Content-Type: application/graphql+-" localhost:8080/query -XPOST -d $'{
 path as shortest(from: 0x2, to: 0x5, numpaths: 2, minweight: 2, maxweight: 4) {
  friend
 }
 path(func: uid(path)) {
   name
 }
}' | python -m json.tool | less
```

Some points to keep in mind for shortest path queries:

- Weights must be non-negative. Dijkstra's algorithm is used to calculate the shortest paths.
- Only one facet per predicate in the shortest query block is allowed.
- Only one `shortest` path block is allowed per query. Only one `_path_` is returned in the result. For queries with `numpaths` > 1, `_path_` contains all the paths.
- Cyclical paths are not included in the result of k-shortest path query.
- For k-shortest paths (when `numpaths` > 1), the result of the shortest path query variable will only return a single path which will be the shortest path among the k paths. All k paths are returned in `_path_`.

## Recurse Query

`Recurse` queries let you traverse a set of predicates (with filter, facets, etc.) until we reach all leaf nodes or we reach the maximum depth which is specified by the `depth` parameter.

To get 10 movies from a genre that has more than 30000 films and then get two actors for those movies, we'd do something as follows:
{{< runnable >}}
{
  me(func: gt(count(~genre), 30000), first: 1) @recurse(depth: 5, loop: true) {
    name@en
    ~genre (first:10) @filter(gt(count(starring), 2))
    starring (first: 2)
    performance.actor
  }
}
{{< /runnable >}}
Some points to keep in mind while using recurse queries are:

- You can specify only one level of predicates after root. These would be traversed recursively. Both scalar and entity-nodes are treated similarly.
- Only one recurse block is advised per query.
- Be careful as the result size could explode quickly and an error would be returned if the result set gets too large. In such cases use more filters, limit results using pagination, or provide a depth parameter at root as shown in the example above.
- The `loop` parameter can be set to false, in which case paths which lead to a loop would be ignored
  while traversing.
- If not specified, the value of the `loop` parameter defaults to false.
- If the value of the `loop` parameter is false and depth is not specified, `depth` will default to `math.MaxUint64`, which means that the entire graph might be traversed until all the leaf nodes are reached.


## Fragments

The `fragment` keyword allows you to define new fragments that can be referenced in a query, as per the [GraphQL specification](https://facebook.github.io/graphql/#sec-Language.Fragments). The point is that if there are multiple parts which query the same set of fields, you can define a fragment and refer to it multiple times instead. Fragments can be nested inside fragments, but no cycles are allowed. Here is one contrived example.

```sh
curl -H "Content-Type: application/graphql+-" localhost:8080/query -XPOST -d $'
query {
  debug(func: uid(1)) {
    name@en
    ...TestFrag
  }
}
fragment TestFrag {
  initial_release_date
  ...TestFragB
}
fragment TestFragB {
  country
}' | python -m json.tool | less
```

## GraphQL Variables

`Variables` can be defined and used in queries, which helps with query reuse and avoids costly string building in clients at runtime by passing a separate variable map. A variable starts with a `$` symbol.
For **HTTP requests** with GraphQL Variables, we must use the `Content-Type: application/json` header and pass the data as a JSON object containing `query` and `variables`.

```sh
curl -H "Content-Type: application/json" localhost:8080/query -XPOST -d $'{
  "query": "query test($a: string) { test(func: eq(name, $a)) { \n uid \n name \n } }",
  "variables": { "$a": "Alice" }
}' | python -m json.tool | less
```

{{< runnable vars="{\"$a\": \"5\", \"$b\": \"10\", \"$name\": \"Steven Spielberg\"}" >}}
query test($a: int, $b: int, $name: string) {
  me(func: allofterms(name@en, $name)) {
    name@en
    director.film (first: $a, offset: $b) {
      name @en
      genre(first: $a) {
        name@en
      }
    }
  }
}
{{< /runnable >}}

* Variables can have default values. In the example below, `$a` has a default value of `2`. Since the value for `$a` isn't provided in the variable map, `$a` takes on the default value.
* Variables whose type is suffixed with a `!` can't have a default value but must have a value as part of the variables map.
* The value of the variable must be parsable to the given type; if not, an error is thrown.
* The variable types that are supported as of now are: `int`, `float`, `bool` and `string`.
* Any variable that is being used must be declared in the named query clause in the beginning.

{{< runnable vars="{\"$b\": \"10\", \"$name\": \"Steven Spielberg\"}" >}}
query test($a: int = 2, $b: int!, $name: string) {
  me(func: allofterms(name@en, $name)) {
    director.film (first: $a, offset: $b) {
      genre(first: $a) {
        name@en
      }
    }
  }
}
{{< /runnable >}}

You can also use arrays with GraphQL Variables.

{{< runnable vars="{\"$b\": \"10\", \"$aName\": \"Steven Spielberg\", \"$bName\": \"Quentin Tarantino\"}" >}}
query test($a: int = 2, $b: int!, $aName: string, $bName: string) {
  me(func: eq(name@en, [$aName, $bName])) {
    director.film (first: $a, offset: $b) {
      genre(first: $a) {
        name@en
      }
    }
  }
}
{{< /runnable >}}

Variable substitution is also supported in facets.
{{< runnable vars="{\"$name\": \"Alice\"}" >}}
query test($name: string = "Alice") {
  data(func: eq(name, $name)) {
    friend @facets(eq(close, true)) {
      name
    }
  }
}
{{</ runnable >}}

{{% notice "note" %}}
If you want to input a list of uids as a GraphQL variable value, you can have the variable as string type and
have the value surrounded by square brackets like `["13", "14"]`.
{{% /notice %}}

## Indexing with Custom Tokenizers

Dgraph comes with a large toolkit of builtin indexes, but sometimes for niche
use cases they're not always enough.

Dgraph allows you to implement custom tokenizers via a plugin system in order
to fill the gaps.

### Caveats

The plugin system uses Go's [`pkg/plugin`](https://golang.org/pkg/plugin/).
This brings some restrictions to how plugins can be used.

- Plugins must be written in Go.

- As of Go 1.9, `pkg/plugin` only works on Linux.
  Therefore, plugins will only
  work on Dgraph instances deployed in a Linux environment.

- The version of Go used to compile the plugin should be the same as the version
  of Go used to compile Dgraph itself. Dgraph always uses the latest version of
  Go (and so should you!).

### Implementing a plugin

{{% notice "note" %}}
You should consider Go's [plugin](https://golang.org/pkg/plugin/) documentation
to be supplementary to the documentation provided here.
{{% /notice %}}

Plugins are implemented as their own main package. They must export a
particular symbol that allows Dgraph to hook into the custom logic the plugin
provides.

The plugin must export a symbol named `Tokenizer`. The type of the symbol must
be `func() interface{}`. When the function is called the result returned should
be a value that implements the following interface:

```
type PluginTokenizer interface {
	// Name is the name of the tokenizer. It should be unique among all
	// builtin tokenizers and other custom tokenizers. It identifies the
	// tokenizer when an index is set in the schema and when search/filter
	// is used in queries.
	Name() string

	// Identifier is a byte that uniquely identifies the tokenizer.
	// Bytes in the range 0x80 to 0xff (inclusive) are reserved for
	// custom tokenizers.
	Identifier() byte

	// Type is a string representing the type of data that is to be
	// tokenized. This must match the schema type of the predicate
	// being indexed. Allowable values are shown in the table below.
	Type() string

	// Tokens should implement the tokenization logic. The input is
	// the value to be tokenized, and will always have a concrete type
	// corresponding to Type(). The return value should be a list of
	// the tokens generated.
	Tokens(interface{}) ([]string, error)
}
```

The return value of `Type()` corresponds to the concrete input type of
`Tokens(interface{})` in the following way:

 `Type()` return value | `Tokens(interface{})` input type
-----------------------|----------------------------------
 `"int"`               | `int64`
 `"float"`             | `float64`
 `"string"`            | `string`
 `"bool"`              | `bool`
 `"datetime"`          | `time.Time`

### Building the plugin

The plugin has to be built using the `plugin` build mode so that an `.so` file
is produced instead of a regular executable. For example:

```sh
go build -buildmode=plugin -o myplugin.so ~/go/src/myplugin/main.go
```

### Running Dgraph with plugins

When starting Dgraph, use the `--custom_tokenizers` flag to tell Dgraph which
tokenizers to load. It accepts a comma separated list of plugins. E.g.

```sh
dgraph ...other-args... --custom_tokenizers=plugin1.so,plugin2.so
```

{{% notice "note" %}}
Plugin validation is performed on startup. If a problem is detected, Dgraph
will refuse to initialise.
{{% /notice %}}

### Adding the index to the schema

To use a tokenization plugin, an index has to be created in the schema.

The syntax is the same as adding any built-in index. To add a custom index
using a tokenizer plugin named `foo` to a `string` predicate named
`my_predicate`, use the following in the schema:

```sh
my_predicate: string @index(foo) .
```

### Using the index in queries

There are two functions that can use custom indexes:

 Mode   | Behaviour
--------|-------
`anyof` | Returns nodes that match on *any* of the tokens generated
`allof` | Returns nodes that match on *all* of the tokens generated

The functions can be used either at the query root or in filters.
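
To make the any/all distinction concrete, here is a minimal in-memory sketch of an inverted index with both match modes. It is an illustration only, not Dgraph's storage layer: the `index` type, its `add` and `match` methods and the `words` tokenizer are hypothetical names made up for this example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// index maps a token to the set of node IDs whose value produced it.
type index map[string]map[int]bool

// add tokenizes value and records id under every resulting token.
func (ix index) add(id int, value string, tokens func(string) []string) {
	for _, t := range tokens(value) {
		if ix[t] == nil {
			ix[t] = map[int]bool{}
		}
		ix[t][id] = true
	}
}

// match tokenizes the search value and returns the sorted IDs that hit
// any of the tokens (all == false) or every token (all == true).
func (ix index) match(search string, tokens func(string) []string, all bool) []int {
	distinct := map[string]bool{}
	hits := map[int]int{} // id -> number of distinct search tokens matched
	for _, t := range tokens(search) {
		if distinct[t] {
			continue
		}
		distinct[t] = true
		for id := range ix[t] {
			hits[id]++
		}
	}
	var out []int
	for id, n := range hits {
		if !all || n == len(distinct) {
			out = append(out, id)
		}
	}
	sort.Ints(out)
	return out
}

// words is a stand-in tokenizer: one token per whitespace-separated word.
func words(s string) []string { return strings.Fields(s) }

// sample builds a three-node index used by main.
func sample() index {
	ix := index{}
	ix.add(1, "red green", words)
	ix.add(2, "green blue", words)
	ix.add(3, "blue", words)
	return ix
}

func main() {
	ix := sample()
	fmt.Println(ix.match("red blue", words, false))  // any: [1 2 3]
	fmt.Println(ix.match("green blue", words, true)) // all: [2]
}
```

The "any" mode is a union over the per-token ID sets and the "all" mode is an intersection, which is why the same search value can return very different result sizes in the two modes.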

Their behaviour here is analogous to `anyofterms`/`allofterms` and
`anyoftext`/`alloftext`.

### Examples

The following examples should make the process of writing a tokenization plugin
more concrete.

#### Unicode Characters

This example shows the type of tokenization that is similar to term
tokenization of full-text search. Instead of being broken down into terms or
stem words, the text is instead broken down into its constituent unicode
codepoints (in Go terminology these are called *runes*).

{{% notice "note" %}}
This tokenizer would create a very large index that would be expensive to
manage and store. That's one of the reasons that text indexing usually occurs
at a higher level; stem words for full-text search or terms for term search.
{{% /notice %}}

The implementation of the plugin looks like this:

```go
package main

import "encoding/binary"

func Tokenizer() interface{} { return RuneTokenizer{} }

type RuneTokenizer struct{}

func (RuneTokenizer) Name() string     { return "rune" }
func (RuneTokenizer) Type() string     { return "string" }
func (RuneTokenizer) Identifier() byte { return 0xfd }

func (t RuneTokenizer) Tokens(value interface{}) ([]string, error) {
	var toks []string
	for _, r := range value.(string) {
		var buf [binary.MaxVarintLen32]byte
		n := binary.PutVarint(buf[:], int64(r))
		tok := string(buf[:n])
		toks = append(toks, tok)
	}
	return toks, nil
}
```

**Hints and tips:**

- Inside `Tokens`, you can assume that `value` will have concrete type
  corresponding to that specified by `Type()`. It's safe to do a type
  assertion.

- Even though the return value is `[]string`, you can always store non-unicode
  data inside the string.
  See [this blogpost](https://blog.golang.org/strings)
  for some interesting background on how strings are implemented in Go and why they
  can be used to store non-textual data. By storing arbitrary data in the string,
  you can make the index more compact. In this case, varints are stored in the
  return values.

Setting up the indexing and adding data:
```
name: string @index(rune) .
```


```
{
  set{
    _:ad <name> "Adam" .
    _:ad <dgraph.type> "Person" .
    _:aa <name> "Aaron" .
    _:aa <dgraph.type> "Person" .
    _:am <name> "Amy" .
    _:am <dgraph.type> "Person" .
    _:ro <name> "Ronald" .
    _:ro <dgraph.type> "Person" .
  }
}
```
Now queries can be performed.

The only person that has all of the runes `A` and `n` in their `name` is Aaron:
```
{
  q(func: allof(name, rune, "An")) {
    name
  }
}
=>
{
  "data": {
    "q": [
      { "name": "Aaron" }
    ]
  }
}
```
But there are multiple people who have both of the runes `A` and `m`:
```
{
  q(func: allof(name, rune, "Am")) {
    name
  }
}
=>
{
  "data": {
    "q": [
      { "name": "Amy" },
      { "name": "Adam" }
    ]
  }
}
```
Case is taken into account, so if you search for all names containing `"ron"`,
you would find `"Aaron"`, but not `"Ronald"`. But if you were to search for
`"no"`, you would match both `"Aaron"` and `"Ronald"`. The order of the runes in
the strings doesn't matter.

It's possible to search for people that have *any* of the supplied runes in
their names (rather than *all* of the supplied runes).
To do this, use `anyof`
instead of `allof`:
```
{
  q(func: anyof(name, rune, "mr")) {
    name
  }
}
=>
{
  "data": {
    "q": [
      { "name": "Adam" },
      { "name": "Aaron" },
      { "name": "Amy" }
    ]
  }
}
```
`"Ronald"` doesn't contain `m` or `r`, so isn't found by the search.

{{% notice "note" %}}
Understanding what's going on under the hood can help you intuitively
understand how the `Tokens` method should be implemented.

When Dgraph sees new edges that are to be indexed by your tokenizer, it
will tokenize the value. The resultant tokens are used as keys for posting
lists. The edge subject is then added to the posting list for each token.

When a query root search occurs, the search value is tokenized. The result of
the search is all of the nodes in the union or intersection of the corresponding
posting lists (depending on whether `anyof` or `allof` was used).
{{% /notice %}}

#### CIDR Range

Tokenizers don't always have to be about splitting text up into its constituent
parts. This example indexes [IP addresses into their CIDR
ranges](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing). This
allows you to search for all IP addresses that fall into a particular CIDR
range.

The plugin code is more complicated than the rune example. The input is an IP
address stored as a string, e.g. `"100.55.22.11/32"`. The output is the set of CIDR
ranges that the IP address could possibly fall into. There could be up to 32
different outputs (`"100.55.22.11/32"` does indeed have 32 possible ranges, one
for each mask size).

```go
package main

import "net"

func Tokenizer() interface{} { return CIDRTokenizer{} }

type CIDRTokenizer struct{}

func (CIDRTokenizer) Name() string     { return "cidr" }
func (CIDRTokenizer) Type() string     { return "string" }
func (CIDRTokenizer) Identifier() byte { return 0xff }

func (t CIDRTokenizer) Tokens(value interface{}) ([]string, error) {
	_, ipnet, err := net.ParseCIDR(value.(string))
	if err != nil {
		return nil, err
	}
	ones, bits := ipnet.Mask.Size()
	var toks []string
	for i := ones; i >= 1; i-- {
		m := net.CIDRMask(i, bits)
		tok := net.IPNet{
			IP:   ipnet.IP.Mask(m),
			Mask: m,
		}
		toks = append(toks, tok.String())
	}
	return toks, nil
}
```
An example of using the tokenizer:

Setting up the indexing and adding data:
```
ip: string @index(cidr) .

```

```
{
  set{
    _:a <ip> "100.55.22.11/32" .
    _:b <ip> "100.33.81.19/32" .
    _:c <ip> "100.49.21.25/32" .
    _:d <ip> "101.0.0.5/32" .
    _:e <ip> "100.176.2.1/32" .
  }
}
```
```
{
  q(func: allof(ip, cidr, "100.48.0.0/12")) {
    ip
  }
}
=>
{
  "data": {
    "q": [
      { "ip": "100.55.22.11/32" },
      { "ip": "100.49.21.25/32" }
    ]
  }
}
```
The CIDR ranges of `100.55.22.11/32` and `100.49.21.25/32` are both
`100.48.0.0/12`. The other IP addresses in the database aren't included in the
search result, since they have different CIDR ranges for 12 bit masks
(`100.32.0.0/12`, `101.0.0.0/12`, `100.176.0.0/12` for `100.33.81.19/32`,
`101.0.0.5/32`, and `100.176.2.1/32` respectively).

Note that we're using `allof` instead of `anyof`. Only `allof` will work
correctly with this index. Remember that the tokenizer generates all possible
CIDR ranges for an IP address.
If we were to use `anyof` then the search result
would include all IP addresses under the 1 bit mask (in this case, `0.0.0.0/1`,
which would match all IPs in this dataset).

#### Anagram

Tokenizers don't always have to return multiple tokens. If you just want to
index data into groups, have the tokenizer just return an identifying member of
that group.

In this example, we want to find groups of words that are
[anagrams](https://en.wikipedia.org/wiki/Anagram) of each
other.

A token to correspond to a group of anagrams could just be the letters in the
anagram in sorted order, as implemented below:

```go
package main

import "sort"

func Tokenizer() interface{} { return AnagramTokenizer{} }

type AnagramTokenizer struct{}

func (AnagramTokenizer) Name() string     { return "anagram" }
func (AnagramTokenizer) Type() string     { return "string" }
func (AnagramTokenizer) Identifier() byte { return 0xfc }

func (t AnagramTokenizer) Tokens(value interface{}) ([]string, error) {
	b := []byte(value.(string))
	sort.Slice(b, func(i, j int) bool { return b[i] < b[j] })
	return []string{string(b)}, nil
}
```
In action:

Setting up the indexing and adding data:
```
word: string @index(anagram) .
```

```
{
  set{
    _:1 <word> "airmen" .
    _:2 <word> "marine" .
    _:3 <word> "beat" .
    _:4 <word> "beta" .
    _:5 <word> "race" .
    _:6 <word> "care" .
  }
}
```
```
{
  q(func: allof(word, anagram, "remain")) {
    word
  }
}
=>
{
  "data": {
    "q": [
      { "word": "airmen" },
      { "word": "marine" }
    ]
  }
}
```

Since only a single token is ever generated, it doesn't matter if `anyof` or
`allof` is used. The result will always be the same.
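
The anagram tokenizer above sorts the bytes of the string, which is fine for ASCII words like the ones in this example. For text containing multi-byte UTF-8 characters, sorting bytes reorders the individual bytes of each character, so the token is no longer readable text. The sketch below is one possible variation that sorts runes instead; `anagramKey` is a hypothetical helper for local experimentation and is not part of Dgraph or of the plugin above.

```go
package main

import (
	"fmt"
	"sort"
)

// anagramKey returns the runes of s in sorted order, so that all anagrams
// of a word share the same key, including words with non-ASCII letters.
func anagramKey(s string) string {
	r := []rune(s)
	sort.Slice(r, func(i, j int) bool { return r[i] < r[j] })
	return string(r)
}

func main() {
	// All three words are anagrams of each other and share one key.
	fmt.Println(anagramKey("airmen"), anagramKey("marine"), anagramKey("remain")) // aeimnr aeimnr aeimnr
	// Words from different anagram groups get different keys.
	fmt.Println(anagramKey("beat") == anagramKey("race")) // false
}
```

A function like this can be called from `Tokens` in place of the byte sort, and it is easy to exercise on its own before building the plugin.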

#### Integer prime factors

All of the custom tokenizers shown previously have worked with strings.
However, other data types can be used as well. This example is contrived, but
nonetheless shows some advanced usages of custom tokenizers.

The tokenizer creates a token for each prime factor in the input.

```
package main

import (
	"encoding/binary"
	"fmt"
)

func Tokenizer() interface{} { return FactorTokenizer{} }

type FactorTokenizer struct{}

func (FactorTokenizer) Name() string     { return "factor" }
func (FactorTokenizer) Type() string     { return "int" }
func (FactorTokenizer) Identifier() byte { return 0xfe }

func (FactorTokenizer) Tokens(value interface{}) ([]string, error) {
	x := value.(int64)
	if x <= 1 {
		return nil, fmt.Errorf("Cannot factor int <= 1: %d", x)
	}
	var toks []string
	for p := int64(2); x > 1; p++ {
		if x%p == 0 {
			toks = append(toks, encodeInt(p))
			for x%p == 0 {
				x /= p
			}
		}
	}
	return toks, nil

}

func encodeInt(x int64) string {
	var buf [binary.MaxVarintLen64]byte
	n := binary.PutVarint(buf[:], x)
	return string(buf[:n])
}
```
{{% notice "note" %}}
Notice that the return of `Type()` is `"int"`, corresponding to the concrete
type of the input to `Tokens` (which is `int64`).
{{% /notice %}}

This allows you to do things like search for all numbers that share prime
factors with a particular number.

In particular, we search for numbers that contain any of the prime factors of
15, i.e. any numbers that are divisible by either 3 or 5.

Setting up the indexing and adding data:
```
num: int @index(factor) .
```

```
{
  set{
    _:2 <num> "2"^^<xs:int> .
    _:3 <num> "3"^^<xs:int> .
    _:4 <num> "4"^^<xs:int> .
    _:5 <num> "5"^^<xs:int> .
    _:6 <num> "6"^^<xs:int> .
    _:7 <num> "7"^^<xs:int> .
    _:8 <num> "8"^^<xs:int> .
    _:9 <num> "9"^^<xs:int> .
    _:10 <num> "10"^^<xs:int> .
    _:11 <num> "11"^^<xs:int> .
    _:12 <num> "12"^^<xs:int> .
    _:13 <num> "13"^^<xs:int> .
    _:14 <num> "14"^^<xs:int> .
    _:15 <num> "15"^^<xs:int> .
    _:16 <num> "16"^^<xs:int> .
    _:17 <num> "17"^^<xs:int> .
    _:18 <num> "18"^^<xs:int> .
    _:19 <num> "19"^^<xs:int> .
    _:20 <num> "20"^^<xs:int> .
    _:21 <num> "21"^^<xs:int> .
    _:22 <num> "22"^^<xs:int> .
    _:23 <num> "23"^^<xs:int> .
    _:24 <num> "24"^^<xs:int> .
    _:25 <num> "25"^^<xs:int> .
    _:26 <num> "26"^^<xs:int> .
    _:27 <num> "27"^^<xs:int> .
    _:28 <num> "28"^^<xs:int> .
    _:29 <num> "29"^^<xs:int> .
    _:30 <num> "30"^^<xs:int> .
  }
}
```
```
{
  q(func: anyof(num, factor, 15)) {
    num
  }
}
=>
{
  "data": {
    "q": [
      { "num": 3 },
      { "num": 5 },
      { "num": 6 },
      { "num": 9 },
      { "num": 10 },
      { "num": 12 },
      { "num": 15 },
      { "num": 18 },
      { "num": 20 },
      { "num": 21 },
      { "num": 25 },
      { "num": 24 },
      { "num": 27 },
      { "num": 30 }
    ]
  }
}
```
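
The expected result can be checked outside Dgraph with a few lines of Go. The sketch below reuses the trial-division loop from the tokenizer and lists every number from 2 to 30 that shares a prime factor with 15. The helper names `primeFactors`, `sharesFactor` and `matchesUpTo` are hypothetical and exist only for this check.

```go
package main

import "fmt"

// primeFactors returns the distinct prime factors of x (for x > 1),
// using the same trial-division loop as the tokenizer.
func primeFactors(x int64) []int64 {
	var ps []int64
	for p := int64(2); x > 1; p++ {
		if x%p == 0 {
			ps = append(ps, p)
			for x%p == 0 {
				x /= p
			}
		}
	}
	return ps
}

// sharesFactor reports whether a and b have a prime factor in common,
// which is the condition an "any of the tokens" match expresses here.
func sharesFactor(a, b int64) bool {
	seen := map[int64]bool{}
	for _, p := range primeFactors(a) {
		seen[p] = true
	}
	for _, p := range primeFactors(b) {
		if seen[p] {
			return true
		}
	}
	return false
}

// matchesUpTo lists the numbers in [2, limit] sharing a factor with query.
func matchesUpTo(limit, query int64) []int64 {
	var out []int64
	for n := int64(2); n <= limit; n++ {
		if sharesFactor(n, query) {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	fmt.Println(matchesUpTo(30, 15)) // [3 5 6 9 10 12 15 18 20 21 24 25 27 30]
}
```

The list contains the same 14 numbers as the query result above; only the ordering differs, since this sketch emits them in ascending order.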