Dgraph
Go Meetup Sydney
29 Oct 2015
Tags: go golang dgraph graph

Manish R Jain
Backend Developer / Freelancer
Ex-Google - Web Search & Knowledge Infra
manishrjain@gmail.com
@manishrjain

* What is a Graph

- Abstract data type to represent relationships between objects.
- Made up of entities and edges. A directed edge is written in triple format:
		Entity      --attribute-->  Entity/Value
		[Tom Hanks] --married_to->  [Rita Wilson]
		[Tom Hanks] --born_on---->  [July 9, 1956]
- Popular graphs: Facebook Social Graph, Google Knowledge Graph.

.image graph.png
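The triple format above can be sketched in Go as a small struct. This is only an illustration: the `Triple` type, its field names, and the `EdgesFrom` helper are assumptions made for this example and are not part of Dgraph.

```go
package main

import "fmt"

// Triple is a hypothetical type for one directed edge:
// Entity --attribute--> Entity/Value.
type Triple struct {
	Subject   string // source entity
	Attribute string // edge label
	Object    string // target entity or literal value
}

// EdgesFrom returns every triple whose subject is the given entity,
// in the order they appear in the input.
func EdgesFrom(triples []Triple, subject string) []Triple {
	var out []Triple
	for _, t := range triples {
		if t.Subject == subject {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	graph := []Triple{
		{"Tom Hanks", "married_to", "Rita Wilson"},
		{"Tom Hanks", "born_on", "July 9, 1956"},
	}
	for _, t := range EdgesFrom(graph, "Tom Hanks") {
		fmt.Printf("[%s] --%s--> [%s]\n", t.Subject, t.Attribute, t.Object)
	}
}
```

A linear scan is enough for a toy graph; a serving system would index edges instead of scanning them.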

* What is Dgraph

[[https://github.com/dgraph-io/dgraph][https://github.com/dgraph-io/dgraph]]

Dgraph is an open source, distributed, low-latency graph serving system written in Go.

- *Low*Latency*: Minimize the latency of query execution.

		Minimize the number of network calls required to run the query.
		Linear complexity in network calls, based on attributes/depth of query, not number of results.
		Meant to be run in production, serving real-time user queries.

- *Distributed*: Automatically distribute data to and serve from provided servers.
		Handle shard splits, shard merges, and shard movement.

- *High*Availability*: Automatic data replication and failover.
- *Resilience*: Automatically handle server failures and reassign shards to healthy servers.

* Implementation

- Use FlatBuffers for on-disk, in-memory and over-network representation.
- Entities are assigned `uint64` uids for optimized representation.
- Use RocksDB to store data internally in posting list format.
	Optimized for seeks: RAM, SSD, disk.
	Used at Facebook, CockroachDB.
- Posting list = all directed edges for a given attribute
		[attribute, entity] -> [sorted list of entities / value]
- Generally one complete posting list would be served by a server.
- If a posting list is too _big_, chunk it into shards.
- A shard is the smallest unit of data to be served or moved around.
- A server can serve many shards.
- Each shard is replicated across at least 3 different servers.
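The posting list layout described above can be sketched roughly as follows. This is a minimal in-memory illustration, not Dgraph's code: a Go map stands in for RocksDB, and the `Key`, `Store`, `AddEdge`, and `Chunk` names are assumptions made for this example. `Chunk` also simplifies sharding to fixed-size slices of a single uid list.

```go
package main

import (
	"fmt"
	"sort"
)

// Key mirrors the [attribute, entity] key from the slide (hypothetical type).
type Key struct {
	Attribute string
	Entity    uint64
}

// Store is an in-memory stand-in for the RocksDB-backed store (assumption).
// Each value is a sorted list of target uids.
type Store map[Key][]uint64

// AddEdge inserts target into the list for (attr, entity),
// keeping the list sorted and free of duplicates.
func (s Store) AddEdge(attr string, entity, target uint64) {
	k := Key{attr, entity}
	list := s[k]
	// Binary search for the first position whose uid is >= target.
	i := sort.Search(len(list), func(i int) bool { return list[i] >= target })
	if i < len(list) && list[i] == target {
		return // edge already present
	}
	list = append(list, 0)
	copy(list[i+1:], list[i:]) // shift the tail right by one slot
	list[i] = target
	s[k] = list
}

// Chunk splits a list into shards of at most size uids each.
// A non-positive size yields no shards.
func Chunk(list []uint64, size int) [][]uint64 {
	var shards [][]uint64
	for size > 0 && len(list) > 0 {
		n := size
		if n > len(list) {
			n = len(list)
		}
		shards = append(shards, list[:n])
		list = list[n:]
	}
	return shards
}

func main() {
	s := Store{}
	for _, target := range []uint64{5, 3, 9, 3} {
		s.AddEdge("friend", 1, target)
	}
	list := s[Key{"friend", 1}]
	fmt.Println("posting list:", list)
	fmt.Println("shards:", Chunk(list, 2))
}
```

Keeping each list sorted is what makes later merges and intersections cheap, which is presumably why the slide calls out a sorted list.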

* Example: Names of Friends of Friends of ME

	- GraphQL query received. Parse into internal query representation.

	[Network call]

	- Pick server serving posting list `friend`.
	- Seek to `friend, me`. Get a list of friend uids, and return.

	[Network call]

	- Send all uids again. For each friend uid_i, seek to `friend, uid_i`.
	- Get a list of lists of uids. Merge them into one big list, and return.

	[Network call]

	- Pick server serving posting list `name`.
	- For each uid_i, seek to `name, uid_i`, and return.

- 3 round-trip network calls in total.
- Network calls: O(m) where m = depth + attributes in query.
- RocksDB seeks: O(n) where n = total results.
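The walk-through above can be simulated with a small sketch. Everything here is hypothetical: the `Cluster` type, its methods, and the sample data are assumptions for illustration, with maps standing in for the servers that hold the `friend` and `name` posting lists. Each method call models one batched network round trip.

```go
package main

import (
	"fmt"
	"sort"
)

// Cluster is a hypothetical in-memory stand-in for the servers holding
// the `friend` and `name` posting lists.
type Cluster struct {
	friend map[uint64][]uint64 // friend, uid -> sorted friend uids
	name   map[uint64]string   // name, uid -> value
	Calls  int                 // simulated network round trips
}

// Friends models one network call: seek to `friend, uid` for every uid
// in the batch and merge the lists into one sorted, de-duplicated list.
func (c *Cluster) Friends(uids []uint64) []uint64 {
	c.Calls++
	seen := map[uint64]bool{}
	var out []uint64
	for _, u := range uids {
		for _, f := range c.friend[u] {
			if !seen[f] {
				seen[f] = true
				out = append(out, f)
			}
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

// Names models one network call: seek to `name, uid` for every uid,
// skipping uids that have no name stored.
func (c *Cluster) Names(uids []uint64) []string {
	c.Calls++
	var out []string
	for _, u := range uids {
		if n, ok := c.name[u]; ok {
			out = append(out, n)
		}
	}
	return out
}

// FriendsOfFriendsNames runs the three-step plan from the slide.
func FriendsOfFriendsNames(c *Cluster, me uint64) []string {
	friends := c.Friends([]uint64{me}) // call 1: friend, me
	fof := c.Friends(friends)          // call 2: friend, uid_i
	return c.Names(fof)                // call 3: name, uid_i
}

// demoCluster builds made-up sample data for the example.
func demoCluster() *Cluster {
	return &Cluster{
		friend: map[uint64][]uint64{1: {2, 3}, 2: {1, 4}, 3: {4, 5}},
		name:   map[uint64]string{1: "Alice", 4: "Dan", 5: "Eve"},
	}
}

func main() {
	c := demoCluster()
	fmt.Println(FriendsOfFriendsNames(c, 1))
	fmt.Println("network calls:", c.Calls)
}
```

The call counter stays at three regardless of how many friends each person has, while the number of map lookups (the stand-in for RocksDB seeks) grows with the number of results.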

* Minimum Viable Product

- Planning to launch in mid-November.
- Non-distributed. Runs on only one server.
- Focus on low latency.
- Support a subset of GraphQL. Responses in JSON.
- Launch with performance comparisons against the popular Neo4j.

Do try it out!