+++
title = "Dgraph compared to other databases"
+++

This page compares Dgraph with other popular graph databases and datastores. The brief summaries that follow may help you decide whether Dgraph suits your needs.

# Batch based
Batch-based graph processing frameworks provide very high throughput for periodic processing of data. This is useful for converting graph data into a shape that other systems can readily use to serve end users.

## Pregel
* [Pregel](https://kowshik.github.io/JPregel/pregel_paper.pdf) is a system for large-scale graph processing built by Google. You can think of it as equivalent to MapReduce/Hadoop.
* Pregel isn't designed to be exposed directly to users, i.e. to run with real-time updates and execute arbitrarily complex queries. Dgraph is designed to respond to arbitrarily complex user queries with low latency and to allow user interaction.
* Pregel can be used alongside Dgraph for complementary processing of the graph, to handle queries which would take over a minute to run via Dgraph or would produce too much data for clients to consume directly.

---

# Database
Graph databases optimize their internal data representation to perform graph operations efficiently.

## Neo4j
[Neo4j](https://neo4j.com/) is the most popular graph database according to [db-engines.com](http://db-engines.com/en/ranking/graph+dbms) and has been around since 2007. Dgraph is a much newer graph database, built to scale to Google web scale and for serious production use as the primary database.

### Language

Dgraph supports [GraphQL+-]({{< relref "query-language/index.md#graphql">}}),
a variation of [GraphQL](https://graphql.org/), a query language created by
Facebook.
GraphQL+-, like GraphQL itself, allows results to be produced as subgraphs rather than lists.
Schema validation is also useful for ensuring data correctness during both input and output.

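To make the "subgraphs rather than lists" point concrete, here is a small Python sketch. It is not Dgraph client code: the data, the field names, and the `rows_to_subgraph` helper are all hypothetical, and the sketch only illustrates the difference in result shape between a flat list of rows and a nested result.

```python
from collections import defaultdict

# Hypothetical flat result: one row per (person, friend) pair, the shape a
# list-oriented query interface would typically hand back.
rows = [
    ("Alice", "Bob"),
    ("Alice", "Carol"),
    ("Bob", "Carol"),
]


def rows_to_subgraph(rows):
    """Group flat (person, friend) rows into a nested, subgraph-like result."""
    friends = defaultdict(list)
    for person, friend in rows:
        friends[person].append(friend)
    # Each person appears once, with its outgoing edges nested underneath.
    return [
        {"name": person, "friend": [{"name": f} for f in names]}
        for person, names in friends.items()
    ]


subgraph = rows_to_subgraph(rows)
```

When the query language returns a subgraph-shaped result directly, the client receives the nested form and can skip this kind of regrouping step.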
### Scalability

Neo4j runs on a single server. The enterprise version of Neo4j only runs
universal data replicas. As the data grows, this requires users to vertically
scale their servers. [Vertical scaling is expensive.][vert]

Dgraph has a distributed architecture. You can split your data among many Dgraph
servers to distribute it horizontally. As you add more data, you can just add
more commodity hardware to serve it. Dgraph also bakes in performance features,
such as reducing network calls in a cluster and highly concurrent execution of
queries, to achieve high query throughput. Dgraph does consistent replication
of each shard, which makes it crash resilient and protects users from server
downtime.

[vert]: https://blog.openshift.com/best-practices-for-horizontal-application-scaling/

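As a rough illustration of horizontal distribution, the Python sketch below spreads a set of keys across servers with a stable hash. This is an assumption-laden toy and not Dgraph's actual placement logic: the keys, the server names, and the `shard_for` and `distribute` helpers are hypothetical.

```python
import hashlib


def shard_for(key, servers):
    """Pick a server for a key using a stable hash (illustrative only)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


def distribute(keys, servers):
    """Return a mapping of server -> keys placed on it."""
    placement = {server: [] for server in servers}
    for key in keys:
        placement[shard_for(key, servers)].append(key)
    return placement


# Hypothetical keys and server names.
keys = ["name", "friend", "age", "lives_in", "works_at", "born_on"]
three_servers = distribute(keys, ["server-1", "server-2", "server-3"])
four_servers = distribute(keys, ["server-1", "server-2", "server-3", "server-4"])
```

Adding a fourth server gives the same data more machines to live on. Plain modulo hashing, as used here, relocates many keys whenever the server count changes, which is one reason a real system pairs sharding with deliberate rebalancing.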
### Transactions

Both systems provide ACID transactions. Neo4j supports ACID transactions in its
single-server architecture. Dgraph, despite being a distributed and consistently
replicated system, supports ACID transactions with snapshot isolation.

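For readers unfamiliar with snapshot isolation, the following single-process Python sketch shows the general idea: each transaction reads from the snapshot taken when it began, and a transaction aborts if a key it wrote was committed by someone else in the meantime. The `Store`, `Txn`, and `ConflictError` names are hypothetical, and this toy says nothing about how Dgraph implements transactions across a cluster.

```python
class ConflictError(Exception):
    """Raised when a transaction loses a write-write conflict."""


class Store:
    """Toy multi-version store; not how Dgraph is implemented."""

    def __init__(self):
        self.clock = 0       # logical timestamp, bumped on begin and commit
        self.versions = {}   # key -> list of (commit_ts, value), oldest first

    def begin(self):
        self.clock += 1
        return Txn(self, self.clock)


class Txn:
    def __init__(self, store, start_ts):
        self.store = store
        self.start_ts = start_ts
        self.writes = {}

    def get(self, key):
        # Read your own writes first, then the newest version committed
        # at or before this transaction's start timestamp.
        if key in self.writes:
            return self.writes[key]
        for commit_ts, value in reversed(self.store.versions.get(key, [])):
            if commit_ts <= self.start_ts:
                return value
        return None

    def set(self, key, value):
        self.writes[key] = value  # buffered until commit

    def commit(self):
        # First committer wins: abort if any key we wrote was committed
        # by another transaction after our snapshot was taken.
        for key in self.writes:
            history = self.store.versions.get(key, [])
            if history and history[-1][0] > self.start_ts:
                raise ConflictError(key)
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.versions.setdefault(key, []).append((self.store.clock, value))


store = Store()
setup = store.begin()
setup.set("balance", 100)
setup.commit()

t1 = store.begin()
t2 = store.begin()
t1.set("balance", 150)
t1.commit()

seen_by_t2 = t2.get("balance")  # t2 still reads from its own snapshot
t2.set("balance", 50)
try:
    t2.commit()
    t2_committed = True
except ConflictError:
    t2_committed = False
```

Here `t2` keeps seeing the value from its snapshot even after `t1` commits, and its own conflicting write is rejected at commit time.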
### Replication

Neo4j's universal data replication is only available to users who purchase its
[enterprise license][neo4je]. At Dgraph, we consider horizontal scaling and
consistent replication basic necessities of any application built today.
Dgraph not only shards your data automatically, it also moves data around
to rebalance these shards, so users achieve the best machine utilization and
query latency possible.

Dgraph is consistently replicated. Any read that follows a write sees that
write, irrespective of which replica it hits. In short, we achieve
linearizable reads.

[neo4je]: https://neo4j.com/subscriptions/#editions

***For a more thorough comparison of Dgraph vs Neo4j, you can read our [blog](https://open.dgraph.io/post/benchmark-neo4j)***

---

# Datastore
Graph datastores act as a graph layer on top of another SQL/NoSQL database, which does the data management for them. That underlying database is the one responsible for backups, snapshots, server failures and data integrity.

## Cayley
* Both [Cayley](https://cayley.io/) and Dgraph are written primarily in Go and were inspired by different projects at Google.
* Cayley acts as a graph layer, providing a clean storage interface that can be implemented by various stores, e.g. PostgreSQL or RocksDB for a single machine, or MongoDB to allow distribution. In other words, Cayley hands data over to other databases. While Dgraph uses [Badger](https://github.com/dgraph-io/badger), it assumes complete ownership of the data and tightly couples data storage and management to allow for efficient distributed queries.
* Cayley's design suffers from high fan-out issues: if intermediate steps return a lot of results and the data is distributed, a query results in many network calls between Cayley and the underlying data layer. Dgraph's design minimizes the number of network calls by reducing the number of servers it needs to touch to respond to a query. This produces better and more predictable query latencies in a cluster, even as the cluster size increases.

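The fan-out point can be sketched by counting simulated network calls for the same two-hop traversal done two ways. Everything here is hypothetical: the tiny `friend` graph, the two traversal helpers, and in particular the assumption that a whole frontier can be answered by a single request per hop. With a frontier of n nodes, the per-node variant makes n calls for that hop, while the batched variant makes one.

```python
# Hypothetical graph: adjacency lists for a single "friend" edge type.
friend = {
    "a": ["b", "c", "d"],
    "b": ["e", "f"],
    "c": ["f", "g"],
    "d": ["h"],
}


def traverse_per_node(start, hops):
    """One remote lookup per node in each frontier (high fan-out cost)."""
    calls = 0
    frontier = {start}
    for _ in range(hops):
        next_frontier = set()
        for node in frontier:
            calls += 1  # each node lookup is modelled as one network call
            next_frontier.update(friend.get(node, []))
        frontier = next_frontier
    return frontier, calls


def traverse_batched(start, hops):
    """One remote call per hop, carrying the whole frontier at once."""
    calls = 0
    frontier = {start}
    for _ in range(hops):
        calls += 1  # the whole frontier is sent in a single request
        frontier = {n for node in frontier for n in friend.get(node, [])}
    return frontier, calls


per_node_result, per_node_calls = traverse_per_node("a", 2)
batched_result, batched_calls = traverse_batched("a", 2)
```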
***For a comparison of query and data loading benchmarks for Dgraph vs Cayley, you can read [Differences between Dgraph and Cayley](https://discuss.dgraph.io/t/differences-between-dgraph-and-cayley/23/3)***.