# Motivation and Vision

MySQL is an easy relational database to get started with.
It's easy to set up and has a short learning curve.
However, as your system starts to scale, it begins to run out of steam.
This is mainly because it's non-trivial to shard a MySQL database after the fact.
Among other problems, the growing number of connections also becomes an
unbearable overhead.

On the other end of the spectrum, there are NoSQL databases.
However, they suffer from problems that mainly stem from the fact that they're new.
Those who have adopted them have struggled with the lack of secondary indexes,
table joins and transactions.

Vitess tries to bring the best of both worlds by trading off
some of MySQL's consistency features in order to achieve the
kind of scalability that NoSQL databases provide.

### Priorities

* *Scalability*: This is achieved by replication and sharding.
* *Efficiency*: This is achieved by a proxy server (vttablet) that
multiplexes queries into a fixed-size connection pool, and rewrites
updates by primary key to speed up replica applier threads.
* *Manageability*: As soon as you add replication and sharding that span
multiple data centers, the number of servers spirals out of control.
Vitess provides a set of tools backed by a lockserver (ZooKeeper) to
track and administer them.
* *Simplicity*: As the complexity grows, it's important to hide it
from the application.
The vtgate servers give you a unified view of the fleet that makes
it feel like you're just interacting with one database.
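
To make the connection pooling idea concrete, here is a minimal Go sketch of multiplexing many client queries onto a fixed-size pool.
It is not vttablet's implementation: the names `conn`, `pool`, `exec` and `runClients` are hypothetical, and the "query" is only a placeholder.

```go
package main

import (
	"fmt"
	"sync"
)

// conn stands in for a MySQL connection (hypothetical, for illustration only).
type conn struct{ id int }

// pool is a fixed-size pool: at most cap(conns) queries run at the same time.
type pool struct{ conns chan *conn }

// newPool creates the pool and fills it with `size` connections up front.
func newPool(size int) *pool {
	p := &pool{conns: make(chan *conn, size)}
	for i := 0; i < size; i++ {
		p.conns <- &conn{id: i}
	}
	return p
}

// exec borrows a connection, runs fn on it, and returns it to the pool.
// Callers block while every connection is busy.
func (p *pool) exec(fn func(c *conn)) {
	c := <-p.conns
	defer func() { p.conns <- c }()
	fn(c)
}

// runClients issues one query per client through a pool of the given size
// and reports the peak number of connections in use at the same time.
func runClients(clients, size int) int {
	p := newPool(size)
	var (
		mu          sync.Mutex
		inUse, peak int
		wg          sync.WaitGroup
	)
	for i := 0; i < clients; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			p.exec(func(c *conn) {
				mu.Lock()
				inUse++
				if inUse > peak {
					peak = inUse
				}
				mu.Unlock()
				// A real proxy would send the query over c here.
				mu.Lock()
				inUse--
				mu.Unlock()
			})
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	// 100 clients share 4 connections; the peak never exceeds 4.
	fmt.Println("peak connections in use:", runClients(100, 4))
}
```

However many clients connect, the database only ever sees the pool's fixed number of connections, which is the overhead this priority is about.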

### Trade-offs

Scalability and availability require some trade-offs:

* *Consistency*: In a typical web application, not all reads have to be
fully consistent.
Vitess lets you specify the kind of consistency you want on each read.
It's generally recommended that you use replica reads as they're easier to scale.
You can always request primary reads if you want up-to-date data.
You can also perform 'for update' reads that ensure that
a row will not change until you've committed your changes.
* *Transactions*: Relational transactions are prohibitively expensive
across distributed systems.
Vitess eases this constraint and guarantees transactional integrity
'per keyspace id', which is restricted to one shard.
Heuristically, this tends to cover most of an application's transactions.
For the few that it doesn't cover, you can sequence your changes in such a way
that the system looks consistent even if a distributed transaction fails
in the middle.
* *Latency*: The proxy servers introduce a small amount of latency.
However, they make up for it by letting you extract more throughput from
MySQL than you otherwise could.
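
The single-shard scope of the transaction guarantee can be pictured with a small Go sketch.
It assumes a 64-bit keyspace id and equal-width shard ranges chosen by the id's top byte; `shardFor` and `singleShard` are hypothetical helpers for illustration, not Vitess APIs.

```go
package main

import "fmt"

// shardFor maps a keyspace id to one of `shards` equal-width ranges,
// looking only at the top byte of the id.
// Assumption of this sketch: 1 <= shards <= 256.
func shardFor(keyspaceID uint64, shards int) int {
	return int(keyspaceID>>56) * shards / 256
}

// singleShard reports whether every keyspace id touched by a transaction
// lands on the same shard. In this simplified model, that is the case in
// which the transaction stays inside one MySQL instance.
func singleShard(ids []uint64, shards int) bool {
	for _, id := range ids {
		if shardFor(id, shards) != shardFor(ids[0], shards) {
			return false
		}
	}
	return true
}

func main() {
	sameShard := []uint64{0x1000000000000000, 0x2000000000000000}
	crossShard := []uint64{0x1000000000000000, 0x9000000000000000}

	fmt.Println(singleShard(sameShard, 2))  // true: both ids fall in the lower half
	fmt.Println(singleShard(crossShard, 2)) // false: the ids straddle the two halves
}
```

A transaction for which `singleShard` returns false is the kind of case where you would sequence your changes by hand, as described above.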

### Preserved MySQL features

Since the underlying storage layer is still MySQL, we get to preserve
its other important features:

* *Indexes*: You can create secondary indexes on your tables. This allows you
to efficiently query rows using more than one key.
* *Joins*: MySQL allows you to split one-to-many and many-to-many relational data
into separate tables, and lets you join them on demand.
This flexibility generally results in more efficient storage as each piece of
data is stored only once, and fetched only if needed.

### The Vitess spectrum

The following diagram illustrates where Vitess fits in the spectrum of storage solutions:

![Spectrum](https://raw.github.com/vitessio/vitess/main/doc/VitessSpectrum.png)