# Motivation and Vision

MySQL is an easy relational database to get started with.
It's easy to set up and has a short learning curve.
However, as your system starts to scale, it begins to run out of steam.
This is mainly because it's non-trivial to shard a MySQL database after the fact.
Among other problems, the growing number of connections also becomes an
unbearable overhead.

On the other end of the spectrum, there are NoSQL databases.
However, they suffer from problems that mainly stem from the fact that they're new.
Those who have adopted them have struggled with the lack of secondary indexes,
table joins and transactions.

Vitess tries to bring the best of both worlds by trading off
some of MySQL's consistency features in order to achieve the
kind of scalability that NoSQL databases provide.

### Priorities

* *Scalability*: This is achieved by replication and sharding.
* *Efficiency*: This is achieved by a proxy server (vttablet) that
  multiplexes queries into a fixed-size connection pool, and rewrites
  updates by primary key to speed up replica applier threads.
* *Manageability*: As soon as you add replication and sharding that span
  multiple data centers, the number of servers spirals out of control.
  Vitess provides a set of tools backed by a lockserver (ZooKeeper) to
  track and administer them.
* *Simplicity*: As the complexity grows, it's important to hide this
  from the application.
  The vtgate servers give you a unified view of the fleet that makes
  it feel like you're just interacting with one database.

### Trade-offs

Scalability and availability require some trade-offs:

* *Consistency*: In a typical web application, not all reads have to be
  fully consistent.
  Vitess lets you specify the kind of consistency you want on each read.
  It's generally recommended that you use replica reads as they're easier to scale.
  You can always request primary reads if you want up-to-date data.
  You can also perform 'for update' reads that ensure that
  a row will not change until you've committed your changes.
* *Transactions*: Relational transactions are prohibitively expensive
  across distributed systems.
  Vitess eases this constraint and guarantees transactional integrity
  'per keyspace id', which is restricted to one shard.
  Heuristically, this tends to cover most of an application's transactions.
  For the few cases it doesn't cover, you can sequence your changes in such a way
  that the system looks consistent even if a distributed transaction fails
  in the middle.
* *Latency*: The proxy servers introduce a small amount of latency.
  However, they make up for it by letting you extract more throughput from
  MySQL than you would otherwise be able to.

### Preserved MySQL features

Since the underlying storage layer is still MySQL, we still get to preserve
its other important features:

* *Indexes*: You can create secondary indexes on your tables. This allows you
  to efficiently query rows using more than one key.
* *Joins*: MySQL allows you to split one-to-many and many-to-many relational data
  into separate tables, and lets you join them on demand.
  This flexibility generally results in more efficient storage as each piece of
  data is stored only once, and fetched only if needed.

### The Vitess spectrum

The following diagram illustrates where Vitess fits in the spectrum of storage solutions: