github.com/cockroachdb/cockroach@v20.2.0-alpha.1+incompatible/docs/RFCS/20180413_alter_primary_key.md

github.com/cockroachdb/cockroach@v20.2.0-alpha.1+incompatible/docs/RFCS/20180413_alter_primary_key.md (about)

1 - Feature Name: Alter Primary Key
2 - Status: Draft
3 - Start Date: 2018-04-11
4 - Authors: David Taylor, Rohan Yadav, Solon Gordon
5 - RFC PR: [25208](https://github.com/cockroachdb/cockroach/pull/25208)
6 - Cockroach Issues: #19141
7
8 # Table of Contents
9 - [Summary](#summary)
10 - [Motivation](#motivation)
11 - [High-level summary](#high-level-summary)
12 - [Implementation plan](#implementation-plan)
13 - [Follow-up, Out-of-Scope Work](#follow-up-out-of-scope-work)
14
15 # Summary
16
17 Users should be able to change the primary key of an existing table.
18
19 # Motivation
20
21 Schemas change over time, and the primary key is particularly special in
22 Cockroach: it determines where the actual row data is stored.
23
24 The choice of primary key -- the choice of columns and their order -- determines
25 how a table's rows are physically stored in the database, which can be critical
26 in determining performance of various workloads and, with the introduction of
27 user-controlled partitioning based on primary key prefixes, can have significant
28 operational considerations as well.
29
30 As applications, workloads and operational requirements evolve over time, the
31 choice of primary key for a given table may evolve as well, making the ability
32 to change it an important feature of a maintainable database. In particular,
33 the inability to alter primary keys is a painful roadblock for customers who
34 start with a single-region Cockroach cluster and later want to expand into a
35 partitioned, global deployment. The ability to change primary keys also reduces
36 friction during prototyping when users are still figuring out their schemas.
37
38 # High-level Summary
39
40 Note: This document assumes familiarity with the content of
41 [`Structured data encoding in CockroachDB SQL`](`docs/tech-notes/encoding.md`).
42
43 Our approach for changing primary keys is simple at a high level. The user will
44 specify which columns they want their new primary key to contain, and then
45 CockroachDB will start a long-running schema change operation. It will compute
46 which secondary indexes need to be rewritten, start a job where these new
47 indexes and the new primary key are created, and finally swap out the old
48 indexes for the new ones.
49
50 There are two reasons that a secondary index may need to be rewritten. The
51 first is if it does not index or store all the columns from the new primary
52 key. This property is required to support index joins, where we first look up
53 into a secondary index and then look up the corresponding value in the primary
54 index.
55
56 The second is that a secondary index may depend on the primary index for
57 "uniqueness." In particular, inverted indexes, non-unique indexes, and unique
58 indexes with nullable columns all encode primary key columns in their keys to
59 guarantee that each key is unique. This means that if the primary key changes
60 and the old primary key is deleted, that uniqueness guarantee no longer holds.
61
62 In order to prevent unexpected performance degradation to existing queries, we
63 will also rewrite the old primary index as a secondary index with the same key,
64 unless otherwise specified by the user. (This step can be omitted if the old
65 primary key columns form a prefix of the new ones.)
66
67 # Detailed Explanation
68
69 ## Syntax
70
71 We propose to use the syntax `ALTER TABLE t ALTER PRIMARY KEY (col1, col2,
72 ...)`. This will initiate the long-running schema change described above.
73
74 This differs from Postgres syntax, where the old primary key constraint is
75 first dropped via `ALTER TABLE t DROP CONSTRAINT...` and then the new index is
76 added via `ALTER TABLE t ADD PRIMARY KEY...`. This is viable in Postgres
77 because the primary key does not affect the underlying storage; it is just a
78 special index which is unique and non-null. This is not the case in Cockroach,
79 where truly dropping the primary key would mean rewriting the entire table
80 before a new key is even added.
81
82 ### Example
83 Say that we have the following `users` table:
84 ```sql
85 CREATE TABLE users (
86 id UUID PRIMARY KEY,
87 email STRING,
88 name STRING,
89 INDEX users_name_idx (name)
90 );
91 ```
92
93 Now we are moving to a global deployment, so we would like to add a new
94 `region` column which is part of the primary key. First we add the column,
95 which must not be nullable:
96 ```sql
97 ALTER TABLE users ADD COLUMN region STRING NOT NULL;
98 ```
99
100 Next, we change the primary key.
101 ```sql
102 ALTER TABLE users ALTER PRIMARY KEY (region, id);
103 ```
104
105 The old primary index has been rewritten as a secondary index named
106 `user_id_idx`. It can be dropped it if no longer needed.
107 ```sql
108 DROP INDEX users@users_id_idx;
109 ```
110
111 ## Rewriting indexes
112
113 While simple at a high level, there are several key technical details involved
114 in applying the steps above. Consider again the example of prepending a
115 `region` column to the `users` table's primary key. The steps we want to
116 perform behind the scenes are:
117 1. Build a new primary index on `(region, id)`
118 2. Build a new secondary index `users_name_idx_rewritten` that uses the new
119 primary index's columns for key uniqueness.
120 3. Build a new secondary index `users_id_idx` which indexes the same columns as
121 the old primary key.
122 4. Swap the old primary index with the new primary index and `users_name_idx`
123 with `users_name_idx_rewritten`
124 5. Delete the old `users_name_idx` and the old primary key.
125
126 There are several reasons we cannot accomplish these steps using existing SQL
127 syntax.
128 1. The new primary index needs to be *built and encoded as a primary index*. We
129 cannot build it as a secondary index and suddenly switch to using it as a
130 primary index.
131 2. The new index `users_name_idx_rewritten` needs to be built using the new
132 primary key columns as the extra columns that it stores or indexes. This can
133 be done by manually changing the `ExtraColumnIDs` field on the
134 `IndexDescriptor` for `users_name_idx_rewritten` as the `IndexDescriptor` is
135 created.
136 3. `users_name_idx_rewritten` *cannot* be visible for reading until the primary
137 key of the table is actually changed, because depending on the new primary
138 key columns, `users_name_idx_rewritten` might not contain enough information
139 to lookup back into the old primary key.
140 4. The old index `users_name_idx` *must not be visible for reads* as soon as
141 the primary key swap occurs, as it will not have enough information to
142 lookup into the new primary key.
143 5. The new indexes being promoted into read, the primary key swap, and the old
144 indexes demoted from read must all occur in the *same atomic write*.
145
146 To address these issues, we have prototyped two new constructs.
147
148 ### Covering Indexes
149
150 In order to treat some secondary indexes like primary keys at the encoding
151 level, we introduce a notion of a `covering index`. These indexes are denoted
152 by a new bit on the `IndexDescriptor`. If an index is covering, it is
153 implicitly assumed to store all columns and be encoded as a primary key, even
154 if its not marked as a primary key on `TableDescriptor`. This bit is handled
155 in `EncodeSecondaryIndex`, where a covering index is encoded using the primary
156 key encoding. Nothing else needs to change to accommodate this addition, as it
157 is meant to be used only during primary key changes.
158
159 ### Primary Key Swap Mutation
160
161 The `TableDescriptor` struct has a field `Mutations` that is a list of
162 `DescriptorMutation` objects. These correspond to pending mutations that need
163 to be applied to the `TableDescriptor`. This list includes operations such as
164 adding and dropping indexes. In order to solve the index visibility and index
165 swap problems from above, we introduce a new `DescriptorMutation` type that
166 represents the work that needs to get done when processing a primary key swap
167 operation.
168
169 The new mutation type specifies what index ID is the new primary index and
170 which secondary indexes need to be swapped when the primary key is changed.
171 When the `SchemaChangeManager` processes this mutation, it performs all the
172 specified swaps and places index deletion mutations for the old indexes onto
173 the `TableDescriptor`'s `Mutations` queue. This all-at-once operation changes
174 the visibility of all the relevant indexes at once so that no one read
175 operations are performed operations on invalid indexes.
176
177 ## Interleaving
178
179 To support interleaving a new primary key, the ALTER PRIMARY KEY statement will
180 support an INTERLEAVE IN PARENT clause which behaves the same as it does on a
181 CREATE TABLE statement. When the new primary index is written, it will be
182 interleaved into the specified parent exactly as if it were a new table.
183
184 However, the scenario where a user wants to run ALTER PRIMARY KEY on an
185 interleaved _parent_ is trickier, because we do not support dropping an
186 interleaved parent without CASCADE, i.e. dropping its children as well. Due to
187 this restriction, we will initially disallow ALTER PRIMARY KEY on interleaved
188 parents. If a user wants to alter an interleaved parent, they will need to
189 first rewrite the children so that they are no longer interleaved into that
190 table. If the child is a primary index, this means altering the primary key. If
191 it is a secondary index, this means writing a new index and then dropping the
192 old one. Once the parent index is altered, the user may want to alter the
193 children once more to interleave them into the new parent.
194
195 Admittedly this is a somewhat cumbersome process but it represents the minimum
196 work necessary to support the interleaved use case. See the Follow-up Work
197 section for potential improvements.
198
199 ## Column Families
200
201 We currently enforce that primary key columns can only be in the first column
202 family (ID 0). However, this invariant may be violated when the primary key
203 changes. We propose removing this restriction so that primary key columns may
204 appear in any column family. Note that in practice this only affects secondary
205 indexes, since the primary key columns are encoded in every column family's keys
206 for the primary index.
207
208 ## Foreign Keys
209
210 Foreign keys currently have a hard dependency on an index of the table they
211 reference. We intend to remove that dependency and instead have foreign
212 key checks use whatever indexes are available. However, this work will not
213 necessarily be completed for 20.1.
214
215 Because of this, whenever we rewrite a secondary index while altering a primary
216 key, we must update any foreign key references to point to the new index.
217 Furthermore, if any foreign keys depend on the old primary index, we must update
218 them to instead depend on the new secondary index which represents that index.
219 Specifically, this means updating the LegacyReferencedIndex field for each
220 relevant ForeignKeyConstraint.
221
222 ## Rollbacks
223 Viewing the primary key change process as 3 distinct stages greatly complicates
224 the procedure for rolling back from an in-progress primary key change. However,
225 we can view the asynchronous index cleanup as not part of the job, and consider
226 the primary key change finished at the point of the atomic index swap. Coincidentally,
227 that is how it appears to users as well! In this case, the only place where we have
228 to worry about a cancellation/failure occurring is during the index building phase.
229 If the primary key change process is requested to rollback during the index building
230 phase, each new index can be rolled back, and the primary key swap mutation can be
231 removed.
232
233 ## Test Plan
234
235 We plan to test primary key changes at several levels:
236
237 ### Unit Tests
238 Implement a series of test hooks within the process of primary key changes.
239 These test hooks can be swapped out with waiting/notification functions
240 during unit tests to verify properties like index visibility at different
241 points of the primary index change process.
242
243 ### Roachtests
244 Run some long-running workloads that interact with a large table before,
245 during, and after a primary key change.
246
247 ### Logictests
248 Implement a logictest mutation that when encountering a `CREATE TABLE`
249 statement, alters the primary key to a new primary key with the same columns as
250 the old primary key. This test will give us a large amount of coverage of SQL
251 operations on the transformed table.
252
253 ### Schema Changer Specific Tests
254 Set up a variety of schema change style tests, including cancellations,
255 rollbacks, node failures, etc.
256
257 # Follow-up, Out-of-Scope Work
258
259 These are intended as a starting point for future work, capturing any discussion
260 that was had in the process of forming the above plan, but is *not* in scope at
261 this time.
262
263 ## Supporting Postgres Syntax
264
265 Mainly for ORM compatibility, we should try to support the Postgres syntax for
266 changing a primary key, where the old constraint is dropped and then the new
267 one is added. We might accomplish this by detecting when these statements
268 appear together in the same transaction and then proceeding with ALTER PRIMARY
269 KEY as usual. In this case, we would not create a corresponding secondary index
270 for the old primary key, since the user explicitly specified that it should be
271 dropped.
272
273 ## Preserving the Old Primary Index
274
275 Rather than rewriting the old primary index as an equivalent secondary index,
276 we could keep it around as a secondary index with the Covering bit set. This
277 has two potential advantages. One is that we would rewrite less data. The other
278 is that we could allow ALTER PRIMARY KEY on interleaved parents, because the
279 children could remain interleaved into the old primary key. One challenge here
280 would be keeping the old primary index up-to-date when columns are added or
281 dropped. There would also be new complexity from allowing tables to be
282 interleaved into non-primary indexes, which is currently not permitted.
283
284 ## Supporting DROP on Interleaved Parents
285
286 We could also start allowing ALTER PRIMARY KEY on interleaved parents if
287 https://github.com/cockroachdb/cockroach/issues/8036 were completed. This is
288 perhaps less ideal than preserving the old primary index, because the child
289 tables would be interleaved into nothing until their primary keys were also
290 altered.
291
292 ## Prepending Columns to Indexes
293
294 We anticipate that a common use case for index changes is when a customer is
295 trying to scale their database to multiple regions. In this case, it is common
296 for users to need to prepend something like a `region` column so that they can
297 easily apply geo-partitioning strategies. This command syntactically could be
298 something like `ALTER TABLE t PREFIX INDEXES BY (region)` and would involve
299 rewriting each index in the table. This is not directly applicable to the
300 general use case of changing the primary key, but is a step that could help the
301 "path to global" user story.