github.com/cockroachdb/cockroach@v20.2.0-alpha.1+incompatible/docs/RFCS/20180413_alter_primary_key.md (about) 1 - Feature Name: Alter Primary Key 2 - Status: Draft 3 - Start Date: 2018-04-11 4 - Authors: David Taylor, Rohan Yadav, Solon Gordon 5 - RFC PR: [25208](https://github.com/cockroachdb/cockroach/pull/25208) 6 - Cockroach Issues: #19141 7 8 # Table of Contents 9 - [Summary](#summary) 10 - [Motivation](#motivation) 11 - [High-level summary](#high-level-summary) 12 - [Implementation plan](#implementation-plan) 13 - [Follow-up, Out-of-Scope Work](#follow-up-out-of-scope-work) 14 15 # Summary 16 17 Users should be able to change the primary key of an existing table. 18 19 # Motivation 20 21 Schemas change over time, and the primary key is particularly special in 22 Cockroach: it determines where the actual row data is stored. 23 24 The choice of primary key -- the choice of columns and their order -- determines 25 how a table's rows are physically stored in the database, which can be critical 26 in determining performance of various workloads and, with the introduction of 27 user-controlled partitioning based on primary key prefixes, can have significant 28 operational considerations as well. 29 30 As applications, workloads and operational requirements evolve over time, the 31 choice of primary key for a given table may evolve as well, making the ability 32 to change it an important feature of a maintainable database. In particular, 33 the inability to alter primary keys is a painful roadblock for customers who 34 start with a single-region Cockroach cluster and later want to expand into a 35 partitioned, global deployment. The ability to change primary keys also reduces 36 friction during prototyping when users are still figuring out their schemas. 37 38 # High-level Summary 39 40 Note: This document assumes familiarity with the content of 41 [`Structured data encoding in CockroachDB SQL`](`docs/tech-notes/encoding.md`). 42 43 Our approach for changing primary keys is simple at a high level. The user will 44 specify which columns they want their new primary key to contain, and then 45 CockroachDB will start a long-running schema change operation. It will compute 46 which secondary indexes need to be rewritten, start a job where these new 47 indexes and the new primary key are created, and finally swap out the old 48 indexes for the new ones. 49 50 There are two reasons that a secondary index may need to be rewritten. The 51 first is if it does not index or store all the columns from the new primary 52 key. This property is required to support index joins, where we first look up 53 into a secondary index and then look up the corresponding value in the primary 54 index. 55 56 The second is that a secondary index may depend on the primary index for 57 "uniqueness." In particular, inverted indexes, non-unique indexes, and unique 58 indexes with nullable columns all encode primary key columns in their keys to 59 guarantee that each key is unique. This means that if the primary key changes 60 and the old primary key is deleted, that uniqueness guarantee no longer holds. 61 62 In order to prevent unexpected performance degradation to existing queries, we 63 will also rewrite the old primary index as a secondary index with the same key, 64 unless otherwise specified by the user. (This step can be omitted if the old 65 primary key columns form a prefix of the new ones.) 66 67 # Detailed Explanation 68 69 ## Syntax 70 71 We propose to use the syntax `ALTER TABLE t ALTER PRIMARY KEY (col1, col2, 72 ...)`. This will initiate the long-running schema change described above. 73 74 This differs from Postgres syntax, where the old primary key constraint is 75 first dropped via `ALTER TABLE t DROP CONSTRAINT...` and then the new index is 76 added via `ALTER TABLE t ADD PRIMARY KEY...`. This is viable in Postgres 77 because the primary key does not affect the underlying storage; it is just a 78 special index which is unique and non-null. This is not the case in Cockroach, 79 where truly dropping the primary key would mean rewriting the entire table 80 before a new key is even added. 81 82 ### Example 83 Say that we have the following `users` table: 84 ```sql 85 CREATE TABLE users ( 86 id UUID PRIMARY KEY, 87 email STRING, 88 name STRING, 89 INDEX users_name_idx (name) 90 ); 91 ``` 92 93 Now we are moving to a global deployment, so we would like to add a new 94 `region` column which is part of the primary key. First we add the column, 95 which must not be nullable: 96 ```sql 97 ALTER TABLE users ADD COLUMN region STRING NOT NULL; 98 ``` 99 100 Next, we change the primary key. 101 ```sql 102 ALTER TABLE users ALTER PRIMARY KEY (region, id); 103 ``` 104 105 The old primary index has been rewritten as a secondary index named 106 `user_id_idx`. It can be dropped it if no longer needed. 107 ```sql 108 DROP INDEX users@users_id_idx; 109 ``` 110 111 ## Rewriting indexes 112 113 While simple at a high level, there are several key technical details involved 114 in applying the steps above. Consider again the example of prepending a 115 `region` column to the `users` table's primary key. The steps we want to 116 perform behind the scenes are: 117 1. Build a new primary index on `(region, id)` 118 2. Build a new secondary index `users_name_idx_rewritten` that uses the new 119 primary index's columns for key uniqueness. 120 3. Build a new secondary index `users_id_idx` which indexes the same columns as 121 the old primary key. 122 4. Swap the old primary index with the new primary index and `users_name_idx` 123 with `users_name_idx_rewritten` 124 5. Delete the old `users_name_idx` and the old primary key. 125 126 There are several reasons we cannot accomplish these steps using existing SQL 127 syntax. 128 1. The new primary index needs to be *built and encoded as a primary index*. We 129 cannot build it as a secondary index and suddenly switch to using it as a 130 primary index. 131 2. The new index `users_name_idx_rewritten` needs to be built using the new 132 primary key columns as the extra columns that it stores or indexes. This can 133 be done by manually changing the `ExtraColumnIDs` field on the 134 `IndexDescriptor` for `users_name_idx_rewritten` as the `IndexDescriptor` is 135 created. 136 3. `users_name_idx_rewritten` *cannot* be visible for reading until the primary 137 key of the table is actually changed, because depending on the new primary 138 key columns, `users_name_idx_rewritten` might not contain enough information 139 to lookup back into the old primary key. 140 4. The old index `users_name_idx` *must not be visible for reads* as soon as 141 the primary key swap occurs, as it will not have enough information to 142 lookup into the new primary key. 143 5. The new indexes being promoted into read, the primary key swap, and the old 144 indexes demoted from read must all occur in the *same atomic write*. 145 146 To address these issues, we have prototyped two new constructs. 147 148 ### Covering Indexes 149 150 In order to treat some secondary indexes like primary keys at the encoding 151 level, we introduce a notion of a `covering index`. These indexes are denoted 152 by a new bit on the `IndexDescriptor`. If an index is covering, it is 153 implicitly assumed to store all columns and be encoded as a primary key, even 154 if its not marked as a primary key on `TableDescriptor`. This bit is handled 155 in `EncodeSecondaryIndex`, where a covering index is encoded using the primary 156 key encoding. Nothing else needs to change to accommodate this addition, as it 157 is meant to be used only during primary key changes. 158 159 ### Primary Key Swap Mutation 160 161 The `TableDescriptor` struct has a field `Mutations` that is a list of 162 `DescriptorMutation` objects. These correspond to pending mutations that need 163 to be applied to the `TableDescriptor`. This list includes operations such as 164 adding and dropping indexes. In order to solve the index visibility and index 165 swap problems from above, we introduce a new `DescriptorMutation` type that 166 represents the work that needs to get done when processing a primary key swap 167 operation. 168 169 The new mutation type specifies what index ID is the new primary index and 170 which secondary indexes need to be swapped when the primary key is changed. 171 When the `SchemaChangeManager` processes this mutation, it performs all the 172 specified swaps and places index deletion mutations for the old indexes onto 173 the `TableDescriptor`'s `Mutations` queue. This all-at-once operation changes 174 the visibility of all the relevant indexes at once so that no one read 175 operations are performed operations on invalid indexes. 176 177 ## Interleaving 178 179 To support interleaving a new primary key, the ALTER PRIMARY KEY statement will 180 support an INTERLEAVE IN PARENT clause which behaves the same as it does on a 181 CREATE TABLE statement. When the new primary index is written, it will be 182 interleaved into the specified parent exactly as if it were a new table. 183 184 However, the scenario where a user wants to run ALTER PRIMARY KEY on an 185 interleaved _parent_ is trickier, because we do not support dropping an 186 interleaved parent without CASCADE, i.e. dropping its children as well. Due to 187 this restriction, we will initially disallow ALTER PRIMARY KEY on interleaved 188 parents. If a user wants to alter an interleaved parent, they will need to 189 first rewrite the children so that they are no longer interleaved into that 190 table. If the child is a primary index, this means altering the primary key. If 191 it is a secondary index, this means writing a new index and then dropping the 192 old one. Once the parent index is altered, the user may want to alter the 193 children once more to interleave them into the new parent. 194 195 Admittedly this is a somewhat cumbersome process but it represents the minimum 196 work necessary to support the interleaved use case. See the Follow-up Work 197 section for potential improvements. 198 199 ## Column Families 200 201 We currently enforce that primary key columns can only be in the first column 202 family (ID 0). However, this invariant may be violated when the primary key 203 changes. We propose removing this restriction so that primary key columns may 204 appear in any column family. Note that in practice this only affects secondary 205 indexes, since the primary key columns are encoded in every column family's keys 206 for the primary index. 207 208 ## Foreign Keys 209 210 Foreign keys currently have a hard dependency on an index of the table they 211 reference. We intend to remove that dependency and instead have foreign 212 key checks use whatever indexes are available. However, this work will not 213 necessarily be completed for 20.1. 214 215 Because of this, whenever we rewrite a secondary index while altering a primary 216 key, we must update any foreign key references to point to the new index. 217 Furthermore, if any foreign keys depend on the old primary index, we must update 218 them to instead depend on the new secondary index which represents that index. 219 Specifically, this means updating the LegacyReferencedIndex field for each 220 relevant ForeignKeyConstraint. 221 222 ## Rollbacks 223 Viewing the primary key change process as 3 distinct stages greatly complicates 224 the procedure for rolling back from an in-progress primary key change. However, 225 we can view the asynchronous index cleanup as not part of the job, and consider 226 the primary key change finished at the point of the atomic index swap. Coincidentally, 227 that is how it appears to users as well! In this case, the only place where we have 228 to worry about a cancellation/failure occurring is during the index building phase. 229 If the primary key change process is requested to rollback during the index building 230 phase, each new index can be rolled back, and the primary key swap mutation can be 231 removed. 232 233 ## Test Plan 234 235 We plan to test primary key changes at several levels: 236 237 ### Unit Tests 238 Implement a series of test hooks within the process of primary key changes. 239 These test hooks can be swapped out with waiting/notification functions 240 during unit tests to verify properties like index visibility at different 241 points of the primary index change process. 242 243 ### Roachtests 244 Run some long-running workloads that interact with a large table before, 245 during, and after a primary key change. 246 247 ### Logictests 248 Implement a logictest mutation that when encountering a `CREATE TABLE` 249 statement, alters the primary key to a new primary key with the same columns as 250 the old primary key. This test will give us a large amount of coverage of SQL 251 operations on the transformed table. 252 253 ### Schema Changer Specific Tests 254 Set up a variety of schema change style tests, including cancellations, 255 rollbacks, node failures, etc. 256 257 # Follow-up, Out-of-Scope Work 258 259 These are intended as a starting point for future work, capturing any discussion 260 that was had in the process of forming the above plan, but is *not* in scope at 261 this time. 262 263 ## Supporting Postgres Syntax 264 265 Mainly for ORM compatibility, we should try to support the Postgres syntax for 266 changing a primary key, where the old constraint is dropped and then the new 267 one is added. We might accomplish this by detecting when these statements 268 appear together in the same transaction and then proceeding with ALTER PRIMARY 269 KEY as usual. In this case, we would not create a corresponding secondary index 270 for the old primary key, since the user explicitly specified that it should be 271 dropped. 272 273 ## Preserving the Old Primary Index 274 275 Rather than rewriting the old primary index as an equivalent secondary index, 276 we could keep it around as a secondary index with the Covering bit set. This 277 has two potential advantages. One is that we would rewrite less data. The other 278 is that we could allow ALTER PRIMARY KEY on interleaved parents, because the 279 children could remain interleaved into the old primary key. One challenge here 280 would be keeping the old primary index up-to-date when columns are added or 281 dropped. There would also be new complexity from allowing tables to be 282 interleaved into non-primary indexes, which is currently not permitted. 283 284 ## Supporting DROP on Interleaved Parents 285 286 We could also start allowing ALTER PRIMARY KEY on interleaved parents if 287 https://github.com/cockroachdb/cockroach/issues/8036 were completed. This is 288 perhaps less ideal than preserving the old primary index, because the child 289 tables would be interleaved into nothing until their primary keys were also 290 altered. 291 292 ## Prepending Columns to Indexes 293 294 We anticipate that a common use case for index changes is when a customer is 295 trying to scale their database to multiple regions. In this case, it is common 296 for users to need to prepend something like a `region` column so that they can 297 easily apply geo-partitioning strategies. This command syntactically could be 298 something like `ALTER TABLE t PREFIX INDEXES BY (region)` and would involve 299 rewriting each index in the table. This is not directly applicable to the 300 general use case of changing the primary key, but is a step that could help the 301 "path to global" user story.