github.com/cockroachdb/cockroach@v20.2.0-alpha.1+incompatible/docs/RFCS/20180413_alter_primary_key.md (about)

     1  - Feature Name: Alter Primary Key
     2  - Status: Draft
     3  - Start Date: 2018-04-11
     4  - Authors: David Taylor, Rohan Yadav, Solon Gordon
     5  - RFC PR: [25208](https://github.com/cockroachdb/cockroach/pull/25208)
     6  - Cockroach Issues: #19141
     7  
     8  # Table of Contents
     9  - [Summary](#summary)
    10  - [Motivation](#motivation)
    11  - [High-level summary](#high-level-summary)
    12  - [Implementation plan](#implementation-plan)
    13  - [Follow-up, Out-of-Scope Work](#follow-up-out-of-scope-work)
    14  
    15  # Summary
    16  
    17  Users should be able to change the primary key of an existing table.
    18  
    19  # Motivation
    20  
    21  Schemas change over time, and the primary key is particularly special in
    22  Cockroach: it determines where the actual row data is stored.
    23  
    24  The choice of primary key -- the choice of columns and their order -- determines
    25  how a table's rows are physically stored in the database, which can be critical
    26  in determining performance of various workloads and, with the introduction of
    27  user-controlled partitioning based on primary key prefixes, can have significant
    28  operational considerations as well.
    29  
    30  As applications, workloads and operational requirements evolve over time, the
    31  choice of primary key for a given table may evolve as well, making the ability
    32  to change it an important feature of a maintainable database. In particular,
    33  the inability to alter primary keys is a painful roadblock for customers who
    34  start with a single-region Cockroach cluster and later want to expand into a
    35  partitioned, global deployment. The ability to change primary keys also reduces
    36  friction during prototyping when users are still figuring out their schemas.
    37  
    38  # High-level Summary
    39  
    40  Note: This document assumes familiarity with the content of
    41  [`Structured data encoding in CockroachDB SQL`](`docs/tech-notes/encoding.md`).
    42  
    43  Our approach for changing primary keys is simple at a high level. The user will
    44  specify which columns they want their new primary key to contain, and then
    45  CockroachDB will start a long-running schema change operation. It will compute
    46  which secondary indexes need to be rewritten, start a job where these new
    47  indexes and the new primary key are created, and finally swap out the old
    48  indexes for the new ones.
    49  
    50  There are two reasons that a secondary index may need to be rewritten. The
    51  first is if it does not index or store all the columns from the new primary
    52  key. This property is required to support index joins, where we first look up
    53  into a secondary index and then look up the corresponding value in the primary
    54  index.
    55  
    56  The second is that a secondary index may depend on the primary index for
    57  "uniqueness." In particular, inverted indexes, non-unique indexes, and unique
    58  indexes with nullable columns all encode primary key columns in their keys to
    59  guarantee that each key is unique. This means that if the primary key changes
    60  and the old primary key is deleted, that uniqueness guarantee no longer holds.
    61  
    62  In order to prevent unexpected performance degradation to existing queries, we
    63  will also rewrite the old primary index as a secondary index with the same key,
    64  unless otherwise specified by the user. (This step can be omitted if the old
    65  primary key columns form a prefix of the new ones.)
    66  
    67  # Detailed Explanation
    68  
    69  ## Syntax
    70  
    71  We propose to use the syntax `ALTER TABLE t ALTER PRIMARY KEY (col1, col2,
    72  ...)`. This will initiate the long-running schema change described above.
    73  
    74  This differs from Postgres syntax, where the old primary key constraint is
    75  first dropped via `ALTER TABLE t DROP CONSTRAINT...` and then the new index is
    76  added via  `ALTER TABLE t ADD PRIMARY KEY...`. This is viable in Postgres
    77  because the primary key does not affect the underlying storage; it is just a
    78  special index which is unique and non-null. This is not the case in Cockroach,
    79  where truly dropping the primary key would mean rewriting the entire table
    80  before a new key is even added.
    81  
    82  ### Example
    83  Say that we have the following `users` table:
    84  ```sql
    85  CREATE TABLE users (
    86    id UUID PRIMARY KEY,
    87    email STRING,
    88    name STRING,
    89    INDEX users_name_idx (name)
    90  );
    91  ```
    92  
    93  Now we are moving to a global deployment, so we would like to add a new
    94  `region` column which is part of the primary key. First we add the column,
    95  which must not be nullable:
    96  ```sql
    97  ALTER TABLE users ADD COLUMN region STRING NOT NULL;
    98  ```
    99  
   100  Next, we change the primary key.
   101  ```sql
   102  ALTER TABLE users ALTER PRIMARY KEY (region, id);
   103  ```
   104  
   105  The old primary index has been rewritten as a secondary index named
   106  `user_id_idx`. It can be dropped it if no longer needed.
   107  ```sql
   108  DROP INDEX users@users_id_idx;
   109  ```
   110  
   111  ## Rewriting indexes
   112  
   113  While simple at a high level, there are several key technical details involved
   114  in applying the steps above. Consider again the example of prepending a
   115  `region` column to the `users` table's primary key. The steps we want to
   116  perform behind the scenes are:
   117  1. Build a new primary index on `(region, id)`
   118  2. Build a new secondary index `users_name_idx_rewritten` that uses the new
   119     primary index's columns for key uniqueness.
   120  3. Build a new secondary index `users_id_idx` which indexes the same columns as
   121     the old primary key.
   122  4. Swap the old primary index with the new primary index and `users_name_idx`
   123     with `users_name_idx_rewritten`
   124  5. Delete the old `users_name_idx` and the old primary key.
   125  
   126  There are several reasons we cannot accomplish these steps using existing SQL
   127  syntax.
   128  1. The new primary index needs to be *built and encoded as a primary index*. We
   129     cannot build it as a secondary index and suddenly switch to using it as a
   130     primary index.
   131  2. The new index `users_name_idx_rewritten` needs to be built using the new
   132     primary key columns as the extra columns that it stores or indexes. This can
   133     be done by manually changing the `ExtraColumnIDs` field on the
   134     `IndexDescriptor` for `users_name_idx_rewritten` as the `IndexDescriptor` is
   135     created.
   136  3. `users_name_idx_rewritten` *cannot* be visible for reading until the primary
   137     key of the table is actually changed, because depending on the new primary
   138     key columns, `users_name_idx_rewritten` might not contain enough information
   139     to lookup back into the old primary key.
   140  4. The old index `users_name_idx` *must not be visible for reads* as soon as
   141     the primary key swap occurs, as it will not have enough information to
   142     lookup into the new primary key.
   143  5. The new indexes being promoted into read, the primary key swap, and the old
   144     indexes demoted from read must all occur in the *same atomic write*.
   145  
   146  To address these issues, we have prototyped two new constructs.
   147  
   148  ### Covering Indexes
   149  
   150  In order to treat some secondary indexes like primary keys at the encoding
   151  level, we introduce a notion of a `covering index`. These indexes are denoted
   152  by a new bit on the `IndexDescriptor`. If an index is covering, it is
   153  implicitly assumed to store all columns and be encoded as a primary key, even
   154  if its not marked as a primary key on `TableDescriptor`. This bit is handled
   155  in `EncodeSecondaryIndex`, where a covering index is encoded using the primary
   156  key encoding. Nothing else needs to change to accommodate this addition, as it
   157  is meant to be used only during primary key changes.
   158  
   159  ### Primary Key Swap Mutation
   160  
   161  The `TableDescriptor` struct has a field `Mutations` that is a list of
   162  `DescriptorMutation` objects. These correspond to pending mutations that need
   163  to be applied to the `TableDescriptor`. This list includes operations such as
   164  adding and dropping indexes. In order to solve the index visibility and index
   165  swap problems from above, we introduce a new `DescriptorMutation` type that
   166  represents the work that needs to get done when processing a primary key swap
   167  operation.
   168  
   169  The new mutation type specifies what index ID is the new primary index and
   170  which secondary indexes need to be swapped when the primary key is changed.
   171  When the `SchemaChangeManager` processes this mutation, it performs all the
   172  specified swaps and places index deletion mutations for the old indexes onto
   173  the `TableDescriptor`'s `Mutations` queue. This all-at-once operation changes
   174  the visibility of all the relevant indexes at once so that no one read
   175  operations are performed operations on invalid indexes.
   176  
   177  ## Interleaving
   178  
   179  To support interleaving a new primary key, the ALTER PRIMARY KEY statement will
   180  support an INTERLEAVE IN PARENT clause which behaves the same as it does on a
   181  CREATE TABLE statement. When the new primary index is written, it will be
   182  interleaved into the specified parent exactly as if it were a new table.
   183  
   184  However, the scenario where a user wants to run ALTER PRIMARY KEY on an
   185  interleaved _parent_ is trickier, because we do not support dropping an
   186  interleaved parent without CASCADE, i.e. dropping its children as well. Due to
   187  this restriction, we will initially disallow ALTER PRIMARY KEY on interleaved
   188  parents. If a user wants to alter an interleaved parent, they will need to
   189  first rewrite the children so that they are no longer interleaved into that
   190  table. If the child is a primary index, this means altering the primary key. If
   191  it is a secondary index, this means writing a new index and then dropping the
   192  old one. Once the parent index is altered, the user may want to alter the
   193  children once more to interleave them into the new parent.
   194  
   195  Admittedly this is a somewhat cumbersome process but it represents the minimum
   196  work necessary to support the interleaved use case. See the Follow-up Work
   197  section for potential improvements.
   198  
   199  ## Column Families
   200  
   201  We currently enforce that primary key columns can only be in the first column
   202  family (ID 0). However, this invariant may be violated when the primary key
   203  changes. We propose removing this restriction so that primary key columns may
   204  appear in any column family. Note that in practice this only affects secondary
   205  indexes, since the primary key columns are encoded in every column family's keys
   206  for the primary index.
   207  
   208  ## Foreign Keys
   209  
   210  Foreign keys currently have a hard dependency on an index of the table they
   211  reference. We intend to remove that dependency and instead have foreign
   212  key checks use whatever indexes are available. However, this work will not
   213  necessarily be completed for 20.1.
   214  
   215  Because of this, whenever we rewrite a secondary index while altering a primary
   216  key, we must update any foreign key references to point to the new index.
   217  Furthermore, if any foreign keys depend on the old primary index, we must update
   218  them to instead depend on the new secondary index which represents that index.
   219  Specifically, this means updating the LegacyReferencedIndex field for each
   220  relevant ForeignKeyConstraint.
   221  
   222  ## Rollbacks
   223  Viewing the primary key change process as 3 distinct stages greatly complicates
   224  the procedure for rolling back from an in-progress primary key change. However,
   225  we can view the asynchronous index cleanup as not part of the job, and consider
   226  the primary key change finished at the point of the atomic index swap. Coincidentally,
   227  that is how it appears to users as well! In this case, the only place where we have
   228  to worry about a cancellation/failure occurring is during the index building phase.
   229  If the primary key change process is requested to rollback during the index building
   230  phase, each new index can be rolled back, and the primary key swap mutation can be
   231  removed.
   232  
   233  ## Test Plan
   234  
   235  We plan to test primary key changes at several levels:
   236  
   237  ### Unit Tests
   238  Implement a series of test hooks within the process of primary key changes.
   239  These test hooks can be swapped out with waiting/notification functions
   240  during unit tests to verify properties like index visibility at different
   241  points of the primary index change process.
   242  
   243  ### Roachtests
   244  Run some long-running workloads that interact with a large table before,
   245  during, and after a primary key change.
   246  
   247  ### Logictests
   248  Implement a logictest mutation that when encountering a `CREATE TABLE`
   249  statement, alters the primary key to a new primary key with the same columns as
   250  the old primary key. This test will give us a large amount of coverage of SQL
   251  operations on the transformed table.
   252  
   253  ### Schema Changer Specific Tests
   254  Set up a variety of schema change style tests, including cancellations,
   255  rollbacks, node failures, etc.
   256  
   257  # Follow-up, Out-of-Scope Work
   258  
   259  These are intended as a starting point for future work, capturing any discussion
   260  that was had in the process of forming the above plan, but is *not* in scope at
   261  this time.
   262  
   263  ## Supporting Postgres Syntax
   264  
   265  Mainly for ORM compatibility, we should try to support the Postgres syntax for
   266  changing a primary key, where the old constraint is dropped and then the new
   267  one is added. We might accomplish this by detecting when these statements
   268  appear together in the same transaction and then proceeding with ALTER PRIMARY
   269  KEY as usual. In this case, we would not create a corresponding secondary index
   270  for the old primary key, since the user explicitly specified that it should be
   271  dropped.
   272  
   273  ## Preserving the Old Primary Index
   274  
   275  Rather than rewriting the old primary index as an equivalent secondary index,
   276  we could keep it around as a secondary index with the Covering bit set. This
   277  has two potential advantages. One is that we would rewrite less data. The other
   278  is that we could allow ALTER PRIMARY KEY on interleaved parents, because the
   279  children could remain interleaved into the old primary key. One challenge here
   280  would be keeping the old primary index up-to-date when columns are added or
   281  dropped. There would also be new complexity from allowing tables to be
   282  interleaved into non-primary indexes, which is currently not permitted.
   283  
   284  ## Supporting DROP on Interleaved Parents
   285  
   286  We could also start allowing ALTER PRIMARY KEY on interleaved parents if
   287  https://github.com/cockroachdb/cockroach/issues/8036 were completed. This is
   288  perhaps less ideal than preserving the old primary index, because the child
   289  tables would be interleaved into nothing until their primary keys were also
   290  altered.
   291  
   292  ## Prepending Columns to Indexes
   293  
   294  We anticipate that a common use case for index changes is when a customer is
   295  trying to scale their database to multiple regions. In this case, it is common
   296  for users to need to prepend something like a `region` column so that they can
   297  easily apply geo-partitioning strategies. This command syntactically could be
   298  something like `ALTER TABLE t PREFIX INDEXES BY (region)` and would involve
   299  rewriting each index in the table. This is not directly applicable to the
   300  general use case of changing the primary key, but is a step that could help the
   301  "path to global" user story.