github.com/cockroachdb/cockroach@v20.2.0-alpha.1+incompatible/docs/RFCS/20160624_sql_interleaved_tables.md (about)

     1  - Feature Name: sql_interleaved_tables
     2  - Status: completed
     3  - Start Date: 2016-06-24
     4  - Authors: Daniel Harrison
     5  - RFC PR: [#7465](https://github.com/cockroachdb/cockroach/pull/7465)
     6  - Cockroach Issue: [#2972](https://github.com/cockroachdb/cockroach/issues/2972)
     7  
     8  # Summary
     9  
    10  Support interleaving the data for sql tables such that the rows alternate in the
    11  kv map. This allows a user to optimize reads on tables that are frequently
    12  joined.
    13  
    14  The following example illustrates interleaving order data next to the associated
    15  customers. This means that a join on `customer.id = order.customer_id` will be
    16  computable without remote kv lookups.
    17  
    18      Customer 1 data
    19        Customer 1 Order 1 data
    20        Customer 1 Order 2 data
    21      Customer 2 data
    22        Customer 2 Order 1 data
    23  
    24  # Motivation
    25  
    26  A join of two tables accesses different parts of the kv map that are usually on
    27  different ranges. If data from one table is frequently accessed by joining it to
    28  another table, it is advantageous for performance to keep them nearby in the kv
    29  map so they're likely to be on the same range. Additionally, secondary index
    30  data can be interleaved, keeping it close to the primary data.
    31  
    32  # Detailed design
    33  
    34  Interleaved tables will be surfaced to users by a syntax addition: `CREATE TABLE
    35  orders (customer_id INT, order_id INT, cost DECIMAL, PRIMARY KEY (customer_id,
    36  order_id)) INTERLEAVE IN PARENT customers (field_1, field_2)`.
    37  
    38  It is required that the field types of the primary key of the target (or
    39  "parent") table are a prefix of the field types of the primary key of the
    40  created (or "interleaved") table. In the interest of explicitness, the declared
    41  fields must match this common prefix.
    42  
    43  Alternately, a secondary index could be interleaved into the parent table
    44  instead. The syntax for this is `CREATE INDEX name (fields) INTERLEAVE INTO
    45  parent` or `CREATE TABLE ... (..., INDEX name (fields) INTERLEAVE INTO parent)`.
    46  Note that each of a table's indexes (primary and secondary) can be interleaved
    47  into different parent tables.
    48  
    49  It is currently not possible to change the primary index of a table after
    50  creation; similarly the interleave relationship of the primary index also cannot
    51  be changed. A secondary index's interleave can be changed by adding a new
    52  secondary index and then dropping the old one.
    53  
    54  It is frequently desirable to enforce that there is never a row in an
    55  interleaved table without a corresponding row in the parent table. This can be
    56  done with a table-level foreign key constaint, but a shorthand will also be
    57  created: `INTERLEAVE IN PARENT customers ON DELETE CASCADE|RESTRICT`. `ON DELETE
    58  CASCADE` will delete interleaved rows if a parent is deleted. `ON DELETE
    59  RESTRICT` will allow removal of parent rows with no interleaved rows, but will
    60  error if interleaved rows would be orphaned. A missing `ON DELETE` clause will
    61  fall back to table-level (or no) constraints.
    62  
    63  A new field, `repeated Interleaves InterleaveDescriptor` will be added to
    64  TableDescriptor. `InterleaveDescriptor` will contain `Parent sqlbase.TableID`
    65  with the ID of the parent table, `Index sqlbase.IndexID` with the id of the
    66  index to be interleaved, and `PrefixLen` with a count of how many fields are
    67  shared between the index and the parent primary key.
    68  
    69  The keys for a sql table represented in the kv map are
    70  
    71      /<tableID>/1/<pkCol1>/<pkCol2>/0
    72      /<tableID>/1/<pkCol1>/<pkCol2>/<familyID>/<suffix>
    73  
    74  Where `pkColN` is the key encoding of the Nth primary index column, there is one
    75  row per [family of columns](https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/20151214_sql_column_families.md),
    76  and `suffix` counts how many bytes are stripped off the end to determine where
    77  ranges can split. The 0 row is always present and acts as a sentinel for the
    78  existence of the row. The single `0` on the end is an optimization that reuses
    79  the suffix value as a `familyID`, (it's effectively the same as `/0/1`).
    80  
    81  A table `orders` interleaved into a table `customers` will be encoded as
    82  
    83      # Customer data
    84      /customers/1/<customerID>/0
    85      /customers/1/<customerID>/<familyID>/1
    86      # Order data
    87      /customers/1/<customerID>/<interleavedTableSentinel>/orders/<indexID>/<orderID>/0
    88      /customers/1/<customerID>/<interleavedTableSentinel>/orders/<indexID>/<orderID>/<familyID>/suffix
    89  
    90  Where `interleavedTableSentinel` is the byte `0xfe`, which is the same as
    91  `EncodeNotNullDescending` and so is available to use here.
    92  
    93  We currently choose `suffix` such that each row in a table is forced to stay
    94  entirely in one range. To avoid size limitations on the interleaved table, its
    95  rows are treated similarly (i.e., not forced into the same range as the parent
    96  table). In practice, they'll be on the same range most of the time.
    97  
    98  Range splits are currently along hard boundaries. It would be advantageous to
    99  introduce prefer-not-to-split boundaries where splits are allowed, but avoided
   100  if possible. These would be placed between interleaved table rows so that they
   101  stay on the same range if possible, but not at the cost of unwieldy range sizes.
   102  This is considered future work and out of scope for the initial implementation.
   103  
   104  This encoding supports interleaving multiple tables into one parent. It also
   105  supports arbitrary levels of nesting: grandparent and great-grandparent
   106  interleaves.
   107  
   108  Interleaving row data of multiple tables inherently adds complexity to scans and
   109  deletes. Deletes such as `DELETE FROM orders WHERE id < 100` currently rely on a
   110  fast path which will have to be disabled for interleaved tables, unless the
   111  table was declared as `ON DELETE CASCASE`. Fortunately, it is expected that
   112  `CASCADE` will be the common case, and so it may not be a problem in practice.
   113  If it is, a new kv operation with knowledge of sql keys can be created to delete
   114  a range of data while skipping interleaved rows.
   115  
   116  When scanning either a parent or interleaved table, the scan code will have to
   117  skip over the other table's data. This logic will initially be implemented using
   118  existing kv operations. One approach is to iterate the entire range and ignore
   119  the other table's rows. Or, when a key from the other table is enountered, a
   120  `Scan` can be constructed to skip past it. The former may overfetch many keys
   121  and the latter may involve many round-trips, but the two approaches can be
   122  combined to minimize the worst case behaviors. Additionally, with distributed
   123  SQL, this logic would normally run on the range lease holder so it would be a local
   124  operation. Finally, if performance issues do surface in practice, a new kv
   125  operation, similar to the one mentioned for deletes, can be created which scans
   126  a range of data while skipping interleaved rows.
   127  
   128  # Drawbacks
   129  
   130  - Even with the mentioned optimizations, the scan and delete-range operations
   131  will be slower for any table interleaved with another. This can be mitigated
   132  somewhat by pushing knowledge of the sql key structure down into kv, but this
   133  still won't be quite as fast and will add additional complexity to kv. However,
   134  adding the feature doesn't affect the speed of non-interleaved tables.
   135  
   136  # Alternatives
   137  
   138  - A syntax extension could be added to, when the data is known to be small,
   139  allow the user to force all rows of an interleaved table to stay with their
   140  parent rows. Cockroach avoids tuning parameters when possible, so instead the
   141  prefer-not-to-split is offered as future work.