github.com/cockroachdb/cockroach@v20.2.0-alpha.1+incompatible/docs/RFCS/20181003_schemas.md

github.com/cockroachdb/cockroach@v20.2.0-alpha.1+incompatible/docs/RFCS/20181003_schemas.md (about)

     1  - Feature Name: User-defined SQL logical schemas
     2  - Status: superseded by #48276
     3  - Start Date: 2018-10-03
     4  - Authors: (your name here!), knz
     5  - RFC PR: #30916, originally #13319
     6  - Cockroach Issue: #26443
     7  
     8  **Remember, you can submit a PR with your RFC before the text is
     9  complete. Refer to the [README](README.md#rfc-process) for details.**
    10  
    11  # Summary
    12  
    13  This RFC proposes to introduce an additional level in the namespace
    14  structure of stored objects (tables, views, sequences), called the
    15  *schema namespace*, in between the database and objects themselves,
    16  and enable SQL client to create their own schemas side-by-side inside
    17  a database.
    18  
    19  This makes it possible to have two tables with the same name, say
    20  `orders`, in the same database, by making them live in separate
    21  schemas: for example, `mydb.testschema.orders` and
    22  `mydb.prodschema.orders`. This is useful for compatibility and to ease
    23  hosting of multiple users/apps on a single cluster.
    24  
    25  # Motivation
    26  
    27  There are two motivations:
    28  
    29  A. compatibility with PostgreSQL -- ORMs and client apps expect this to
    30     work.
    31  B. simplify the management of multi-user or multi-app clusters.
    32  
    33  The scenario for the point B goes as follows:
    34  
    35  1. suppose hosting company HostedInc, with customer TheUser.
    36  2. TheUser staff wants to deploy multiple apps over time but does not
    37     want to inform HostedInc continuously about this.
    38  3. Meanwhile HostedInc doesn't want to provide root access to TheUser,
    39     because that would cause risks to the stability of the hosted cluster.
    40  4. The question thus arises: how to enable TheUser to deploy multiple
    41     apps without providing root access? *Currently root access is
    42     required to create new databases.*
    43  
    44  The proposed solution, which incidentally is both industry-standard
    45  and already expected by users, goes as follows:
    46  
    47  1. HostedInc creates a single database for TheUser.
    48  2. HostedInc grants the "create schema" privilege to TheUser.
    49  3. TheUser creates 1 schema per app in their database, and grants
    50     finer-tuned permissions on each schema for each app.
    51  4. The app developers (or the apps themselves) create the tables/objects
    52     they need in their respective schemas.
    53  
    54  
    55  # Guide-level explanation
    56  
    57  Currently CockroachDB supports a hierarchical system of two levels to
    58  organize stored objects (tables/views/sequences):
    59  
    60  ```
    61  database
    62    |
    63    +---- object
    64  ```
    65  
    66  For example:
    67  
    68  ```
    69  bank
    70    |
    71    +---- users
    72    |
    73    +---- accounts
    74  ```
    75  
    76  To *access* an object, CockroachDB currently allows queries to specify
    77  a name with 1, 2 or 3 components:
    78  
    79  ```
    80     > select * from users;
    81     > select * from bank.users;
    82     > select * from bank.public.users;
    83  ```
    84  
    85  The 1-part form assumes that `bank` has been configured as "current
    86  database" with `SET database = bank` or `USE bank`.
    87  
    88  In the 3-part form, the middle part (`public` in this example) is
    89  currently ignored [1].
    90  
    91  This RFC proposes to change the hierarchical model for organizing
    92  stored objects as follows:
    93  
    94  ```
    95  database
    96    |
    97    +---- schema
    98            |
    99            +---- object
   100  ```
   101  
   102  For example:
   103  
   104  ```
   105  bank
   106    |
   107    +---- production
   108    |        |
   109    |        +---- users
   110    |        |
   111    |        +---- accounts
   112    |
   113    +---- testing
   114             |
   115             +---- users
   116             |
   117             +---- accounts
   118  ```
   119  
   120  With this structure, an app can access a table as follows:
   121  
   122  - using a simple 1-part name, e.g. `users`, both the "current database"
   123    (`SET database` / `USE`) and the *current schema* (`SET
   124    search_path`) decide which object is designated.
   125  - using a 3-part name, e.g. `bank.production.users` all the components
   126    are clearly specified.
   127  - using a 2-part name, we have to deal with two different cases:
   128  
   129    - PostgreSQL-compatible applications will use e.g. `production.users`, where
   130      the first part specifies the schema and the second part the object.
   131    - CockroachDB-specific applications developed pre-2.2 will use e.g.
   132      `bank.users`.
   133  	
   134    This ambiguity is *already* supported in CockroachDB since version
   135    2.0 as follows:
   136    
   137    - if the name has 2 components and the 1st component refers to a valid
   138      schema name, then it is understood to refer to that schema.
   139    - otherwise if the 1st component refers to a valid database name,
   140      then use that database and the schema name `public`.
   141    - otherwise an error is reported.
   142  
   143  [1] the explanation above fully avoids discussing virtual schemas like
   144  `pg_catalog` and `information_schema`. These are not relevant to
   145  this section.
   146  
   147  # Reference-level explanation
   148  
   149  See the previous section.
   150  
   151  In addition, here one should understand that CockroachDB internally
   152  already has a very good understanding of PostgreSQL schemas, because
   153  that understanding is needed to properly support `pg_catalog` and
   154  `information_schema` (these are schemas).
   155  
   156  So really, CockroachDB already has adequate schema support in that the
   157  name resolution rules in the SQL dialect are already suitable.
   158  
   159  The missing block is the ability to create 1) additional 2) stored 3)
   160  user-defined schemas.
   161  
   162  1. additional: currently just one stored schema is supported. It is anonymous
   163     in storage (`system.namespace` knows nothing about it) and called
   164     `public` in SQL. We want other schemas beside `public`.
   165  2. stored: the schemas must continue to exist across cluster restarts,
   166     so they cannot be in-memory data structures only.
   167  3. user-defined: client apps must be able to manage their own schemas
   168     with CREATE/ALTER/DROP SCHEMA, subject to  syntax and
   169     permission checks compatible with PostgreSQL.
   170  
   171  ## Detailed design
   172  
   173  Technically the changes impact:
   174  
   175  - the format of table descriptors: the parent of a table is now a schema not a database.
   176  
   177  - the `system.namespace` table: this must support name resolution for schemas.
   178  
   179  - cluster initialization/migration: a stored schema named `public`
   180    must be created upon cluster initialization.
   181  
   182  - the SQL/KV name resolution code (`sqlbase/resolver.go`): this must resolve other
   183    schema names besides "public" and virtual schema names.
   184  
   185  - if new descriptors are added for schemas, we must add new leasing
   186    logic for schemas too like we have for tables.
   187  
   188  - the introspection facilities in `pg_catalog`, `information_schema` and `crdb_internal`: this
   189    must iterate over all stored schemas.
   190  
   191  - Logic for GRANT/REVOKE: this must support privileges on schemas in addition to databases/tables.
   192  
   193  - permission checking for CREATE (and perhaps SHOW): this must test the create privilege on
   194    the target schema not database.
   195  
   196  - `cockroach dump`: this must be extended to support dumping a single
   197    schema inside a database. It must also be extended to properly
   198    qualify schema names inside the dump data.
   199  
   200  - BACKUP/RESTORE: this must be extended to support backing up / restoring a single schema, and
   201    naming schemas when listing target objects.
   202  
   203  - IMPORT CSV: this must be extended to recognize a schema name in the import target.
   204  
   205  - IMPORT PGDUMP: this must be extended to disable the "flattening" of
   206    the schema information contained in the dump to "public", and
   207    instead preserve/use the schema information in the dump.
   208  
   209  - EXPORT: probably no change needed, but QA must verify that users can
   210    use EXPORT for single schemas, or tables across different schemas.
   211  
   212  - the web UI which presents the database objects in a cluster and the
   213    admin RPC endpoints that support these features in the web UI. This
   214    must reveal schemas between databases and tables.
   215  
   216    Here maybe no technical changes are required (the web UI already
   217    knows about schemas, insofar it already knows about virtual
   218    schemas), but QA must verify user-defined schemas are surfaced
   219    properly.
   220  
   221  Optionally:
   222  
   223  - add zone configuration propagation/inheritance from database to
   224    schema, then from schema to table. This is optional insofar that
   225    schemas don't have data of their own, so it could be possible to
   226    inherit directly from database to table even with schemas defined.
   227  
   228    The ability to customize zones per schema would be a feature that
   229    eases the life of operators/developers.
   230  
   231    (Ben notes: “I think this is worth doing at the same time that we introduce user-defined schemas.”)
   232  
   233  ### Data structures
   234  
   235  Relevant quote:
   236  
   237  "Show me your flowcharts [code] and conceal your tables [data
   238  structures], and I shall continue to be mystified. Show me your tables
   239  [data structures], and I won’t usually need your flowcharts [code];
   240  they’ll be obvious." -- Fred Brooks, Turing Awards recipient and
   241  author of the Mythical Man-Month.
   242  
   243  Central to the new feature in this RFC is the question of how the name
   244  resolution works.
   245  
   246  ### Current mechanism
   247  
   248  Currently in CockroachDB we have the following connected components:
   249  
   250  - database descriptors, that know nothing about tables;
   251  
   252  - table descriptors, that have their own object ID, and a ParentID
   253    field that refers to the database descriptor.
   254  
   255  - the table `system.namespace` maps:
   256  
   257    - IDs to names (for both databases and tables), with an index on names.
   258      This supports the name resolution name -> ID.
   259  	
   260      `select id from system.namespace where name = <requested>`
   261  	
   262  	Note: the name resolution for databases? (TBD)
   263  
   264    - IDs to ParentIDs. This supports listing "all tables of a
   265      given database".
   266  	
   267  	`select id from system.namespace where parentID = <requested>`
   268  
   269    In `system.namespace`, the listed ParentID for a database descriptor
   270    is 0. This value is used to distinguish databases from tables and
   271    can be used to list all databases:
   272    
   273      `select id, name from system.namespace where parentID = 0`
   274  
   275  In this environment, renaming a table (even across databases) only
   276  requires rewriting entries in `system.namespace`.
   277  
   278  This RFC proposes to extend this system in one of the ways suggested
   279  in the following sub-sections, subject to discussion and
   280  experimentation.
   281  
   282  ### Alternative A: lightweight schemas
   283  
   284  In this solution, 1) schemas do not have their own *descriptors*. 2)
   285  low-level schema IDs are global across all databases.
   286  
   287  (I like the idea to avoid descriptors because that would ask complex
   288  questions about leases -- for reference, currently only table
   289  descriptors are treated transactionally. Database descriptors are not,
   290  and this was deemed acceptable because of some arbitrary perception
   291  that databases are only rarely created/dropped/renamed. When/if we
   292  implement user-defined schemas, we can't make the same assumption
   293  about schemas as we did about databases, and so we must add lease
   294  logic for schemas too. This may be complex.)
   295  
   296  - the field `ParentID` in `TableDescriptor` would then target schemas, not databases.
   297  - database descriptors would remain unchanged.
   298  - `system.namespace` does not change.
   299  - CREATE DATABASE and the migration for pre-20.1 clusters would create
   300    a `public` schema in every database and re-populate
   301    `ParentID` in every table descriptor to target its respective `public` schema.
   302  
   303  - CREATE SCHEMA would simply add a row to the `system.namespace` table.
   304  
   305  - DROP DATABASE, DROP SCHEMA would iterate over `system.namespace` accordingly.
   306  
   307  Example:
   308  
   309  ```
   310  bank
   311    |
   312    +---- production
   313    |        |
   314    |        +---- users
   315    |        |
   316    |        +---- accounts
   317    |
   318    +---- testing
   319             |
   320             +---- users
   321             |
   322             +---- accounts
   323  ```
   324  
   325  In this solution `system.namespace` contains:
   326  
   327  | parentID | name       | Id |
   328  |----------|------------|----|
   329  | 0        | bank       | 1  |
   330  | 1        | public     | 2  |
   331  | 1        | production | 3  |
   332  | 1        | testing    | 4  |
   333  | 3        | users      | 5  |
   334  | 3        | accounts   | 6  |
   335  | 4        | users      | 7  |
   336  | 4        | accounts   | 8  |
   337  
   338  The queries to access tables become:
   339  
   340  - to resolve the parent schema and database of a table:
   341  
   342    ```
   343  	 select n1.parentID as databaseID, n1.id as schemaID
   344  	   from system.namespace n1, system.namespace n2
   345      where n1.id = n2.parentID
   346  	    and n2.id = <requested>
   347    ```
   348  
   349  - to list all schemas inside a database:
   350  
   351    `select id from system.namespace where parentID = <requested>`
   352  
   353  - to list all tables inside a *schema*:
   354  
   355    `select id from system.namespace where parentID = <requested>`
   356  
   357  - to list all tables inside a *database*:
   358  
   359    ```
   360  	select n1.id
   361  	  from system.namespace n1, system.namespace n2
   362     where n1.parentID = n2.id
   363  	   and n2.parentID = <requested>
   364    ```
   365  
   366  In this solution, any additional metadata associated with a schema
   367  (for example, privileges and grants for creating new tables) would be
   368  stored in a separate system table. (to be defined by further experimentation - see also [#2939])
   369  
   370  [#2939] https://github.com/cockroachdb/cockroach/issues/2939
   371  
   372  Note that it is not necessary to verify schema-level privileges for
   373  non-DDL SQL queries, so the creation of a new separate table to hold
   374  these privileges would not add additional costs to the "hot path" of
   375  latency.
   376  
   377  ### Alternative B: schemas with descriptors
   378  
   379  Similar as above, but we create descriptors for schemas and put the privileges inside.
   380  
   381  Question then remains what to do about leasing, range IDs, etc.
   382  
   383  ### Alternative C: schemas inside database descriptors
   384  
   385  This is also similar to A but here low-level schema IDs are local
   386  within 1 database.
   387  
   388  - the table descriptor is changed to have `ParentDatabaseID` and `ParentSchemaID`.
   389  - the database descriptor would be extended to contain a list of schemas descriptors.
   390    Each of these descriptors would be identified *within* the database descriptor with
   391    an ID starting at 0.
   392  - every new database descriptor is instantiated with 1 schema descriptor within with ID 0. This will become
   393    the `public` schema thanks to the next point.
   394  - 2 new columns `schemaID` and `schemaName` are added to `system.namespace`, prefilled with default 0 and "public". These are used both:
   395  
   396    - for databases, to list all schemas in the db.
   397    - for tables, to tell which schema of its parent DB a table is contained in.
   398  
   399  - CREATE DATABASE and the migration for pre-20.1 clusters would add the 2 new columns to `system.namespace` with
   400    defaults 0 and `public`. This would ensure the `public` schema automatically exists in every DB, and that
   401    every existing table gets connected to it automatically.
   402  
   403  - CREATE SCHEMA would add a new schema descriptor inside the parent db descriptor, then add a row to the `system.namespace` table.
   404  
   405  - DROP DATABASE, DROP SCHEMA would iterate over `system.namespace` accordingly.
   406  
   407  Example:
   408  
   409  ```
   410  bank
   411    |
   412    +---- production
   413    |        |
   414    |        +---- users
   415    |        |
   416    |        +---- accounts
   417    |
   418    +---- testing
   419             |
   420             +---- users
   421             |
   422             +---- accounts
   423  ```
   424  
   425  In this solution `system.namespace` contains:
   426  
   427  | parentID | name     | Id | SchemaID | SchemaName |
   428  |----------|----------|----|----------|------------|
   429  | 0        | bank     | 1  | 0        | public     |
   430  | 0        | bank     | 1  | 1        | production |
   431  | 0        | bank     | 1  | 2        | testing    |
   432  | 1        | users    | 2  | 1        |            |
   433  | 1        | accounts | 3  | 1        |            |
   434  | 1        | users    | 4  | 2        |            |
   435  | 1        | accounts | 5  | 2        |            |
   436  
   437  - the queries to resolve names become:
   438  
   439    - to resolve the parent schema and database of a table: look up the parent db descriptor
   440      from `ParentDatabaseID`, then look up the schema inside the db descriptor from `ParentSchemaID`.
   441  	
   442  	This is faster than for alternative A above.
   443  
   444    - to list all schemas inside a database:
   445  
   446      - schema details: simply iterate over the schema descriptors inside the db descriptor
   447  	
   448  	- schema names:
   449  	
   450        `select schemaID, schemaName from system.namespace where id = <requested>`
   451  
   452    - to list all tables inside a *schema*:
   453  
   454      `select id from system.namespace where parentID = <dbId> and schemaID = <schemaId>`
   455  	
   456    - to list all tables inside a *database*:
   457    
   458      ```
   459  	select id from system.namespace where parentID = <requested>
   460      ```
   461  
   462  
   463  In this solution, any additional metadata associated with a schema
   464  (for example, privileges and grants for creating new tables) would be
   465  stored in the parent db descriptor.
   466  
   467  ### Alternative D: recycle the database descriptors
   468  
   469  - the `DatabaseDescriptor` gets a new field `ParentID` that indicates, when populated,
   470    that the desc is for a schema and the ParentID is its parent DB.
   471    The parent desc of a schema must have a ParentID of 0 (we're not proposing to support arbitrary
   472    depth in the hierarchy).
   473  - the field `ParentID` in `TableDescriptor` would  remain as-is, and will be expected
   474    to refer to a db descriptor that's also a schema descriptor.
   475  
   476    In the back-compat case where the parent dbdesc of a table is an
   477    actual db, not a schema desc (its own parentID is 0) then we'll
   478    consider the table to actually have the schema "public" as
   479    parent.
   480  
   481    (Bob says: it's interesting to preserve this property "parent of
   482    table desc is actual db not schema => parent schema is public"
   483    because it enables not breaking old version nodes for existing
   484    table/dbs when new schemas become supported, in mixed-version
   485    clusters)
   486    
   487    (knz/bob: however we found out that "public" is not special and must
   488    be droppable. If we preserve this property, "public" becomes special
   489    and cannot be dropped.)
   490  
   491  - `system.namespace` does not change.
   492  
   493  - CREATE DATABASE and the migration for pre-2.2 clusters would create
   494    a `public` schema in every database, and populate a
   495    `system.namespace` entry with `public` for every db.
   496  
   497  - CREATE SCHEMA would create a db descriptor with a suitable `ParentID` value,
   498    then populate `system.namespace` accordingly.
   499  
   500  - DROP DATABASE, DROP SCHEMA would iterate over `system.namespace` accordingly.
   501  
   502  Example:
   503  
   504  ```
   505  bank
   506    |
   507    +---- production
   508    |        |
   509    |        +---- users
   510    |        |
   511    |        +---- accounts
   512    |
   513    +---- testing
   514             |
   515             +---- users
   516             |
   517             +---- accounts
   518  ```
   519  
   520  In this solution `system.namespace` contains:
   521  
   522  | parentID | name       | Id |
   523  |----------|------------|----|
   524  | 0        | bank       | 1  |
   525  | 1        | public     | 2  |
   526  | 1        | production | 3  |
   527  | 1        | testing    | 4  |
   528  | 3        | users      | 5  |
   529  | 3        | accounts   | 6  |
   530  | 4        | users      | 7  |
   531  | 4        | accounts   | 8  |
   532  
   533  The queries to access tables become:
   534  
   535  - to resolve the parent schema and database of a table:
   536  
   537    ```
   538  	 select n1.parentID as databaseID, n1.id as schemaID
   539  	   from system.namespace n1, system.namespace n2
   540      where n1.id = n2.parentID
   541  	    and n2.id = <requested>
   542    ```
   543  
   544  - to list all schemas inside a database:
   545  
   546    `select id from system.namespace where parentID = <requested>`
   547  
   548  - to list all tables inside a *schema*:
   549  
   550    `select id from system.namespace where parentID = <requested>`
   551  	
   552  - to list all tables inside a *database*:
   553  
   554    ```
   555  	select n1.id
   556  	  from system.namespace n1, system.namespace n2
   557     where n1.parentID = n2.id
   558  	   and n2.parentID = <requested>
   559    ```
   560  
   561  In this solution, any additional metadata associated with a schema
   562  (for example, privileges and grants for creating new tables) would be
   563  stored in the (db) desc, just like regular databases.
   564  
   565  Note: after alternative D is implemented, in a mixed-version cluster,
   566  after `CREATE SCHEMA db.foo; CREATE TABLE db.foo.t(...); CREATE TABLE
   567  db.public.u(...)` is run in a 2.2 node, a 19.2 node that runs `SHOW
   568  TABLES` (or look at `crdb_internal.tables`) will see:
   569  
   570  | table name | database name |
   571  |------------|---------------|
   572  | `u`        | `db`          |
   573  | `t`        | `foo`         |
   574  
   575  Is that a problem? - suspicious because we can also have `CREATE
   576  DATABASE foo; CREATE TABLE foo.public.t(...)` and that will cause a
   577  duplicate row in crdb_internal.tables (and hence SHOW TABLES).
   578  
   579  Bob's reaction: maybe we can gate CREATE SCHEMA/DROP SCHEMA upon
   580  bumping the cluster version.
   581  
   582  ### Test scenario proposal
   583  
   584  We propose to implement a validation test to evaluate the various
   585  strategies, that runs the following SQL scenario at different stages
   586  of a cluster lifecycle:
   587  
   588  ```sql
   589  -- 19.2 cluster
   590  
   591  CREATE SCHEMA; -- error "unsupported"
   592  DROP SCHEMA; -- ditto
   593  
   594  -- 19.2+20.1 cluster, cluster version still 19.2
   595  
   596  CREATE SCHEMA; -- error "disabled until version update"
   597  DROP SCHEMA; -- ditto
   598  
   599  CREATE DATABASE d21; CREATE TABLE d21.public.t21(x int); -- in node 19.2
   600  CREATE DATABASE d22; CREATE TABLE d22.public.t22(x int); -- in node 20.1
   601  
   602  CREATE TABLE d21.public.t22(x int); -- in node 20.1
   603  CREATE TABLE d22.public.t21(x int); -- in node 19.2
   604  
   605  ALTER DATABASE d21 RENAME TO d21_new; -- in node 19.2
   606  ALTER DATABASE d22 RENAME TO d22_new; -- in node 19.2
   607  
   608  SHOW DATABASES; -- in node 19.2, verify all is well;
   609  SHOW DATABASES; -- in node 20.1, verify all is well;
   610  
   611  SHOW SCHEMAS FROM d21_new; -- in node 19.2, verify all is well;
   612  SHOW SCHEMAS FROM d22_new; -- in node 19.2, verify all is well;
   613  SHOW SCHEMAS FROM d21_new; -- in node 20.1, verify all is well;
   614  SHOW SCHEMAS FROM d22_new; -- in node 20.1, verify all is well;
   615  
   616  SHOW TABLES FROM d21_new; -- in node 19.2, verify all is well
   617  SHOW TABLES FROM d21_new; -- in node 20.1, verify all is well
   618  SHOW TABLES FROM d22_new; -- in node 19.2, verify all is well
   619  SHOW TABLES FROM d22_new; -- in node 20.1, verify all is well
   620  
   621  DROP DATABASE d21 CASCADE; -- in node 19.2, verify no leftovers in system.namespace?
   622  DROP DATABASE d22 CASCADE; -- in node 19.2, verify no leftovers in system.namespace?
   623  
   624  -- FIXME: some GRANT tests here.
   625  
   626  -- 20.1 cluster with version bumped
   627  
   628  CREATE SCHEMA; -- OK
   629  DROP SCHEMA; -- ditto
   630  
   631  -- take the tests from above and verify the behavior is still OK
   632  ```
   633  
   634  ## Drawbacks
   635  
   636  Why should we *not* do this?
   637  
   638  TBD
   639  
   640  ## Rationale and Alternatives
   641  
   642  Arguments from Ben about the choice between A, B, C:
   643  
   644  > I prefer option B.
   645  >
   646  > I dislike option A because it creates such a difference between
   647  > databases (which have descriptors) and schemas (which have some other
   648  > to-be-determined place to store the things that are currently in the
   649  > DB descriptor). Option A would be more appealing to me if we planned
   650  > to move away from database descriptors too and make them use the same
   651  > system as for schemas.
   652  >
   653  > Option C is acceptable, but it feels hackier to me than option A or
   654  > B. I think the concerns about database descriptor leases are more
   655  > severe for option C (for option B, operations on schemas would not be
   656  > substantially more common or more complex than operations on databases
   657  > are today, so it doesn't make things much worse, but in option C the
   658  > kinds of changes we'd make to database descriptors become more
   659  > complex). Vivek's schema leases should take care of this no matter
   660  > which alternative we choose here, though.
   661  
   662  Discussion with Bob:
   663  
   664  - Q: can we not just store the schema of a table as an additional field in the table descriptor?
   665  
   666    A: yes that is option A.
   667  
   668  - Q: can we reuse the DatabaseDescriptor as a schema descriptor, with
   669    perhaps some additional field that marks it as a "Schema" (and not a
   670    db) and indicates its db parent?
   671  
   672    A: this would be a new alternative (D) - however today (CRDB 19.2) we
   673    do not have proper transactional semantics on database DDL; clients
   674    may be ok with lack of txn sems on DBs but the word is that they
   675    care much more about txn sems on schema DDL. If we do option D,
   676    we'll need to introduce txn semantics on db descs.
   677  
   678    A: alternative D raises an interesting question about the status of
   679    schema "public" in mixed-version clusters. What happens when 20.1
   680    nodes are started with some 19.2 nodes still active?
   681  
   682    - with the "base" version of alt D, existing/non-modified tb descs
   683      refer to an actual db desc as parent, and this is interpreted as
   684      "public" however such a public schema cannot be edited (eg
   685      dropped).
   686  	
   687  	Q: Is that a problem? When can the restriction be lifted?
   688  	
   689      A: presumably when the cluster version is bumped.
   690  	
   691  	Q: can other (non-"public") schemas be used in the meantime?
   692  	
   693  	A: not really, see remarks at end of section for alt D.
   694  
   695  - Q: does any of the alternatives support actually sane mixed-version
   696    behavior?
   697    
   698    A: not definitive answer yet?
   699    
   700    - Discussion above suggests not in alt D. What about the others?
   701    
   702    - With alt B (separate descriptors for schemas) it _appears_ that the
   703      19.2 code would "skip over" descriptor types that it does not know
   704      about (the type casts from Descriptor to Table/DatabaseDescriptor
   705      are conditional). However we need to audit all such casts not just
   706      those used for `SHOW` / `crdb_internal` / `pg_catalog` /
   707      `information_schema`.
   708  
   709    - With alt A presumably the schemas would be entirely invisible to
   710      the 19.2 nodes. However we need to ensure that any schema
   711      information populated by 20.1 nodes do not get destroyed by
   712      subsequent DDL in 19.2 nodes. (Would the new tb desc `ParentID`
   713      field be overwritten by DDL in 19.2?)
   714  	
   715    - With alt C fundamentally the same quesitons as A, in addition need
   716      to check that db DDL by 19.2 nodes do not destroy the schema data
   717      in db descs edited by 20.1 nodes.
   718  
   719    Alts A and C raise uncomfortable questions because the corresponding
   720    investigation needed to establish confidence about mixed-ver compat
   721    is difficult. The investigation for alt B is also somewhat
   722    difficult.
   723    
   724    Arguably, in order to keep the problem tractable, out of
   725    pragmaticism, we may choose to simply disallow schema DDL until
   726    cluster version is bumped, in which case we can skip concerns about
   727    mixed-ver compat from the decision making altogether.
   728    
   729  - Q: suppose we restrict access to schema DDL until cluster version is
   730    bumped.  What of 20.1 nodes in a mixed-ver cluster, before the
   731    cluster version is bumped? Do we need two code paths and a switch on
   732    the cluster version?
   733    
   734    A: depends.
   735    
   736    - alt C: no, because of reasonable defaults
   737    
   738    - alt D: completely fine because of reasonable defaults
   739  
   740    - alt A: we need a migration when the first 20.1 node starts, to add
   741      the "public" schema entries in system.namespace. Then the code
   742      will work because of sane defaults (schema ID for "public" is 0).
   743  	
   744  	Q: what happens with CREATE DATABASE in 19.2 nodes after the 20.1
   745      node starts? These will not get their "public" schema entry in
   746      system.namespace!
   747  	
   748  	A: TBD
   749  	
   750    - alt B: same questions as alt A
   751  
   752  - Q: what of the impl of alt D, in the intermediate situation (20.1
   753    node active, cluster version not bumped yet)?
   754  
   755    A: combination of multiple things:
   756    
   757    - CREATE SCHEMA/DROP SCHEMA is disabled (until ver is bumped)
   758    - SHOW TABLES must work; this is possible by an additional conditional
   759      (described in alt D section) that says if the parent of a tb is a "real" db, interpret this
   760  	as schema "public" instead.
   761    - GRANT of schema-level permissions are still disabled at that point, because
   762      we can't ensure that we have a valid desc for "public" (for compat with 19.2 nodes)
   763    - migration upon bumping the cluster ver, to fix up all the table descs
   764      to refer to an actual "public" schema desc instead.
   765    - after that migration has ran, schema DDL and GRANT on schemas become available.
   766  
   767  
   768  
   769  ## Unresolved questions
   770  
   771  - Which alternative design above will be selected.
   772  - Whether or not schemas should have their own zone config.