github.com/cockroachdb/cockroach@v20.2.0-alpha.1+incompatible/docs/RFCS/20170925_jsonb_scope.md (about)

     1  - Feature Name: jsonb_scope
     2  - Status: accepted
     3  - Start Date: 2017-09-25
     4  - Authors: Masha Schneider, Justin Jaffray
     5  - RFC PR: #18739
     6  - Cockroach Issue: [#2969](https://github.com/cockroachdb/cockroach/issues/2969)
     7  
     8  # Summary
     9  
    10  Support JSONB in CockroachDB.
    11  JSONB is a column type supported by Postgres that allows users to store and
    12  manipulate JSON documents directly in their database.
    13  It's distinguished from Postgres's plain `JSON` column type in that it stores
    14  a binary representation of the documents, rather than the raw JSON string.
    15  The high-level overview of the features we intend to include is:
    16  * JSON encoding/decoding,
    17  * a key encoding for JSON values,
    18  * a binary JSON format supporting fast field access,
    19  * support for Postgres style JSON operators and some functions,
    20  * support for inverted indexes.
    21  
    22  As a complement to this RFC, there is a
    23  [repo](https://github.com/cockroachdb/jsonb-spec) outlining the behaviour of
    24  JSON in more detail.
    25  
    26  This RFC is primarily concerned with the scope of the JSONB feature, there will
    27  be a follow-up RFC outlining the technical details of the feature, such as the
    28  precise binary encoding.
    29  
    30  # Motivation
    31  
    32  JSONB is a popular feature of PostgreSQL and one of the most requested features
    33  for CockroachDB, with the
    34  [issue](https://github.com/cockroachdb/cockroach/issues/2969) having by far
    35  the largest number of thumbs-ups of any feature request.
    36  Many users of CockroachDB would prefer to store JSON blobs in columns, either
    37  in addition to a traditional relational schema, or as a replacement of it
    38  (storing all data in a single JSON column, plus a primary key).
    39  
    40  Reasons one might want to use JSON with CockroachDB include:
    41  * Ease of storing and manipulating existing JSON data, without having to do a
    42    manual conversion to a relational schema
    43  * Storing data that the schema author has little-to-no control over in a
    44    structured, queryable way
    45  * Having access to CockroachDB's transactions and strong consistency while
    46    working with the ergonomics of a Mongo-style loose-schema database
    47  
    48  ## Design
    49  
    50  Postgres supports two data types - `JSON` and `JSONB`.
    51  Our datatype will be more similar to `JSONB` in its behaviour, and as a result we
    52  will refer to it as `JSONB`, rather than `JSON`.
    53  `JSON` in Postgres refers to JSON data that is stored as text,
    54  preserving whitespace, duplicate keys, and the ordering of keys, and only
    55  verifying that the value of that column is valid JSON. `JSONB` stores only
    56  the actual meaning of the JSON value, throwing away whitespace, duplicate
    57  keys, and key ordering, allowing for more efficient storage, and removing the
    58  need to parse the entire object to retrieve a specific field.
    59  
    60  There are two main directions we could go with the implementation of JSONB -
    61  broadly, we could split JSON objects across kv entries or put them entirely in a
    62  single kv entry.
    63  This is a material decision as it affects what can plausibly be accomplished
    64  with JSONB columns.
    65  We opt to store JSONB blobs entirely within a single kv entry, while keeping
    66  the format flexible enough that it could be forwards-compatible with an
    67  eventual multi-kv format if that becomes necessary.
    68  
    69  ## Details
    70  
    71  ### Operators
    72  
    73  We plan to support all of Postgres’s JSON manipulation and access operators
    74  (`->`, `->>`, `||`, `-`, `#-`), in addition to the ones which can be accelerated by
    75  inverted indexes (`@>`, `?`, `?|`, `?&`).
    76  
    77  The behaviour of each of these operators is described below:
    78  
    79  | Operator                  | Description                                           | Example                                                                            |
    80  | ------------------------- | ----------------------------------------------------- | ---------------------------------------------------------------------------------- |
    81  | `->`                      | Access a JSON field, returning a JSON value.          | `'{"foo":"bar"}'::JSONB->'foo'` = `'bar'::JSONB`                                   |
    82  | `->>`                     | Access a JSON field, returning a string.              | `'{"foo":"bar"}'::JSONB->>'foo'` = `'bar'::STRING`                                 |
    83  | <code>&#124;&#124;</code> | Concatenate JSON arrays, or append to a JSON array.   | <code>'[1, 2]'::JSONB &#124;&#124; '[3]'::JSONB</code> = `'[1, 2, 3]'::JSONB`      |
    84  | `-`                       | Removes a key from a JSON object or array.            | `'{"foo": "bar"}'::JSONB - 'foo'` = `'{}'::JSONB`                                  |
    85  | `#-`                      | Removes a path from a JSON object or array.           | `'{"foo": {"bar":"baz"}}'::JSONB #- '{foo,bar}'` = `'{"foo": {}}'::JSONB`          |
    86  | `@>`                      | Path-value check                                      | `'{"foo": {"baz": 3}, "bar": 2}'::JSONB @> '{"foo": {"baz": 3}}'::JSONB"` = `true` |
    87  | `?`                       | Key-contains check                                    | `'{"foo": 4, "bar": 2}'::JSONB ? 'foo'` = `true`                                   |
    88  | <code>?&#124;</code>      | Some key-contains check                               | <code>'{"foo": 4, "bar": 2}'::JSONB ?&#124; '{foo,baz}'</code> = `true`             |
    89  | `?&`                      | All key-contains check                                | `'{"foo": 4, "bar": 2}'::JSONB ?& '{foo,bar}'` = `true`                            |
    90  
    91  A more precise specification of the behaviour of these operators can be found
    92  in the [spec repo](https://github.com/cockroachdb/jsonb-spec).
    93  
    94  
    95  [The SQL/JSON spec](http://standards.iso.org/ittf/PubliclyAvailableStandards/c067367_ISO_IEC_TR_19075-6_2017.zip)
    96  specifies a sub-language (“the SQL/JSON path language”, or JSONPath) for
    97  querying data from JSON objects within SQL.
    98  JSONPath support is not in any current release of Postgres, but is slated for Postgres 11.
    99  This is out of scope for 1.2, but may be in scope for 1.3.
   100  The challenges of supporting JSONPath are primarily implementing it in a way that
   101  can rewrite JSONPath queries as SQL operators so our existing SQL machinery can
   102  deal with them efficiently.
   103  
   104  ### Indexing
   105  
   106  Postgres GIN indexing is supported on JSONB columns to speed up queries
   107  involving the `?`, `?|`, `?&`, and `@>` operators.
   108  This is an important feature for users dealing with unstructured data they
   109  don't have much control over, and we should support it.
   110  There will be a dedicated RFC for inverted indexes in CockroachDB.
   111  
   112  ### Lower Priority Features
   113  
   114  There are some nice-to-haves that are less essential for the initial version of
   115  JSONB.
   116  
   117  These include
   118  * Functions which can be implemented on the client side, `JSON_BUILD_OBJECT`
   119    and [similar functions](https://github.com/cockroachdb/jsonb-spec/blob/master/processing-functions.spec.md)
   120  * DistSQL support for JSON aggregates, `JSON_ARRAY_AGG` and `JSON_OBJECT_AGG`.
   121  * Computed indexes
   122  
   123  ## Drawbacks
   124  
   125  The `?` key-exists operator will collide with our current contextual help
   126  token `?`. We propose to resolve this by changing the contextual help token to
   127  `??`.
   128  
   129  # Unresolved questions