github.com/matrixorigin/matrixone@v1.2.0/pkg/container/bytejson/README.md (about)

     1  # **JSON Types Overview**
     2  
     3  MatrixOne JSON types conforms with MySQL JSON types definition.
     4  
     5  Reference: <https://dev.mysql.com/doc/refman/8.0/en/json.html>
     6  
     7  ## **JSON Binary Format**
     8  
     9  Reference: <https://dev.mysql.com/doc/dev/mysql-server/8.0.28/json__binary_8h.html>
    10  
    11  ### **Description**
    12  
    13  JSON binary format is a binary format for storing JSON data.
    14  
    15  ### **Detail**
    16  
    17   ```
    18   JSON doc ::= type value
    19      type ::=
    20          0x01 |       // object
    21          0x03 |       // array
    22          0x04 |       // literal (true/false/null)
    23          0x09 |       // int64
    24          0x0a |       // uint64
    25          0x0b |       // float64
    26          0x0c |       // string
    27  
    28      value ::=
    29          object  |
    30          array   |
    31          literal |
    32          number  |
    33          string  |
    34  
    35      object ::= element-count size key-entry* value-entry* key* value*
    36  
    37      array ::= element-count size value-entry* value*
    38  
    39      // number of members in object or number of elements in array
    40      element-count ::= uint32
    41  
    42      // number of bytes in the binary representation of the object or array
    43      size ::= uint32
    44  
    45      key-entry ::= key-offset key-length
    46  
    47      key-offset ::= uint32
    48  
    49      key-length ::= uint16    // key length must be less than 64KB
    50  
    51      value-entry ::= type offset-or-inlined-value
    52  
    53      // This field holds either the offset to where the value is stored,
    54      // or the value itself if it is small enough to be inlined (that is,
    55      // if it is a JSON literal)
    56      offset-or-inlined-value ::= uint32
    57  
    58      key ::= utf8-data
    59  
    60      literal ::=
    61          0x00 |   // JSON null literal
    62          0x01 |   // JSON true literal
    63          0x02 |   // JSON false literal
    64  
    65      number ::=  ....    // little-endian format for [u]int64 and float64
    66  
    67      string ::= data-length utf8-data
    68  
    69      data-length ::= uint8*    // Variable size encoding, if the high bit of a byte is 1, the length
    70                                // field is continued in the next byte,
    71                                // otherwise it is the last byte of the length
    72                                // field. So we need 1 byte to represent
    73                                // lengths up to 127, 2 bytes to represent
    74                                // lengths up to 16383, and so on...
    75   ```
    76  
    77  ### **Implementations**
    78  
    79  1. All the values are wrapped in a JSON doc. A JSON doc consists of a type and a value. The type field is a single byte
    80     that indicates the type of the value. The value field is a binary representation of the value.
    81  2. The value of object type is stored as a sequence of element-count, size, key-entry list, value-entry list, key list
    82     and value list, where element-count is the number of members in the object, size is the number of bytes in the
    83     binary. The key-entry consists of key-offset and key-length, key-offset is the offset to the key in the key list, and
    84     key-length is the length of the key. The value-entry consists of type and offset-or-inlined-value, type is the type
    85     of the value, offset-or-inlined-value is the offset to the value in the value list or the value itself if it is small
    86     enough to be inlined. The key is a string and the value is a binary representation of the value type described above.
    87  3. The value of array type is stored as a sequence of element-count, size, value-entry list, value list, where
    88     element-count
    89     is the number of elements in the array, size is the number of bytes in the binary. The value-entry consists of type
    90     and
    91     offset-or-inlined-value, type is the type of the value, offset-or-inlined-value is the offset to the value in the
    92     value list
    93     or the value itself if it is small enough to be inlined. The value is a binary representation of the value type
    94     described
    95     above.
    96  4. The value of literal type is stored as a single byte that indicates the literal type.
    97  5. The value of number type is stored as a binary representation of the number whose format is little-endian.
    98  6. The value of string type is stored as a sequence of data-length and utf8-data, data-length which is stored as a
    99     variable size encoding is the length of the utf8-data, utf8-data is the utf8-encoded string.
   100  
   101  ### **Example**
   102  
   103  ```sql
   104  drop table if exists t;
   105  create table t
   106  (
   107      a json,
   108      b int
   109  );
   110  
   111  insert into t(a, b)
   112  values ('{"a": [1, "2", {"aa": "bb"}]}', 1),
   113         ('[1, 2, 3]', 2),
   114         ('null', 3),
   115         ('true', 4),
   116         ('false', 5),
   117         ('1', 6),
   118         ('1.1', 7),
   119         ('"a"', 8);
   120  
   121  select *
   122  from t;
   123  +-------------------------------------+---+
   124  | a                                   | b |
   125  +-------------------------------------+---+
   126  | {"a": [1, "2", {"aa": "bb"}]}       | 1 |
   127  | [1, 2, 3]                           | 2 |
   128  | null                                | 3 |
   129  | true                                | 4 |
   130  | false                               | 5 |
   131  | 1                                   | 6 |
   132  | 1.1                                 | 7 |
   133  | "a"                                 | 8 |
   134  +-------------------------------------+---+
   135  
   136  delete
   137  from t
   138  where b = 3;
   139  
   140  update t
   141  set a = '{"a": 1}'
   142  where b = 1;
   143  
   144  select *
   145  from t;
   146  +-------------------------------------+---+
   147  | a                                   | b |
   148  +-------------------------------------+---+
   149  | {"a": 1}                            | 1 |
   150  | [1, 2, 3]                           | 2 |
   151  | true                                | 4 |
   152  | false                               | 5 |
   153  | 1                                   | 6 |
   154  | 1.1                                 | 7 |
   155  | "a"                                 | 8 |
   156  ```
   157  
   158  ## **JSON Path Syntax**
   159  
   160  Reference: <https://dev.mysql.com/doc/refman/8.0/en/json.html#json-path-syntax>
   161  
   162  ### **Description**
   163  
   164  JSON Path is a path expression that can be used to access a value in a JSON document.
   165  
   166  ### **Detail**
   167  
   168   ```
   169   pathExpression:
   170       scope[(pathLeg)*]
   171  
   172   pathLeg:
   173       member | arrayLocation | arrayRange | doubleAsterisk
   174  
   175   member:
   176       period ( keyName | asterisk )
   177  
   178   arrayLocation:
   179       leftBracket ( nonNegativeInteger | asterisk ) rightBracket
   180   
   181   arrayRange:
   182       leftBracket (nonNegativeInteger| lastExpression) colon (nonNegativeInteger| lastExpression) rightBracket
   183       
   184   lastExpression:
   185       lastToken substract nonNegativeInteger
   186  
   187   keyName:
   188       ESIdentifier | doubleQuotedString
   189  
   190   doubleAsterisk:
   191       '**'
   192  
   193   period:
   194       '.'
   195  
   196   asterisk:
   197       '*'
   198  
   199   leftBracket:
   200       '['
   201  
   202   rightBracket:
   203       ']'
   204       
   205   colon:
   206       'to'
   207       
   208   lastToken:
   209       'last'
   210       
   211   substract:
   212       '-'
   213   ```
   214  
   215  *In MatrixOne, the scope of the path is always the document being operated on, represented as $. You can use '$' as a
   216  synonym for the document in JSON path expressions.*
   217  
   218  ### **Notes**
   219  
   220  0. The path expression must start with `$`.
   221  1. The array location is a left bracket followed by a non-negative integer or an asterisk, followed by a right bracket,
   222     if user gives a negative number, it will return an error.
   223  2. The key name is an identifier or a double-quoted string.
   224  3. `prefix**suffix` represents all paths beginning with prefix and ending with suffix. prefix is optional, while suffix
   225     is required; in other words, a path may not end in `**`.
   226  
   227  ### **Examples**
   228  
   229  ```
   230  $ -> represents the whole document
   231  $[0] -> represents second element of the array document
   232  $.* -> represents all values of the object document
   233  $.store -> represents the store object
   234  $.store.* -> represents all values of the store object
   235  $.store.book[0] -> represents the first book element in the store object
   236  $**.a -> represents values of the document keys which ends with 'a', such as 'a', 'b.a', 'c.b.a', etc.
   237  $.a**.b -> represents values of the document keys which starts with 'a' and ends with 'b', such as 'a.b', 'a.x.b', 'a.x.y.b', etc.
   238  $[last] -> represents the last element of the array document
   239  $[last-1] -> represents the second last element of the array document
   240  $[0 to 2] -> represents the first three elements of the array document
   241  $[0 to last-1] -> represents all elements of the array document except the last one
   242  $[0 to last-2] -> represents all elements of the array document except the last two
   243  $[last - 5 to last] -> represents the last five elements of the array document
   244  ```
   245  
   246  ## **JSON EXTREACT**
   247  
   248  ### **Description**
   249  
   250  json_extract is a JSON query function that can be used to query JSON documents.
   251  
   252  ### **Syntax**
   253  
   254  ```sql
   255  select json_extract(jsonDoc, pathExpression);
   256  ```
   257  
   258  *jsonDoc is the JSON document to be queried,which can be a JSON text string or a JSON column in a table.*
   259  
   260  ### **Implementation**
   261  
   262  1. The implementation is based on the JSONPath syntax.
   263  2. If the path expression is not valid, return an error.
   264  3. If the path expression is valid, return the value of the path expression. Rules for query:
   265      1. If the path expression is empty, return the query result.
   266      2. Use *restPath* to represent the *current* path expression without the front path leg.
   267      3. If the front leg is a member, query the value of the member if current doc is an object, or return null if
   268         current doc is an array.
   269      4. If the front leg is an array location, query the value of the array location if current doc is an array, or
   270         return null if current doc is an object.
   271      5. If the front leg is a double asterisk:
   272          1. First, use the *restPath* expression to query the value of the current doc.
   273          2. Second, use the *current* path expression to query the value of each sub doc in the current doc.
   274          3. Return the union of the two results.
   275      6. If the front leg is an asterisk, use the *rest* path expression to query the value of each sub doc in the current
   276         doc, and return the union of the results.
   277      7. If the result is not a single value, return the result as a JSON array.
   278  4. see `matrixone.pkg.container.bytejson.Query` for more details.
   279  
   280  ### **Examples**
   281  
   282  ```sql
   283  select json_extract('[1,2,3]', '$[*]');
   284  +-----------------------+
   285  | json_extract('[1,2,3]', '$[*]') |
   286  +-----------------------+
   287  | [1,2,3]               |
   288  +-----------------------+
   289  
   290  select json_extract('[1,2,3]', '$[0]');
   291  +------------------+
   292  | json_extract([1,2,3],$[0]) |
   293  +------------------+
   294  | 1                |
   295  +------------------+
   296  
   297  select json_extract('{"a":1,"b":2,"c":3}', '$.*');
   298  +-----------------------------+
   299  | json_extract({"a":1,"b":2,"c":3},$.*) |
   300  +-----------------------------+
   301  | [1, 2, 3]                   |
   302  +-----------------------------+
   303  
   304  select json_extract('{"a":1,"b":2,"c":3}', '$.a');
   305  +-----------------------------+
   306  | json_extract({"a":1,"b":2,"c":3},$.a) |
   307  +-----------------------------+
   308  | 1                           |
   309  +-----------------------------+
   310  
   311  select json_extract('{"a":1,"b":2,"c":3,"d":{"a":"x"}}', '$**.a');
   312  +---------------------------------------------+
   313  | json_extract({"a":1,"b":2,"c":3,"d":{"a":"x"}},$**.a) |
   314  +---------------------------------------------+
   315  | [1, "x"]                                    |
   316  +---------------------------------------------+
   317  
   318  drop table if exists t;
   319  create table t
   320  (
   321      a json
   322  );
   323  
   324  insert into t
   325  values ('{"a":1,"b":2,"c":3}');
   326  
   327  select json_extract(a, '$.a')
   328  from t;
   329  +----------------------+
   330  | json_extract(a,$.a)            |
   331  +----------------------+
   332  | 1                    |
   333  +----------------------+
   334  
   335  insert into t
   336  values ('{"a":5,"b":6,"c":7}');
   337  select json_extract(a, '$.a')
   338  from t;
   339  +----------------------+
   340  | json_extract(a,$.a)            |
   341  +----------------------+
   342  | 1                    |
   343  | 5                    |
   344  +----------------------+
   345  ```
   346  
   347