github.com/matrixorigin/matrixone@v1.2.0/pkg/container/bytejson/README.md (about) 1 # **JSON Types Overview** 2 3 MatrixOne JSON types conforms with MySQL JSON types definition. 4 5 Reference: <https://dev.mysql.com/doc/refman/8.0/en/json.html> 6 7 ## **JSON Binary Format** 8 9 Reference: <https://dev.mysql.com/doc/dev/mysql-server/8.0.28/json__binary_8h.html> 10 11 ### **Description** 12 13 JSON binary format is a binary format for storing JSON data. 14 15 ### **Detail** 16 17 ``` 18 JSON doc ::= type value 19 type ::= 20 0x01 | // object 21 0x03 | // array 22 0x04 | // literal (true/false/null) 23 0x09 | // int64 24 0x0a | // uint64 25 0x0b | // float64 26 0x0c | // string 27 28 value ::= 29 object | 30 array | 31 literal | 32 number | 33 string | 34 35 object ::= element-count size key-entry* value-entry* key* value* 36 37 array ::= element-count size value-entry* value* 38 39 // number of members in object or number of elements in array 40 element-count ::= uint32 41 42 // number of bytes in the binary representation of the object or array 43 size ::= uint32 44 45 key-entry ::= key-offset key-length 46 47 key-offset ::= uint32 48 49 key-length ::= uint16 // key length must be less than 64KB 50 51 value-entry ::= type offset-or-inlined-value 52 53 // This field holds either the offset to where the value is stored, 54 // or the value itself if it is small enough to be inlined (that is, 55 // if it is a JSON literal) 56 offset-or-inlined-value ::= uint32 57 58 key ::= utf8-data 59 60 literal ::= 61 0x00 | // JSON null literal 62 0x01 | // JSON true literal 63 0x02 | // JSON false literal 64 65 number ::= .... // little-endian format for [u]int64 and float64 66 67 string ::= data-length utf8-data 68 69 data-length ::= uint8* // Variable size encoding, if the high bit of a byte is 1, the length 70 // field is continued in the next byte, 71 // otherwise it is the last byte of the length 72 // field. So we need 1 byte to represent 73 // lengths up to 127, 2 bytes to represent 74 // lengths up to 16383, and so on... 75 ``` 76 77 ### **Implementations** 78 79 1. All the values are wrapped in a JSON doc. A JSON doc consists of a type and a value. The type field is a single byte 80 that indicates the type of the value. The value field is a binary representation of the value. 81 2. The value of object type is stored as a sequence of element-count, size, key-entry list, value-entry list, key list 82 and value list, where element-count is the number of members in the object, size is the number of bytes in the 83 binary. The key-entry consists of key-offset and key-length, key-offset is the offset to the key in the key list, and 84 key-length is the length of the key. The value-entry consists of type and offset-or-inlined-value, type is the type 85 of the value, offset-or-inlined-value is the offset to the value in the value list or the value itself if it is small 86 enough to be inlined. The key is a string and the value is a binary representation of the value type described above. 87 3. The value of array type is stored as a sequence of element-count, size, value-entry list, value list, where 88 element-count 89 is the number of elements in the array, size is the number of bytes in the binary. The value-entry consists of type 90 and 91 offset-or-inlined-value, type is the type of the value, offset-or-inlined-value is the offset to the value in the 92 value list 93 or the value itself if it is small enough to be inlined. The value is a binary representation of the value type 94 described 95 above. 96 4. The value of literal type is stored as a single byte that indicates the literal type. 97 5. The value of number type is stored as a binary representation of the number whose format is little-endian. 98 6. The value of string type is stored as a sequence of data-length and utf8-data, data-length which is stored as a 99 variable size encoding is the length of the utf8-data, utf8-data is the utf8-encoded string. 100 101 ### **Example** 102 103 ```sql 104 drop table if exists t; 105 create table t 106 ( 107 a json, 108 b int 109 ); 110 111 insert into t(a, b) 112 values ('{"a": [1, "2", {"aa": "bb"}]}', 1), 113 ('[1, 2, 3]', 2), 114 ('null', 3), 115 ('true', 4), 116 ('false', 5), 117 ('1', 6), 118 ('1.1', 7), 119 ('"a"', 8); 120 121 select * 122 from t; 123 +-------------------------------------+---+ 124 | a | b | 125 +-------------------------------------+---+ 126 | {"a": [1, "2", {"aa": "bb"}]} | 1 | 127 | [1, 2, 3] | 2 | 128 | null | 3 | 129 | true | 4 | 130 | false | 5 | 131 | 1 | 6 | 132 | 1.1 | 7 | 133 | "a" | 8 | 134 +-------------------------------------+---+ 135 136 delete 137 from t 138 where b = 3; 139 140 update t 141 set a = '{"a": 1}' 142 where b = 1; 143 144 select * 145 from t; 146 +-------------------------------------+---+ 147 | a | b | 148 +-------------------------------------+---+ 149 | {"a": 1} | 1 | 150 | [1, 2, 3] | 2 | 151 | true | 4 | 152 | false | 5 | 153 | 1 | 6 | 154 | 1.1 | 7 | 155 | "a" | 8 | 156 ``` 157 158 ## **JSON Path Syntax** 159 160 Reference: <https://dev.mysql.com/doc/refman/8.0/en/json.html#json-path-syntax> 161 162 ### **Description** 163 164 JSON Path is a path expression that can be used to access a value in a JSON document. 165 166 ### **Detail** 167 168 ``` 169 pathExpression: 170 scope[(pathLeg)*] 171 172 pathLeg: 173 member | arrayLocation | arrayRange | doubleAsterisk 174 175 member: 176 period ( keyName | asterisk ) 177 178 arrayLocation: 179 leftBracket ( nonNegativeInteger | asterisk ) rightBracket 180 181 arrayRange: 182 leftBracket (nonNegativeInteger| lastExpression) colon (nonNegativeInteger| lastExpression) rightBracket 183 184 lastExpression: 185 lastToken substract nonNegativeInteger 186 187 keyName: 188 ESIdentifier | doubleQuotedString 189 190 doubleAsterisk: 191 '**' 192 193 period: 194 '.' 195 196 asterisk: 197 '*' 198 199 leftBracket: 200 '[' 201 202 rightBracket: 203 ']' 204 205 colon: 206 'to' 207 208 lastToken: 209 'last' 210 211 substract: 212 '-' 213 ``` 214 215 *In MatrixOne, the scope of the path is always the document being operated on, represented as $. You can use '$' as a 216 synonym for the document in JSON path expressions.* 217 218 ### **Notes** 219 220 0. The path expression must start with `$`. 221 1. The array location is a left bracket followed by a non-negative integer or an asterisk, followed by a right bracket, 222 if user gives a negative number, it will return an error. 223 2. The key name is an identifier or a double-quoted string. 224 3. `prefix**suffix` represents all paths beginning with prefix and ending with suffix. prefix is optional, while suffix 225 is required; in other words, a path may not end in `**`. 226 227 ### **Examples** 228 229 ``` 230 $ -> represents the whole document 231 $[0] -> represents second element of the array document 232 $.* -> represents all values of the object document 233 $.store -> represents the store object 234 $.store.* -> represents all values of the store object 235 $.store.book[0] -> represents the first book element in the store object 236 $**.a -> represents values of the document keys which ends with 'a', such as 'a', 'b.a', 'c.b.a', etc. 237 $.a**.b -> represents values of the document keys which starts with 'a' and ends with 'b', such as 'a.b', 'a.x.b', 'a.x.y.b', etc. 238 $[last] -> represents the last element of the array document 239 $[last-1] -> represents the second last element of the array document 240 $[0 to 2] -> represents the first three elements of the array document 241 $[0 to last-1] -> represents all elements of the array document except the last one 242 $[0 to last-2] -> represents all elements of the array document except the last two 243 $[last - 5 to last] -> represents the last five elements of the array document 244 ``` 245 246 ## **JSON EXTREACT** 247 248 ### **Description** 249 250 json_extract is a JSON query function that can be used to query JSON documents. 251 252 ### **Syntax** 253 254 ```sql 255 select json_extract(jsonDoc, pathExpression); 256 ``` 257 258 *jsonDoc is the JSON document to be queried,which can be a JSON text string or a JSON column in a table.* 259 260 ### **Implementation** 261 262 1. The implementation is based on the JSONPath syntax. 263 2. If the path expression is not valid, return an error. 264 3. If the path expression is valid, return the value of the path expression. Rules for query: 265 1. If the path expression is empty, return the query result. 266 2. Use *restPath* to represent the *current* path expression without the front path leg. 267 3. If the front leg is a member, query the value of the member if current doc is an object, or return null if 268 current doc is an array. 269 4. If the front leg is an array location, query the value of the array location if current doc is an array, or 270 return null if current doc is an object. 271 5. If the front leg is a double asterisk: 272 1. First, use the *restPath* expression to query the value of the current doc. 273 2. Second, use the *current* path expression to query the value of each sub doc in the current doc. 274 3. Return the union of the two results. 275 6. If the front leg is an asterisk, use the *rest* path expression to query the value of each sub doc in the current 276 doc, and return the union of the results. 277 7. If the result is not a single value, return the result as a JSON array. 278 4. see `matrixone.pkg.container.bytejson.Query` for more details. 279 280 ### **Examples** 281 282 ```sql 283 select json_extract('[1,2,3]', '$[*]'); 284 +-----------------------+ 285 | json_extract('[1,2,3]', '$[*]') | 286 +-----------------------+ 287 | [1,2,3] | 288 +-----------------------+ 289 290 select json_extract('[1,2,3]', '$[0]'); 291 +------------------+ 292 | json_extract([1,2,3],$[0]) | 293 +------------------+ 294 | 1 | 295 +------------------+ 296 297 select json_extract('{"a":1,"b":2,"c":3}', '$.*'); 298 +-----------------------------+ 299 | json_extract({"a":1,"b":2,"c":3},$.*) | 300 +-----------------------------+ 301 | [1, 2, 3] | 302 +-----------------------------+ 303 304 select json_extract('{"a":1,"b":2,"c":3}', '$.a'); 305 +-----------------------------+ 306 | json_extract({"a":1,"b":2,"c":3},$.a) | 307 +-----------------------------+ 308 | 1 | 309 +-----------------------------+ 310 311 select json_extract('{"a":1,"b":2,"c":3,"d":{"a":"x"}}', '$**.a'); 312 +---------------------------------------------+ 313 | json_extract({"a":1,"b":2,"c":3,"d":{"a":"x"}},$**.a) | 314 +---------------------------------------------+ 315 | [1, "x"] | 316 +---------------------------------------------+ 317 318 drop table if exists t; 319 create table t 320 ( 321 a json 322 ); 323 324 insert into t 325 values ('{"a":1,"b":2,"c":3}'); 326 327 select json_extract(a, '$.a') 328 from t; 329 +----------------------+ 330 | json_extract(a,$.a) | 331 +----------------------+ 332 | 1 | 333 +----------------------+ 334 335 insert into t 336 values ('{"a":5,"b":6,"c":7}'); 337 select json_extract(a, '$.a') 338 from t; 339 +----------------------+ 340 | json_extract(a,$.a) | 341 +----------------------+ 342 | 1 | 343 | 5 | 344 +----------------------+ 345 ``` 346 347