k8s.io/kube-openapi@v0.0.0-20240228011516-70dd3763d340/pkg/internal/third_party/go-json-experiment/json/README.md (about)

     1  # JSON Serialization (v2)
     2  
     3  [![GoDev](https://img.shields.io/static/v1?label=godev&message=reference&color=00add8)](https://pkg.go.dev/github.com/go-json-experiment/json)
     4  [![Build Status](https://github.com/go-json-experiment/json/actions/workflows/test.yml/badge.svg?branch=master)](https://github.com/go-json-experiment/json/actions)
     5  
     6  This module hosts an experimental implementation of v2 `encoding/json`.
     7  The API is unstable and breaking changes will regularly be made.
     8  Do not depend on this in publicly available modules.
     9  
    10  ## Goals and objectives
    11  
    12  * **Mostly backwards compatible:** If possible, v2 should aim to be _mostly_
    13  compatible with v1 in terms of both API and default behavior to ease migration.
    14  For example, the `Marshal` and `Unmarshal` functions are the most widely used
    15  declarations in the v1 package. It seems sensible for equivalent functionality
    16  in v2 to be named the same and have the same signature.
    17  Behaviorally, we should aim for 95% to 99% backwards compatibility.
    18  We do not aim for 100% compatibility since we want the freedom to break
    19  certain behaviors that are now considered to have been a mistake.
    20  We may provide options that can bring the v2 implementation to 100% compatibility,
    21  but it will not be the default.
    22  
    23  * **More flexible:** There is a
    24  [long list of feature requests](https://github.com/golang/go/issues?q=is%3Aissue+is%3Aopen+encoding%2Fjson+in%3Atitle).
    25  We should aim to provide the most flexible features that address most usages.
    26  We do not want to overfit the v2 API to handle every possible use case.
    27  Ideally, the features provided should be orthogonal in nature such that
    28  any combination of features results in as few surprising edge cases as possible.
    29  
    30  * **More performant:** JSON serialization is widely used, so any
    31  performance gain will be greatly appreciated. Some rarely used behaviors of v1
    32  may be dropped in favor of better performance. For example,
    33  despite `Encoder` and `Decoder` operating on an `io.Writer` and `io.Reader`,
    34  they do not operate in a truly streaming manner,
    35  leading to a loss in performance. The v2 implementation should aim to be truly
    36  streaming by default (see [#33714](https://golang.org/issue/33714)).
    37  
    38  * **Easy to use (hard to misuse):** The v2 API should aim to make
    39  the common case easy and the less common case at least possible.
    40  The API should avoid behavior that goes contrary to user expectation,
    41  which may result in subtle bugs (see [#36225](https://golang.org/issue/36225)).
    42  
    43  * **v1 and v2 maintainability:** Since the v1 implementation must stay forever,
    44  it would be beneficial if v1 could be implemented under the hood with v2,
    45  allowing for less maintenance burden in the future. This probably implies that
    46  behavioral changes in v2 relative to v1 need to be exposed as options.
    47  
    48  * **Avoid unsafe:** Standard library packages generally avoid the use of
    49  package `unsafe` even if it could provide a performance boost.
    50  We aim to preserve this property.
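
The streaming and misuse concerns listed above can be observed with the v1 standard library alone. The following is a minimal sketch, not part of this module: `countingWriter`, `encodeWrites`, and `decodeIgnoresTrailing` are hypothetical helper names, and the single-write pattern is an observation about the current v1 implementation rather than a documented guarantee.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// countingWriter records how many times Write is called and how many bytes arrive.
type countingWriter struct {
	calls int
	bytes int
}

func (w *countingWriter) Write(p []byte) (int, error) {
	w.calls++
	w.bytes += len(p)
	return len(p), nil
}

// encodeWrites encodes a slice of n zeros with the v1 Encoder and reports
// how many Write calls and bytes reached the underlying io.Writer.
func encodeWrites(n int) (calls, size int, err error) {
	w := &countingWriter{}
	err = json.NewEncoder(w).Encode(make([]int, n))
	return w.calls, w.bytes, err
}

// decodeIgnoresTrailing shows that v1 Decoder.Decode returns after the first
// JSON value without inspecting the input that follows it.
func decodeIgnoresTrailing() (map[string]int, error) {
	var v map[string]int
	err := json.NewDecoder(strings.NewReader(`{"a":1} trailing garbage`)).Decode(&v)
	return v, err
}

func main() {
	calls, size, err := encodeWrites(100000)
	fmt.Println("writes:", calls, "bytes:", size, "err:", err)

	v, err := decodeIgnoresTrailing()
	fmt.Println("decoded:", v, "err:", err)
}
```

With v1, the whole value is buffered before a single `Write`, and the trailing garbage goes unreported. Both are the kind of behavior the goals above aim to improve.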
    51  
    52  ## Expectations
    53  
    54  While this module aims to possibly be the v2 implementation of `encoding/json`,
    55  there is no guarantee that this outcome will occur. As with any major change
    56  to the Go standard library, this will eventually go through the
    57  [Go proposal process](https://github.com/golang/proposal#readme).
    58  At the present moment, this is still in the design and experimentation phase
    59  and is not ready for a formal proposal.
    60  
    61  There are several possible outcomes from this experiment:
    62  1. We determine that a v2 `encoding/json` would not provide sufficient benefit
    63  over the existing v1 `encoding/json` package. Thus, we abandon this effort.
    64  2. We propose a v2 `encoding/json` design, but it is rejected in favor of some
    65  other design that is considered superior.
    66  3. We propose a v2 `encoding/json` design, but rather than adding an entirely
    67  new v2 `encoding/json` package, we decide to merge its functionality into
    68  the existing v1 `encoding/json` package.
    69  4. We propose a v2 `encoding/json` design and it is accepted, resulting in
    70  its addition to the standard library.
    71  5. Some other unforeseen outcome (among the infinite number of possibilities).
    72  
    73  ## Development
    74  
    75  This module is primarily developed by
    76  [@dsnet](https://github.com/dsnet),
    77  [@mvdan](https://github.com/mvdan), and
    78  [@johanbrandhorst](https://github.com/johanbrandhorst)
    79  with feedback provided by
    80  [@rogpeppe](https://github.com/rogpeppe),
    81  [@ChrisHines](https://github.com/ChrisHines), and
    82  [@rsc](https://github.com/rsc).
    83  
    84  Discussions about semantics occur semi-regularly, and a
    85  [record of past meetings can be found here](https://docs.google.com/document/d/1rovrOTd-wTawGMPPlPuKhwXaYBg9VszTXR9AQQL5LfI/edit?usp=sharing).
    86  
    87  ## Design overview
    88  
    89  This package aims to provide a clean separation between syntax and semantics.
    90  Syntax deals with the structural representation of JSON (as specified in
    91  [RFC 4627](https://tools.ietf.org/html/rfc4627),
    92  [RFC 7159](https://tools.ietf.org/html/rfc7159),
    93  [RFC 7493](https://tools.ietf.org/html/rfc7493),
    94  [RFC 8259](https://tools.ietf.org/html/rfc8259), and
    95  [RFC 8785](https://tools.ietf.org/html/rfc8785)).
    96  Semantics deals with the meaning of syntactic data as usable application data.
    97  
    98  The `Encoder` and `Decoder` types are streaming tokenizers concerned with the
    99  packing or parsing of JSON data. They operate on `Token` and `RawValue` types
   100  which represent the common data structures that are representable in JSON.
   101  `Encoder` and `Decoder` do not aim to provide any interpretation of the data.
   102  
   103  Functions like `Marshal`, `MarshalFull`, `MarshalNext`, `Unmarshal`,
   104  `UnmarshalFull`, and `UnmarshalNext` provide semantic meaning by correlating
   105  any arbitrary Go type with some JSON representation of that type (as stored in
   106  data types like `[]byte`, `io.Writer`, `io.Reader`, `Encoder`, or `Decoder`).
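
The split between syntax and semantics can be approximated with the v1 standard library, where `Decoder.Token` exposes the syntactic stream and `Unmarshal` assigns meaning to it. This sketch deliberately uses only v1 `encoding/json`, since the v2 API is unstable; `tokenKinds`, `decodeItem`, and `Item` are hypothetical names chosen for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

const input = `{"name":"gopher","tags":["a","b"]}`

// tokenKinds walks the input purely syntactically and describes each token
// without attaching any application meaning to it.
func tokenKinds(s string) ([]string, error) {
	dec := json.NewDecoder(strings.NewReader(s))
	var out []string
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		out = append(out, fmt.Sprintf("%T(%v)", tok, tok))
	}
}

// Item is a hypothetical application type that gives the same bytes meaning.
type Item struct {
	Name string   `json:"name"`
	Tags []string `json:"tags"`
}

// decodeItem interprets the JSON text as an Item (the semantic layer).
func decodeItem(s string) (Item, error) {
	var it Item
	err := json.Unmarshal([]byte(s), &it)
	return it, err
}

func main() {
	toks, err := tokenKinds(input)
	fmt.Println(toks, err)

	it, err := decodeItem(input)
	fmt.Printf("%+v %v\n", it, err)
}
```

The token walk sees only delimiters and literals, while `decodeItem` maps the same input onto Go fields.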
   107  
   108  ![API overview](api.png)
   109  
   110  This diagram provides a high-level overview of the v2 `json` package.
   111  Purple blocks represent types, while blue blocks represent functions or methods.
   112  The arrows and their direction represent the approximate flow of data.
   113  The bottom half of the diagram contains functionality that is only concerned
   114  with syntax, while the upper half contains functionality that assigns
   115  semantic meaning to syntactic data handled by the bottom half.
   116  
   117  In contrast to v1 `encoding/json`, options are represented as separate types
   118  rather than being setter methods on the `Encoder` or `Decoder` types.
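
For contrast, this is how v1 configures behavior through setter methods on `Encoder` and `Decoder`. The sketch uses only v1 `encoding/json` calls (`SetEscapeHTML`, `DisallowUnknownFields`); the helper names `encodeHTML` and `decodeStrict` are hypothetical, and no v2 option types are shown because that API is still changing.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strings"
)

// encodeHTML encodes s with the v1 Encoder, toggling HTML escaping through
// the SetEscapeHTML setter method on the Encoder itself.
func encodeHTML(s string, escape bool) (string, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(escape)
	err := enc.Encode(s)
	return strings.TrimSpace(buf.String()), err
}

// decodeStrict decodes into a struct and rejects unknown object members
// by calling the DisallowUnknownFields setter on the Decoder.
func decodeStrict(s string) error {
	var v struct{ A int }
	dec := json.NewDecoder(strings.NewReader(s))
	dec.DisallowUnknownFields()
	return dec.Decode(&v)
}

func main() {
	escaped, _ := encodeHTML("<b>", true)
	plain, _ := encodeHTML("<b>", false)
	fmt.Println(escaped, plain)

	fmt.Println(decodeStrict(`{"A":1,"B":2}`))
}
```

In v1, the configuration lives on the stateful `Encoder` or `Decoder` value, which is the coupling that separate option types avoid.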
   119  
   120  ## Behavior changes
   121  
   122  The v2 `json` package changes the default behavior of `Marshal` and `Unmarshal`
   123  relative to the v1 `json` package to be more sensible.
   124  Some of these behavior changes have options and workarounds to opt into
   125  behavior similar to what v1 provided.
   126  
   127  This table shows an overview of the changes:
   128  
   129  | v1 | v2 | Details |
   130  | -- | -- | ------- |
   131  | JSON object members are unmarshaled into a Go struct using a **case-insensitive name match**. | JSON object members are unmarshaled into a Go struct using a **case-sensitive name match**. | [CaseSensitivity](/diff_test.go#:~:text=TestCaseSensitivity) |
   132  | When marshaling a Go struct, a struct field marked as `omitempty` is omitted if **the field value is an empty Go value**, which is defined as false, 0, a nil pointer, a nil interface value, and any empty array, slice, map, or string. | When marshaling a Go struct, a struct field marked as `omitempty` is omitted if **the field value would encode as an empty JSON value**, which is defined as a JSON null, or an empty JSON string, object, or array. | [OmitEmptyOption](/diff_test.go#:~:text=TestOmitEmptyOption) |
   133  | The `string` option **does affect** Go bools. | The `string` option **does not affect** Go bools. | [StringOption](/diff_test.go#:~:text=TestStringOption) |
   134  | The `string` option **does not recursively affect** sub-values of the Go field value. | The `string` option **does recursively affect** sub-values of the Go field value. | [StringOption](/diff_test.go#:~:text=TestStringOption) |
   135  | The `string` option **sometimes accepts** a JSON null escaped within a JSON string. | The `string` option **never accepts** a JSON null escaped within a JSON string. | [StringOption](/diff_test.go#:~:text=TestStringOption) |
   136  | A nil Go slice is marshaled as a **JSON null**. | A nil Go slice is marshaled as an **empty JSON array**. | [NilSlicesAndMaps](/diff_test.go#:~:text=TestNilSlicesAndMaps) |
   137  | A nil Go map is marshaled as a **JSON null**. | A nil Go map is marshaled as an **empty JSON object**. | [NilSlicesAndMaps](/diff_test.go#:~:text=TestNilSlicesAndMaps) |
   138  | A Go array may be unmarshaled from a **JSON array of any length**. | A Go array must be unmarshaled from a **JSON array of the same length**. | [Arrays](/diff_test.go#:~:text=Arrays) |
   139  | A Go byte array is represented as a **JSON array of JSON numbers**. | A Go byte array is represented as a **Base64-encoded JSON string**. | [ByteArrays](/diff_test.go#:~:text=TestByteArrays) |
   140  | `MarshalJSON` and `UnmarshalJSON` methods declared on a pointer receiver are **inconsistently called**. | `MarshalJSON` and `UnmarshalJSON` methods declared on a pointer receiver are **consistently called**. | [PointerReceiver](/diff_test.go#:~:text=TestPointerReceiver) |
   141  | A Go map is marshaled in a **deterministic order**. | A Go map is marshaled in a **non-deterministic order**. | [MapDeterminism](/diff_test.go#:~:text=TestMapDeterminism) |
   142  | JSON strings are encoded **with HTML-specific characters being escaped**. | JSON strings are encoded **without any characters being escaped** (unless necessary). | [EscapeHTML](/diff_test.go#:~:text=TestEscapeHTML) |
   143  | When marshaling, invalid UTF-8 within a Go string **is silently replaced**. | When marshaling, invalid UTF-8 within a Go string **results in an error**. | [InvalidUTF8](/diff_test.go#:~:text=TestInvalidUTF8) |
   144  | When unmarshaling, invalid UTF-8 within a JSON string **is silently replaced**. | When unmarshaling, invalid UTF-8 within a JSON string **results in an error**. | [InvalidUTF8](/diff_test.go#:~:text=TestInvalidUTF8) |
   145  | When marshaling, **an error does not occur** if the output JSON value contains objects with duplicate names. | When marshaling, **an error does occur** if the output JSON value contains objects with duplicate names. | [DuplicateNames](/diff_test.go#:~:text=TestDuplicateNames) |
   146  | When unmarshaling, **an error does not occur** if the input JSON value contains objects with duplicate names. | When unmarshaling, **an error does occur** if the input JSON value contains objects with duplicate names. | [DuplicateNames](/diff_test.go#:~:text=TestDuplicateNames) |
   147  | Unmarshaling a JSON null into a non-empty Go value **inconsistently clears the value or does nothing**. | Unmarshaling a JSON null into a non-empty Go value **always clears the value**. | [MergeNull](/diff_test.go#:~:text=TestMergeNull) |
   148  | Unmarshaling a JSON value into a non-empty Go value **follows inconsistent and bizarre behavior**. | Unmarshaling a JSON value into a non-empty Go value **always merges if the input is an object, and otherwise replaces**.  | [MergeComposite](/diff_test.go#:~:text=TestMergeComposite) |
   149  | A `time.Duration` is represented as a **JSON number containing the decimal number of nanoseconds**. | A `time.Duration` is represented as a **JSON string containing the formatted duration (e.g., "1h2m3.456s")**. | [TimeDurations](/diff_test.go#:~:text=TestTimeDurations) |
   150  | Unmarshaling a JSON number into a Go float beyond its representation **results in an error**. | Unmarshaling a JSON number into a Go float beyond its representation **uses the closest representable value (e.g., ±`math.MaxFloat`)**. | [MaxFloats](/diff_test.go#:~:text=TestMaxFloats) |
   151  | A Go struct with only unexported fields **can be serialized**. | A Go struct with only unexported fields **cannot be serialized**. | [EmptyStructs](/diff_test.go#:~:text=TestEmptyStructs) |
   152  | A Go struct that embeds an unexported struct type **can sometimes be serialized**. | A Go struct that embeds an unexported struct type **cannot be serialized**. | [EmbedUnexported](/diff_test.go#:~:text=TestEmbedUnexported) |
   153  
   154  See [diff_test.go](/diff_test.go) for details about every change.
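
Several of the v1 behaviors in the table can be reproduced with the standard library. The sketch below shows only the v1 column, using v1 `encoding/json`; the v2 column is not exercised here, and the helper names (`marshal`, `v1Marshal`, `caseInsensitive`, `duplicateNames`, `shortArray`) are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// marshal returns the v1 encoding of v as a string.
func marshal(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

// v1Marshal collects v1 marshal results for several rows of the table.
func v1Marshal() map[string]string {
	return map[string]string{
		"nil slice":     marshal([]int(nil)),
		"nil map":       marshal(map[string]int(nil)),
		"byte array":    marshal([2]byte{1, 2}),
		"duration":      marshal(90 * time.Second),
		"html":          marshal("<a>"),
		"invalid utf-8": marshal("\xff"),
	}
}

// caseInsensitive unmarshals the member "NAME" into a field tagged "name".
func caseInsensitive() string {
	var v struct {
		Name string `json:"name"`
	}
	_ = json.Unmarshal([]byte(`{"NAME":"x"}`), &v)
	return v.Name
}

// duplicateNames unmarshals an object that repeats the name "A".
func duplicateNames() (int, error) {
	var v struct{ A int }
	err := json.Unmarshal([]byte(`{"A":1,"A":2}`), &v)
	return v.A, err
}

// shortArray unmarshals a two-element JSON array into a Go array of length three.
func shortArray() ([3]int, error) {
	var a [3]int
	err := json.Unmarshal([]byte(`[1,2]`), &a)
	return a, err
}

func main() {
	for name, out := range v1Marshal() {
		fmt.Printf("%-14s %s\n", name, out)
	}
	fmt.Println(caseInsensitive())
	fmt.Println(duplicateNames())
	fmt.Println(shortArray())
}
```

Under v1, the nil slice and nil map encode as `null`, the byte array as a JSON array of numbers, and the duration as a count of nanoseconds. The mismatched name case, the duplicate name, and the short array are all accepted without error. Per the table, v2 changes each of these defaults.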
   155  
   156  ## Performance
   157  
   158  One of the goals of the v2 module is to be more performant than v1.
   159  
   160  Each of the charts below shows the performance across
   161  several different JSON implementations:
   162  
   163  * `JSONv1` is `encoding/json` at `v1.18.2`
   164  * `JSONv2` is `github.com/go-json-experiment/json` at `v0.0.0-20220524042235-dd8be80fc4a7`
   165  * `JSONIterator` is `github.com/json-iterator/go` at `v1.1.12`
   166  * `SegmentJSON` is `github.com/segmentio/encoding/json` at `v0.3.5`
   167  * `GoJSON` is `github.com/goccy/go-json` at `v0.9.7`
   168  * `SonicJSON` is `github.com/bytedance/sonic` at `v1.3.0`
   169  
   170  Benchmarks were run across various datasets:
   171  
   172  * `CanadaGeometry` is a GeoJSON (RFC 7946) representation of Canada.
   173    It contains many JSON arrays of arrays of two-element arrays of numbers.
   174  * `CITMCatalog` contains many JSON objects using numeric names.
   175  * `SyntheaFHIR` is sample JSON data from the healthcare industry.
   176    It contains many nested JSON objects with mostly string values,
   177    where the set of unique string values is relatively small.
   178  * `TwitterStatus` is the JSON response from the Twitter API.
   179    It contains a mix of all different JSON kinds, where string values
   180    are a mix of both single-byte ASCII and multi-byte Unicode.
   181  * `GolangSource` is a simple tree representing the Go source code.
   182    It contains many nested JSON objects, each with the same schema.
   183  * `StringUnicode` contains many strings with multi-byte Unicode runes.
   184  
   185  All of the implementations other than `JSONv1` and `JSONv2` make
   186  extensive use of `unsafe`. As such, we expect those to generally be faster,
   187  but at the cost of memory and type safety. `SonicJSON` goes a step further
   188  and uses just-in-time compilation to generate machine code specialized
   189  for the Go type being marshaled or unmarshaled.
   190  Also, `SonicJSON` does not validate JSON strings for valid UTF-8,
   191  and so gains a notable performance boost on datasets with multi-byte Unicode.
   192  Benchmarks are performed based on the default marshal and unmarshal behavior
   193  of each package. Note that `JSONv2` aims to be safe and correct by default,
   194  which may not be the most performant strategy.
   195  
   196  `JSONv2` has several semantic changes relative to `JSONv1` that
   197  impact performance:
   198  
   199  1.  When marshaling, `JSONv2` no longer sorts the keys of a Go map.
   200      This will improve performance.
   201  2.  When marshaling or unmarshaling, `JSONv2` always checks
   202      to make sure JSON object names are unique.
   203      This will hurt performance, but is more correct.
   204  3.  When marshaling or unmarshaling, `JSONv2` always
   205      shallow copies the underlying value for a Go interface and
   206      shallow copies the key and value for entries in a Go map.
   207      This is done to keep the value as addressable so that `JSONv2` can
   208      call methods and functions that operate on a pointer receiver.
   209      This will hurt performance, but is more correct.
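
The v1 side of points 1 and 3 can be seen with the standard library. This is a sketch using v1 `encoding/json` only; the type `P` and the helper `marshal` are hypothetical. It shows that v1 sorts map keys, and that v1 calls a pointer-receiver `MarshalJSON` method only when the value happens to be addressable.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// P declares MarshalJSON on a pointer receiver.
type P struct{ N int }

func (p *P) MarshalJSON() ([]byte, error) { return []byte(`"custom"`), nil }

// marshal returns the v1 encoding of v as a string.
func marshal(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// v1 sorts map keys before writing them.
	fmt.Println(marshal(map[string]int{"b": 1, "a": 2}))

	// Slice elements are addressable, so v1 calls (*P).MarshalJSON.
	fmt.Println(marshal([]P{{N: 1}}))

	// Map values are not addressable, so v1 skips the method and
	// falls back to the default struct encoding.
	fmt.Println(marshal(map[string]P{"k": {N: 1}}))
}
```

The shallow copy described in point 3 is what lets v2 treat the map value like the slice element, at the cost of extra work per entry.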
   210  
   211  All of the charts are unit-less since the values are normalized
   212  relative to `JSONv1`, which is why `JSONv1` always has a value of 1.
   213  A lower value is better (i.e., runs faster).
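
As a small illustration of the normalization, the sketch below divides each timing by the baseline timing. The numbers are hypothetical and are not measurements taken from the charts; `normalize` is likewise a hypothetical helper.

```go
package main

import "fmt"

// normalize divides every timing by the baseline's timing, so the baseline
// becomes 1 and lower values mean faster runs.
func normalize(nsPerOp map[string]float64, baseline string) map[string]float64 {
	out := make(map[string]float64, len(nsPerOp))
	for name, ns := range nsPerOp {
		out[name] = ns / nsPerOp[baseline]
	}
	return out
}

func main() {
	// Hypothetical timings in nanoseconds per operation.
	timings := map[string]float64{"JSONv1": 1000, "JSONv2": 500}

	norm := normalize(timings, "JSONv1")
	fmt.Println(norm["JSONv1"], norm["JSONv2"])

	// A normalized value of x corresponds to a speedup of 1/x.
	fmt.Printf("JSONv2 is %.1fx faster\n", 1/norm["JSONv2"])
}
```

A normalized value below 1 therefore reads as "faster than `JSONv1`", and its reciprocal gives the "Nx faster" figures quoted in the sections that follow.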
   214  
   215  Benchmarks were performed on an AMD Ryzen 9 5900X.
   216  
   217  The code for the benchmarks is located at
   218  https://github.com/go-json-experiment/jsonbench.
   219  
   220  ### Marshal Performance
   221  
   222  #### Concrete types
   223  
   224  ![Benchmark Marshal Concrete](benchmark-marshal-concrete.png)
   225  
   226  * This compares marshal performance when serializing
   227    [from concrete types](/testdata_test.go).
   228  * The `JSONv1` implementation is close to optimal (without the use of `unsafe`).
   229  * Relative to `JSONv1`, `JSONv2` is generally as fast or slightly faster.
   230  * Relative to `JSONIterator`, `JSONv2` is up to 1.3x faster.
   231  * Relative to `SegmentJSON`, `JSONv2` is up to 1.8x slower.
   232  * Relative to `GoJSON`, `JSONv2` is up to 2.0x slower.
   233  * Relative to `SonicJSON`, `JSONv2` is about 1.8x to 3.2x slower
   234    (ignoring `StringUnicode` since `SonicJSON` does not validate UTF-8).
   235  * For `JSONv1` and `JSONv2`, marshaling from concrete types is
   236    mostly limited by the performance of Go reflection.
   237  
   238  #### Interface types
   239  
   240  ![Benchmark Marshal Interface](benchmark-marshal-interface.png)
   241  
   242  * This compares marshal performance when serializing from
   243    `any`, `map[string]any`, and `[]any` types.
   244  * Relative to `JSONv1`, `JSONv2` is about 1.5x to 4.2x faster.
   245  * Relative to `JSONIterator`, `JSONv2` is about 1.1x to 2.4x faster.
   246  * Relative to `SegmentJSON`, `JSONv2` is about 1.2x to 1.8x faster.
   247  * Relative to `GoJSON`, `JSONv2` is about 1.1x to 2.5x faster.
   248  * Relative to `SonicJSON`, `JSONv2` is up to 1.5x slower
   249    (ignoring `StringUnicode` since `SonicJSON` does not validate UTF-8).
   250  * Aside from `SonicJSON`, `JSONv2` is faster than the alternatives.
   251    One advantage is that it does not sort the keys for a `map[string]any`,
   252    while alternatives (except `SonicJSON` and `JSONIterator`) do sort the keys.
   253  
   254  #### RawValue types
   255  
   256  ![Benchmark Marshal Rawvalue](benchmark-marshal-rawvalue.png)
   257  
   258  * This compares performance when marshaling from a `json.RawValue`.
   259    This mostly exercises the underlying encoder and
   260    hides the cost of Go reflection.
   261  * Relative to `JSONv1`, `JSONv2` is about 3.5x to 7.8x faster.
   262  * `JSONIterator` is blazingly fast because
   263    [it does not validate whether the raw value is valid](https://go.dev/play/p/bun9IXQCKRe)
   264    and simply copies it to the output.
   265  * Relative to `SegmentJSON`, `JSONv2` is about 1.5x to 2.7x faster.
   266  * Relative to `GoJSON`, `JSONv2` is up to 2.2x faster.
   267  * Relative to `SonicJSON`, `JSONv2` is up to 1.5x faster.
   268  * Aside from `JSONIterator`, `JSONv2` is generally the fastest.
   269  
   270  ### Unmarshal Performance
   271  
   272  #### Concrete types
   273  
   274  ![Benchmark Unmarshal Concrete](benchmark-unmarshal-concrete.png)
   275  
   276  * This compares unmarshal performance when deserializing
   277    [into concrete types](/testdata_test.go).
   278  * Relative to `JSONv1`, `JSONv2` is about 1.8x to 5.7x faster.
   279  * Relative to `JSONIterator`, `JSONv2` is about 1.1x to 1.6x slower.
   280  * Relative to `SegmentJSON`, `JSONv2` is up to 2.5x slower.
   281  * Relative to `GoJSON`, `JSONv2` is about 1.4x to 2.1x slower.
   282  * Relative to `SonicJSON`, `JSONv2` is up to 4.0x slower
   283    (ignoring `StringUnicode` since `SonicJSON` does not validate UTF-8).
   284  * For `JSONv1` and `JSONv2`, unmarshaling into concrete types is
   285    mostly limited by the performance of Go reflection.
   286  
   287  #### Interface types
   288  
   289  ![Benchmark Unmarshal Interface](benchmark-unmarshal-interface.png)
   290  
   291  * This compares unmarshal performance when deserializing into
   292    `any`, `map[string]any`, and `[]any` types.
   293  * Relative to `JSONv1`, `JSONv2` is up to 4.3x faster.
   294  * Relative to `JSONIterator`, `JSONv2` is up to 1.5x faster.
   295  * Relative to `SegmentJSON`, `JSONv2` is about 1.5x to 3.7x faster.
   296  * Relative to `GoJSON`, `JSONv2` is up to 1.3x faster.
   297  * Relative to `SonicJSON`, `JSONv2` is up to 1.5x slower
   298    (ignoring `StringUnicode` since `SonicJSON` does not validate UTF-8).
   299  * Aside from `SonicJSON`, `JSONv2` is generally just as fast
   300    or faster than all the alternatives.
   301  
   302  #### RawValue types
   303  
   304  ![Benchmark Unmarshal Rawvalue](benchmark-unmarshal-rawvalue.png)
   305  
   306  * This compares performance when unmarshaling into a `json.RawValue`.
   307    This mostly exercises the underlying decoder and
   308    hides away most of the cost of Go reflection.
   309  * Relative to `JSONv1`, `JSONv2` is about 8.3x to 17.0x faster.
   310  * Relative to `JSONIterator`, `JSONv2` is up to 2.0x faster.
   311  * Relative to `SegmentJSON`, `JSONv2` is up to 1.6x faster or 1.7x slower.
   312  * Relative to `GoJSON`, `JSONv2` is up to 1.9x faster or 2.1x slower.
   313  * Relative to `SonicJSON`, `JSONv2` is up to 2.0x faster
   314    (ignoring `StringUnicode` since `SonicJSON` does not validate UTF-8).
   315  * `JSONv1` takes a
   316    [lexical scanning approach](https://talks.golang.org/2011/lex.slide#1),
   317    which performs a virtual function call for every byte of input.
   318    In contrast, `JSONv2` makes heavy use of iterative and linear parsing logic
   319    (with extra complexity to resume parsing when encountering segmented buffers).
   320  * `JSONv2` is comparable to the alternatives that use `unsafe`.
   321    Generally it is faster, but sometimes it is slower.