<!--
Copyright (C) 2023  Luke Shumaker <lukeshu@lukeshu.com>

SPDX-License-Identifier: GPL-2.0-or-later
-->

# lowmemjson

`lowmemjson` is a mostly-compatible alternative to the standard
library's [`encoding/json`][] that has dramatically lower memory
requirements for large data structures.

`lowmemjson` is not targeting extremely resource-constrained
environments, but rather targets being able to efficiently stream
gigabytes of JSON without requiring gigabytes of memory overhead.

## Compatibility

`encoding/json`'s APIs are designed around the idea that it can buffer
the entire JSON document as a `[]byte`, and as intermediate steps it
may have a fragment buffered multiple times while encoding; encoding a
gigabyte of data may consume several gigabytes of memory.  In
contrast, `lowmemjson`'s APIs are designed around streaming
(`io.Writer` and `io.RuneScanner`), trying to have the memory overhead
of encode and decode operations be as close to O(1) as possible.

`lowmemjson` offers a high level of compatibility with the
`encoding/json` APIs, but for best memory usage (avoiding storing
large byte arrays inherent in `encoding/json`'s API), it is
recommended to migrate to `lowmemjson`'s own APIs.

### Callee API (objects to be encoded-to/decoded-from JSON)

`lowmemjson` supports `encoding/json`'s `json:` struct field tags, as
well as the `encoding/json.Marshaler` and `encoding/json.Unmarshaler`
interfaces; you do not need to adjust your types to successfully
migrate from `encoding/json` to `lowmemjson`.

That is: Given types that decode as desired with `encoding/json`,
those types should decode identically with `lowmemjson`.  Given types
that encode as desired with `encoding/json`, those types should encode
identically with `lowmemjson` (assuming an appropriately configured
`ReEncoder` to match the whitespace-handling and special-character
escaping; a `ReEncoderConfig` with `Compact=true` and all other
settings left as zero will match the behavior of `json.Marshal`).

For better memory usage:
 - Instead of implementing [`json.Marshaler`][], consider implementing
   [`lowmemjson.Encodable`][] (or implementing both).
 - Instead of implementing [`json.Unmarshaler`][], consider
   implementing [`lowmemjson.Decodable`][] (or implementing both).

### Caller API

`lowmemjson` offers a [`lowmemjson/compat/json`][] package that is a
(mostly) drop-in replacement for `encoding/json` (see the package's
documentation for the small incompatibilities).

For better memory usage, avoid using `lowmemjson/compat/json` and
instead use `lowmemjson` directly:
 - Instead of using <code>[json.Marshal][`json.Marshal`](val)</code>,
   consider using
   <code>[lowmemjson.NewEncoder][`lowmemjson.NewEncoder`](w).[Encode][`lowmemjson.Encoder.Encode`](val)</code>.
 - Instead of using
   <code>[json.Unmarshal][`json.Unmarshal`](dat, &val)</code>, consider
   using
   <code>[lowmemjson.NewDecoder][`lowmemjson.NewDecoder`](r).[DecodeThenEOF][`lowmemjson.Decoder.DecodeThenEOF`](&val)</code>.
 - Instead of using [`json.Compact`][], [`json.HTMLEscape`][], or
   [`json.Indent`][], consider using a [`lowmemjson.ReEncoder`][].
 - Instead of using [`json.Valid`][], consider using a
   [`lowmemjson.ReEncoder`][] with `io.Discard` as the output.

The error types returned from `lowmemjson` are different from the
error types returned by `encoding/json`, but `lowmemjson/compat/json`
translates them back to the types returned by `encoding/json`.

## Overview

### Caller API

There are 3 main types that make up the caller API for producing and
handling streams of JSON, and each of those types has some associated
types that go with it:

 1. `type Decoder`
    + `type DecodeArgumentError`
    + `type DecodeError`
      * `type DecodeReadError`
      * `type DecodeSyntaxError`
      * `type DecodeTypeError`

 2. `type Encoder`
    + `type EncodeTypeError`
    + `type EncodeValueError`
    + `type EncodeMethodError`

 3. `type ReEncoder`
    + `type ReEncoderConfig`
    + `type ReEncodeSyntaxError`
    + `type BackslashEscaper`
      * `type BackslashEscapeMode`

A `*Decoder` handles decoding a JSON stream into Go values; the most
common use of it will be
`lowmemjson.NewDecoder(r).DecodeThenEOF(&val)` or
`lowmemjson.NewDecoder(bufio.NewReader(r)).DecodeThenEOF(&val)`.

A `*ReEncoder` handles transforming a JSON stream; this is useful for
prettifying, minifying, sanitizing, and/or validating JSON.  A
`*ReEncoder` wraps an `io.Writer`, itself implementing `io.Writer`.
The most common use of it will be something along the lines of
`out = lowmemjson.NewReEncoder(out, lowmemjson.ReEncoderConfig{…})`.

An `*Encoder` handles encoding Go values into a JSON stream.
`*Encoder` doesn't take much care into making its output nice, so it
is usually desirable to have the output stream of an `*Encoder` be a
`*ReEncoder`; the most common use of it will be
`lowmemjson.NewEncoder(lowmemjson.NewReEncoder(out, lowmemjson.ReEncoderConfig{…})).Encode(val)`.

`*Encoder` and `*ReEncoder` both tend to make many small writes; if
writes are syscalls, you may want to wrap their output in a
`bufio.Writer`.

### Callee API

For defining Go types with custom JSON representations, `lowmemjson`
respects all of the `json:` struct field tags of `encoding/json`, as
well as respecting the same "marshaler" and "unmarshaler" interfaces
as `encoding/json`.  In addition to those interfaces, `lowmemjson`
adds two of its own interfaces, and some helper functions to help with
implementing those interfaces:

 1. `type Decodable`
    + `func DecodeArray`
    + `func DecodeObject`
 2. `type Encodable`

These are streaming variants of the standard `json.Unmarshaler` and
`json.Marshaler` interfaces.
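
To show what a "streaming variant" buys, the sketch below gives one
type both a buffered `MarshalJSON` method and a method that writes to
an `io.Writer`.  The `streamEncoder` interface and its `EncodeJSON`
method are assumptions made for this illustration, not copied from
`lowmemjson`; see the package documentation for the real definitions
of `Encodable` and `Decodable`.  `Range` and `maxChunkWriter` are
likewise invented here.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strconv"
)

// streamEncoder is a hypothetical stand-in for a streaming marshaler
// interface; it is NOT taken from lowmemjson.
type streamEncoder interface {
	EncodeJSON(w io.Writer) error
}

// Range encodes as the JSON array [0,1,...,N-1].
type Range struct{ N int }

var _ streamEncoder = Range{}

// MarshalJSON (json.Marshaler) has to return the whole array as a
// single []byte, so memory use grows with N.
func (r Range) MarshalJSON() ([]byte, error) {
	var buf bytes.Buffer
	if err := r.EncodeJSON(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// EncodeJSON writes the array piece by piece; only one formatted
// number exists in memory at a time.
func (r Range) EncodeJSON(w io.Writer) error {
	if _, err := io.WriteString(w, "["); err != nil {
		return err
	}
	var scratch [20]byte
	for i := 0; i < r.N; i++ {
		out := scratch[:0]
		if i > 0 {
			out = append(out, ',')
		}
		out = strconv.AppendInt(out, int64(i), 10)
		if _, err := w.Write(out); err != nil {
			return err
		}
	}
	_, err := io.WriteString(w, "]")
	return err
}

// maxChunkWriter discards its input but records the largest single
// Write it receives.
type maxChunkWriter struct{ largest int }

func (m *maxChunkWriter) Write(p []byte) (int, error) {
	if len(p) > m.largest {
		m.largest = len(p)
	}
	return len(p), nil
}

func main() {
	whole, _ := Range{N: 5}.MarshalJSON()
	fmt.Println(string(whole)) // [0,1,2,3,4]

	mw := &maxChunkWriter{}
	_ = Range{N: 100000}.EncodeJSON(mw)
	fmt.Println("largest single write:", mw.largest, "bytes")
}
```

Implementing both methods on one type, as suggested under
Compatibility, lets the same type work with either kind of encoder.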

<!-- packages -->
[`lowmemjson`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson
[`lowmemjson/compat/json`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson/compat/json
[`encoding/json`]: https://pkg.go.dev/encoding/json@go1.20

<!-- encoding/json symbols -->
[`json.Marshaler`]: https://pkg.go.dev/encoding/json@go1.20#Marshaler
[`json.Unmarshaler`]: https://pkg.go.dev/encoding/json@go1.20#Unmarshaler
[`json.Marshal`]: https://pkg.go.dev/encoding/json@go1.20#Marshal
[`json.Unmarshal`]: https://pkg.go.dev/encoding/json@go1.20#Unmarshal
[`json.Compact`]: https://pkg.go.dev/encoding/json@go1.20#Compact
[`json.HTMLEscape`]: https://pkg.go.dev/encoding/json@go1.20#HTMLEscape
[`json.Indent`]: https://pkg.go.dev/encoding/json@go1.20#Indent
[`json.Valid`]: https://pkg.go.dev/encoding/json@go1.20#Valid

<!-- lowmemjson symbols -->
[`lowmemjson.Encodable`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson#Encodable
[`lowmemjson.Decodable`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson#Decodable
[`lowmemjson.NewEncoder`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson#NewEncoder
[`lowmemjson.Encoder.Encode`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson#Encoder.Encode
[`lowmemjson.NewDecoder`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson#NewDecoder
[`lowmemjson.Decoder.DecodeThenEOF`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson#Decoder.DecodeThenEOF
[`lowmemjson.ReEncoder`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson#ReEncoder