# encoding ![build status](https://github.com/segmentio/encoding/actions/workflows/test.yml/badge.svg) [![Go Report Card](https://goreportcard.com/badge/github.com/segmentio/encoding)](https://goreportcard.com/report/github.com/segmentio/encoding) [![GoDoc](https://godoc.org/github.com/segmentio/encoding?status.svg)](https://godoc.org/github.com/segmentio/encoding)

Go package containing implementations of encoders and decoders for various data
formats.

## Motivation

At Segment, we do a lot of marshaling and unmarshaling of data when sending,
queuing, or storing messages. The resources we need to provision on the
infrastructure are directly related to the type and amount of data that we are
processing. At the scale we operate at, the tools we choose to build programs
can have a large impact on the efficiency of our systems. It is important to
explore alternative approaches when we reach the limits of the code we use.

This repository includes experiments for Go packages for marshaling and
unmarshaling data in various formats. While the focus is on providing a high
performance library, we also aim for very low development and maintenance overhead
by implementing APIs that can be used as drop-in replacements for the default
solutions.

## Requirements and Maintenance Schedule

This package has no dependencies outside of the core runtime of Go. It
requires a recent version of Go.

This package follows the same maintenance schedule as the [Go
project](https://github.com/golang/go/wiki/Go-Release-Cycle#release-maintenance),
meaning that issues relating to versions of Go which aren't supported by the Go
team, or versions of this package which are older than 1 year, are unlikely to
be considered.

Additionally, we have fuzz tests which aren't a runtime required dependency but
will be pulled in when running `go mod tidy`.
Please don't include these go.mod
updates in change requests.

## encoding/json [![GoDoc](https://godoc.org/github.com/segmentio/encoding/json?status.svg)](https://godoc.org/github.com/segmentio/encoding/json)

More details about _how_ this package achieves a lower CPU and memory footprint
can be found [in the package README](json/README.md).

The `json` sub-package provides a re-implementation of the functionality
offered by the standard library's [`encoding/json`](https://golang.org/pkg/encoding/json/)
package, with a focus on lowering the CPU and memory footprint of the code.

The exported API of this package mirrors the standard library's
[`encoding/json`](https://golang.org/pkg/encoding/json/) package; the only
change needed to take advantage of the performance improvements is the import
path of the `json` package, from:
```go
import (
    "encoding/json"
)
```
to
```go
import (
    "github.com/segmentio/encoding/json"
)
```

The improvement can be significant for code that heavily relies on serializing
and deserializing JSON payloads.
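
As a minimal sketch of the import swap described above, the program below round-trips a small struct through `Marshal` and `Unmarshal`. The `event` type and the `roundTrip` helper are hypothetical names made up for this illustration. It imports the standard library so that it runs without extra dependencies; since the exported API is mirrored, changing the import path to `github.com/segmentio/encoding/json` should be the only edit needed.

```go
package main

import (
	// Swap this path for "github.com/segmentio/encoding/json" to use the
	// drop-in replacement; the calls below stay the same.
	"encoding/json"
	"fmt"
)

// event is a hypothetical payload type used only for this illustration.
type event struct {
	UserID string `json:"userId"`
	Name   string `json:"name"`
}

// roundTrip marshals an event to JSON and decodes the bytes back into a
// new event, returning both the decoded value and the encoded bytes.
func roundTrip(in event) (event, []byte, error) {
	data, err := json.Marshal(in)
	if err != nil {
		return event{}, nil, err
	}
	var out event
	if err := json.Unmarshal(data, &out); err != nil {
		return event{}, nil, err
	}
	return out, data, nil
}

func main() {
	out, data, err := roundTrip(event{UserID: "u-1", Name: "signup"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(data))
	fmt.Printf("%+v\n", out)
}
```
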
The CI pipeline runs benchmarks to compare the
performance of the package with the standard library and other popular
alternatives; here's an overview of the results:

**Comparing to encoding/json (`v1.16.2`)**
```
name                           old time/op    new time/op    delta
Marshal/*json.codeResponse2      6.40ms ± 2%    3.82ms ± 1%   -40.29%  (p=0.008 n=5+5)
Unmarshal/*json.codeResponse2    28.1ms ± 3%     5.6ms ± 3%   -80.21%  (p=0.008 n=5+5)

name                           old speed      new speed      delta
Marshal/*json.codeResponse2     303MB/s ± 2%   507MB/s ± 1%   +67.47%  (p=0.008 n=5+5)
Unmarshal/*json.codeResponse2  69.2MB/s ± 3%  349.6MB/s ± 3%  +405.42%  (p=0.008 n=5+5)

name                           old alloc/op   new alloc/op   delta
Marshal/*json.codeResponse2       0.00B          0.00B           ~     (all equal)
Unmarshal/*json.codeResponse2    1.80MB ± 1%    0.02MB ± 0%   -99.14%  (p=0.016 n=5+4)

name                           old allocs/op  new allocs/op  delta
Marshal/*json.codeResponse2        0.00           0.00           ~     (all equal)
Unmarshal/*json.codeResponse2     76.6k ± 0%      0.1k ± 3%   -99.92%  (p=0.008 n=5+5)
```

*Benchmarks were run on a Core i9-8950HK CPU @ 2.90GHz.*

**Comparing to github.com/json-iterator/go (`v1.1.10`)**
```
name                           old time/op    new time/op    delta
Marshal/*json.codeResponse2      6.19ms ± 3%    3.82ms ± 1%   -38.26%  (p=0.008 n=5+5)
Unmarshal/*json.codeResponse2    8.52ms ± 3%    5.55ms ± 3%   -34.84%  (p=0.008 n=5+5)

name                           old speed      new speed      delta
Marshal/*json.codeResponse2     313MB/s ± 3%   507MB/s ± 1%   +61.91%  (p=0.008 n=5+5)
Unmarshal/*json.codeResponse2   228MB/s ± 3%   350MB/s ± 3%   +53.50%  (p=0.008 n=5+5)

name                           old alloc/op   new alloc/op   delta
Marshal/*json.codeResponse2       8.00B ± 0%     0.00B       -100.00%  (p=0.008 n=5+5)
Unmarshal/*json.codeResponse2    1.05MB ± 0%    0.02MB ± 0%   -98.53%  (p=0.000 n=5+4)

name                           old allocs/op  new allocs/op  delta
Marshal/*json.codeResponse2        1.00 ± 0%      0.00       -100.00%  (p=0.008 n=5+5)
Unmarshal/*json.codeResponse2     37.2k ± 0%      0.1k ± 3%   -99.83%  (p=0.008 n=5+5)
```

Although this package aims to be a drop-in replacement of
[`encoding/json`](https://golang.org/pkg/encoding/json/),
it does not guarantee the same error messages. It will error in the same cases
as the standard library, but the exact error message may be different.

## encoding/iso8601 [![GoDoc](https://godoc.org/github.com/segmentio/encoding/iso8601?status.svg)](https://godoc.org/github.com/segmentio/encoding/iso8601)

The `iso8601` sub-package exposes APIs to efficiently deal with string
representations of ISO 8601 dates.

Data formats like JSON have no syntax to represent dates; they are usually
serialized and represented as a string value. In our experience, we often have
to _check_ whether a string value looks like a date, and either construct a
`time.Time` by parsing it or simply treat it as a `string`. This check can be
done by attempting to parse the value and, if that fails, falling back to using
the raw string. Unfortunately, while the _happy path_ for `time.Parse` is fairly
efficient, constructing errors is much slower and has a much bigger memory
footprint.

We've developed fast iso8601 validation functions that cause no heap allocations
to remediate this problem. We added a validation step to determine whether
the value is a date representation or a simple string. This reduced CPU and
memory usage by 5% in some programs that were doing `time.Parse` calls on very
hot code paths.
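
The validate-then-parse pattern described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the package's implementation: `looksLikeDate` and `parseIfDate` are hypothetical helpers, and the check only accepts the single fixed-width form `2006-01-02T15:04:05Z`, which is far stricter than what the `iso8601` sub-package supports. See the package GoDoc for its actual validation functions.

```go
package main

import (
	"fmt"
	"time"
)

// looksLikeDate is a hypothetical, deliberately narrow check: it accepts only
// strings shaped like "2006-01-02T15:04:05Z". It inspects bytes in place and
// never builds an error value.
func looksLikeDate(s string) bool {
	const shape = "0000-00-00T00:00:00Z" // '0' marks a position that must hold a digit
	if len(s) != len(shape) {
		return false
	}
	for i := 0; i < len(shape); i++ {
		if shape[i] == '0' {
			if s[i] < '0' || s[i] > '9' {
				return false
			}
		} else if s[i] != shape[i] {
			return false
		}
	}
	return true
}

// parseIfDate returns the parsed time and true when s is a date, or the zero
// time and false when the caller should keep s as a plain string.
func parseIfDate(s string) (time.Time, bool) {
	if !looksLikeDate(s) {
		return time.Time{}, false // cheap rejection: time.Parse is never called
	}
	parsed, err := time.Parse(time.RFC3339, s)
	if err != nil {
		return time.Time{}, false // right shape but invalid value, e.g. month 13
	}
	return parsed, true
}

func main() {
	inputs := []string{"2021-03-04T05:06:07Z", "hello world", "2021-13-04T05:06:07Z"}
	for _, s := range inputs {
		if parsed, ok := parseIfDate(s); ok {
			fmt.Println("date:", parsed)
		} else {
			fmt.Println("string:", s)
		}
	}
}
```

With this shape, ordinary strings are rejected before `time.Parse` runs, so the error path is only reached for values that already look like dates.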