# This has moved to https://github.com/kjk/common (package `siser`)

Package `siser` is a Simple Serialization library for Go.

Imagine you want to write many records of somewhat structured data
to a file. Think of it as structured logging.

You could use the csv format, but csv values are identified by position,
not by name. They are also hard to read.

You could serialize as json and write one json record per line, but
json isn't great for human readability (imagine you `tail -f` a log
file with json records).

This library is meant to be a middle ground:
* you can serialize arbitrary records with multiple key/value pairs
* the output is human-readable
* it's designed to be efficient and simple to use

## API usage

Imagine you want to log basic info about http requests.

```go
import (
	"os"
	"strconv"

	"github.com/kjk/siser"
)

func createWriter() (*siser.Writer, error) {
	f, err := os.Create("http_access.log")
	if err != nil {
		return nil, err
	}
	w := siser.NewWriter(f)
	return w, nil
}

func logHTTPRequest(w *siser.Writer, url string, ipAddr string, statusCode int) error {
	var rec siser.Record
	// optional name of the record type
	rec.Name = "httplog"
	// you can append multiple key/value pairs at once
	rec.Write("url", url, "ipaddr", ipAddr)
	// or assemble with multiple calls
	rec.Write("code", strconv.Itoa(statusCode))
	_, err := w.WriteRecord(&rec)
	return err
}
```

The data will be written to the writer underlying the `siser.Writer` as:
```
61 1553488435903 httplog
url: https://blog.kowalczyk.info
ipaddr: 10.0.0.1
code: 200
```

Here's what and why:
* `61` is the size of the data. This allows us to read the exact number of bytes in the record
* `1553488435903` is a timestamp: Unix epoch time in milliseconds (more precise than standard Unix time, which is in seconds)
* `httplog` is an optional name of the record. This allows you to easily write multiple types of records to a file
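
As a rough illustration of the header line described above, the sketch below parses it into its three fields using only the standard library. The `header` type and the `parseHeader` function are hypothetical names made up for this example; they are not part of siser, whose own reader handles this for you.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// header holds the fields of a record's first line.
// Hypothetical type for illustration, not part of siser's API.
type header struct {
	Size      int
	Timestamp time.Time
	Name      string // empty when the record has no name
}

// parseHeader splits a line such as "61 1553488435903 httplog".
func parseHeader(line string) (header, error) {
	parts := strings.Fields(line)
	if len(parts) < 2 || len(parts) > 3 {
		return header{}, fmt.Errorf("unexpected header %q", line)
	}
	size, err := strconv.Atoi(parts[0])
	if err != nil {
		return header{}, fmt.Errorf("bad size: %w", err)
	}
	ms, err := strconv.ParseInt(parts[1], 10, 64)
	if err != nil {
		return header{}, fmt.Errorf("bad timestamp: %w", err)
	}
	// The timestamp is Unix epoch time in milliseconds.
	h := header{Size: size, Timestamp: time.UnixMilli(ms).UTC()}
	if len(parts) == 3 {
		h.Name = parts[2]
	}
	return h, nil
}

func main() {
	h, err := parseHeader("61 1553488435903 httplog")
	if err != nil {
		panic(err)
	}
	fmt.Println(h.Size, h.Name, h.Timestamp.Format(time.RFC3339))
}
```

Once the size is known, a reader can fetch exactly that many bytes for the record body instead of scanning for a terminator.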

To read all records from the file:
```go
f, err := os.Open("http_access.log")
// fatalIfErr is a helper (not shown) that aborts on a non-nil error
fatalIfErr(err)
defer f.Close()
reader := siser.NewReader(f)
for reader.ReadNextRecord() {
	rec := reader.Record
	name := rec.Name // "httplog"
	timestamp := rec.Timestamp
	code, ok := rec.Get("code")
	// get the rest of the values and do something with them
	fmt.Println(name, timestamp, code, ok)
}
// errors encountered while reading are reported by the reader
fatalIfErr(reader.Err())
```

## Usage scenarios

I use `siser` in my web services for two use cases:

* logging to help in debugging issues after they happen
* implementing poor-man's analytics

Logging for debugging adds a little more structure than
ad hoc logging. I can add some metadata to log entries
and, in addition to reading the logs, I can quickly write
programs that filter them. For example, if I add serving time
to the http request log, I can easily write a program that shows
requests that take over 1 second to serve.

Another one is poor-man's analytics. For example, if you're building
a web service that converts .png files to .ico files, it would be
good to know daily statistics about how many files were converted,
how much time an average conversion takes, etc.

## Performance and implementation notes

Some implementation decisions were made with performance in mind.

Given the key/value nature of the record, an easy choice would be to use `map[string]string` as the source for the encode/decode functions.

However, `[]string` is more efficient than a `map`. Additionally, a slice can be reused across multiple records. We can clear it by setting its length to zero and reuse the underlying array. A map would require allocating a new instance for each record, which would create a lot of work for the garbage collector.

When serializing, you need to call the `Reset` method to get the benefit of efficient re-use of the `Record`.

When reading and deserializing records, `siser.Reader` uses this optimization internally.
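
The slice-reuse idea can be sketched in a few lines. The `kvRecord` type below is a simplified, hypothetical stand-in written for this example; it is not siser's implementation, only an illustration of clearing a slice while keeping its backing array.

```go
package main

import "fmt"

// kvRecord stores alternating keys and values in a single slice.
// Hypothetical type for illustration, not siser's Record.
type kvRecord struct {
	data []string
}

// Write appends key/value pairs; it panics on an odd number of arguments.
func (r *kvRecord) Write(args ...string) {
	if len(args)%2 != 0 {
		panic("kvRecord.Write: arguments must be key/value pairs")
	}
	r.data = append(r.data, args...)
}

// Reset sets the length to zero but keeps the backing array,
// so the next Write can reuse the already allocated memory.
func (r *kvRecord) Reset() {
	r.data = r.data[:0]
}

func main() {
	var rec kvRecord
	for i := 0; i < 3; i++ {
		rec.Reset()
		rec.Write("url", "/", "code", "200")
		// After the first iteration the capacity stays the same,
		// which means no new backing array was allocated.
		fmt.Println(len(rec.data), cap(rec.data))
	}
}
```

With a map, the equivalent loop would either allocate a fresh map per record or delete every key one by one, both of which cost more than re-slicing.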

The format avoids the need for escaping keys and values, which helps in making encoding/decoding fast.

How does that play out in real life? I wrote a benchmark comparing siser vs. `json.Marshal`. In the run below, siser takes about a third less time per operation:

```
$ go test -bench=.
BenchmarkSiserMarshal-8   	 1000000	      1903 ns/op
BenchmarkJSONMarshal-8    	  500000	      2905 ns/op
```

The format is binary-safe and works for serializing large values, e.g. you can use a png image as a value.

It’s also very easy to implement in any language.