# Finite State Entropy

This package provides Finite State Entropy encoding and decoding.

Finite State Entropy (also referred to as [tANS](https://en.wikipedia.org/wiki/Asymmetric_numeral_systems#tANS))
encoding provides fast, near-optimal symbol encoding and decoding
for byte blocks, as implemented in [zstandard](https://github.com/facebook/zstd).

This can be used to compress input with many similar values into the smallest number of bytes.
It does not perform any multi-byte [dictionary coding](https://en.wikipedia.org/wiki/Dictionary_coder) as LZ coders do,
but it can be used as a secondary step after compressors (like Snappy) that do not do entropy encoding.

* [Godoc documentation](https://godoc.org/github.com/klauspost/compress/fse)

## News

* Feb 2018: First implementation released. Consider this beta software for now.

# Usage

This package provides a low-level interface for compressing single, independent blocks.

Each block is separate, and there are no built-in integrity checks.
This means that the caller should keep track of block sizes and also add checksums if needed.
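
As an illustration of that caller-side bookkeeping, here is a minimal sketch that wraps a compressed payload in a small header holding its length and a CRC-32 of the original data. The header layout and the names `frame`, `unframe`, and `verify` are assumptions made for this example; they are not part of this package, and the payload in `main` is a stand-in for real compressor output.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

// frameHeaderSize is the size of the hypothetical header:
// 4 bytes payload length followed by 4 bytes CRC-32 of the original data.
const frameHeaderSize = 8

var errBadFrame = errors.New("frame: truncated or corrupt")

// frame prepends the payload length and a CRC-32 (IEEE) of the original,
// uncompressed data to a compressed payload.
func frame(original, payload []byte) []byte {
	out := make([]byte, frameHeaderSize, frameHeaderSize+len(payload))
	binary.LittleEndian.PutUint32(out[0:4], uint32(len(payload)))
	binary.LittleEndian.PutUint32(out[4:8], crc32.ChecksumIEEE(original))
	return append(out, payload...)
}

// unframe returns the payload at exactly its stored size, the stored
// checksum, and any bytes that follow the frame.
func unframe(b []byte) (payload []byte, sum uint32, rest []byte, err error) {
	if len(b) < frameHeaderSize {
		return nil, 0, nil, errBadFrame
	}
	n := binary.LittleEndian.Uint32(b[0:4])
	sum = binary.LittleEndian.Uint32(b[4:8])
	b = b[frameHeaderSize:]
	if uint64(n) > uint64(len(b)) {
		return nil, 0, nil, errBadFrame
	}
	return b[:n], sum, b[n:], nil
}

// verify reports whether decoded data matches the stored checksum.
func verify(decoded []byte, sum uint32) bool {
	return crc32.ChecksumIEEE(decoded) == sum
}

func main() {
	original := []byte("aaaaabbbbbaaaaabbbbb")
	payload := []byte{0x01, 0x02, 0x03} // stand-in for compressor output

	framed := frame(original, payload)
	got, sum, rest, err := unframe(framed)
	fmt.Println(len(got), len(rest), err, verify(original, sum))
}
```

Storing the checksum of the original data rather than of the payload lets the caller validate the result after decompression, which matters because a successful decode alone does not prove the data is intact.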

Compressing a block is done via the [`Compress`](https://godoc.org/github.com/klauspost/compress/fse#Compress) function.
You must provide the input, and you will receive the output and possibly an error.

These error values can be returned:

| Error               | Description                                                                 |
|---------------------|-----------------------------------------------------------------------------|
| `<nil>`             | Everything ok, output is returned                                           |
| `ErrIncompressible` | Returned when input is judged to be too hard to compress                    |
| `ErrUseRLE`         | Returned from the compressor when the input is a single byte value repeated |
| `(error)`           | An internal error occurred.                                                 |

As the table shows, some errors are returned even under normal operation, so it is important to handle them.
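
One way to handle these outcomes is to map each one to a block type chosen by the caller. The sketch below is an assumption-heavy illustration: the block types, `encodeBlock`, and `fakeCompress` are hypothetical, and the two sentinel errors are local stand-ins for `ErrIncompressible` and `ErrUseRLE` so that the example runs without importing the package. In real code the `compress` argument would wrap a call to `Compress`.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for the package's ErrIncompressible and ErrUseRLE.
var (
	errIncompressible = errors.New("input is not compressible")
	errUseRLE         = errors.New("input is a single value repeated")
)

// Hypothetical block types recorded by the caller; not part of the package.
const (
	blockRaw byte = iota
	blockRLE
	blockCompressed
)

// compressFunc has the shape of a wrapper around the real compressor.
type compressFunc func(in []byte) ([]byte, error)

// encodeBlock picks a block type based on the compressor's result.
// Only unexpected errors are passed on to the caller.
func encodeBlock(in []byte, compress compressFunc) (byte, []byte, error) {
	if len(in) == 0 {
		return blockRaw, nil, nil
	}
	out, err := compress(in)
	switch {
	case err == nil:
		return blockCompressed, out, nil
	case errors.Is(err, errUseRLE):
		// Store the repeated byte once; the caller tracks the length.
		return blockRLE, in[:1], nil
	case errors.Is(err, errIncompressible):
		// Store the input unchanged.
		return blockRaw, in, nil
	default:
		return 0, nil, err
	}
}

// fakeCompress is a placeholder that only imitates the error behaviour.
// It does not produce real FSE output.
func fakeCompress(in []byte) ([]byte, error) {
	if len(in) < 4 {
		return nil, errIncompressible
	}
	same := true
	for _, b := range in[1:] {
		if b != in[0] {
			same = false
			break
		}
	}
	if same {
		return nil, errUseRLE
	}
	return []byte{0xFF}, nil
}

func main() {
	inputs := [][]byte{[]byte("aaaaaaaa"), []byte("ab"), []byte("abcdabcd")}
	for _, in := range inputs {
		kind, payload, err := encodeBlock(in, fakeCompress)
		fmt.Println(kind, len(payload), err)
	}
}
```

The block type has to be stored alongside the payload, since the decoder cannot tell a raw or run-length block from a compressed one by looking at the bytes.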

To reduce allocations you can provide a [`Scratch`](https://godoc.org/github.com/klauspost/compress/fse#Scratch) object
that can be re-used for successive calls. Both compression and decompression accept a `Scratch` object, and the same
object can be used for both.

Be aware that when re-using a `Scratch` object, the *output* buffer is also re-used, so if you are still using the previous output
you must set the `Out` field in the scratch to nil. The same buffer is used for compression and decompression output.
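
The hazard comes from Go slice aliasing and can be shown without the package itself. In the sketch below, `scratch` and its `encode` method are hypothetical stand-ins that only mimic the buffer reuse described above; they are not `Scratch` or `Compress`.

```go
package main

import "fmt"

// scratch is a hypothetical stand-in for a reusable state object.
// It only imitates the reuse of an exported Out buffer.
type scratch struct {
	Out []byte
}

// encode is a placeholder "compressor" that copies its input into the
// reused Out buffer and returns that same buffer.
func (s *scratch) encode(in []byte) []byte {
	s.Out = append(s.Out[:0], in...)
	return s.Out
}

func main() {
	// Without resetting Out, the second call overwrites the first result.
	var s scratch
	first := s.encode([]byte("first"))
	s.encode([]byte("other"))
	fmt.Printf("%s\n", first)

	// Setting Out to nil detaches the old buffer, so the next call
	// allocates a new one and the earlier result stays intact.
	var s2 scratch
	kept := s2.encode([]byte("first"))
	s2.Out = nil
	s2.encode([]byte("other"))
	fmt.Printf("%s\n", kept)
}
```

Copying the returned slice before the next call is an alternative to clearing `Out`, at the cost of one extra copy.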

Decompressing is done by calling the [`Decompress`](https://godoc.org/github.com/klauspost/compress/fse#Decompress) function.
You must provide the output from the compression stage, at exactly the size you got back. If you receive an error,
your input was likely corrupted.

It is important to note that a successful decoding does *not* mean your output matches your original input.
There are no integrity checks, so the absence of an error from the decompressor does not guarantee that your data is valid.

For more detailed usage, see the examples in the [godoc documentation](https://godoc.org/github.com/klauspost/compress/fse#pkg-examples).

# Performance

Many factors affect speed. Block size and compressibility of the material are the primary ones.
All compression functions currently run only on the calling goroutine, so only one core is used per block.

The compressor is significantly faster if symbol values are kept as small as possible. The highest byte value of the input
is used to reduce some of the processing, so if all your input is above byte value 64, for instance, it may be
beneficial to transpose all your input values down by 64.
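
A minimal sketch of such a transposition follows. Instead of a fixed 64 it subtracts the smallest byte value found in the block, which lowers the highest value by the same amount. The helper names `shiftDown` and `shiftUp` are hypothetical, and the caller is assumed to store the offset next to the compressed block so the shift can be undone after decompression.

```go
package main

import "fmt"

// shiftDown returns a copy of in with the smallest byte value subtracted
// from every byte, together with that offset.
func shiftDown(in []byte) (out []byte, offset byte) {
	if len(in) == 0 {
		return nil, 0
	}
	offset = in[0]
	for _, b := range in {
		if b < offset {
			offset = b
		}
	}
	out = make([]byte, len(in))
	for i, b := range in {
		out[i] = b - offset
	}
	return out, offset
}

// shiftUp reverses shiftDown by adding the offset back to every byte.
func shiftUp(in []byte, offset byte) []byte {
	out := make([]byte, len(in))
	for i, b := range in {
		out[i] = b + offset
	}
	return out
}

func main() {
	in := []byte("FSE BLOCK")
	shifted, offset := shiftDown(in)
	restored := shiftUp(shifted, offset)
	fmt.Println(offset, shifted, string(restored))
}
```

Whether this pays off depends on the data; the extra pass over the input is only worthwhile when the smallest value in a block is well above zero.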

With moderate block sizes around 64k, speeds are typically 200MB/s per core for compression and
around 300MB/s for decompression.

The same hardware typically does Huffman (deflate) encoding at 125MB/s and decompression at 100MB/s.

# Plans

At some point, more internals will be exposed to facilitate more "expert" usage of the components.

A streaming interface is also likely to be implemented, probably compatible with the [FSE stream format](https://github.com/Cyan4973/FiniteStateEntropy/blob/dev/programs/fileio.c#L261).

# Contributing

Contributions are always welcome. Be aware that adding public functions will require good justification, and breaking
changes will likely not be accepted. If in doubt, open an issue before writing the PR.