<!--

Although you may be viewing an alternate representation, this document
is sourced in Markdown, a light-duty markup scheme, and is optimized for
the [kramdown](http://kramdown.rubyforge.org/) transformer.

See the accompanying README. External link targets are referenced at the
end of this file.

-->

Specification for WebP Lossless Bitstream
=========================================

_Jyrki Alakuijala, Ph.D., Google, Inc., 2012-06-19_


Abstract
--------

WebP lossless is an image format for lossless compression of ARGB
images. The lossless format stores and restores the pixel values
exactly, including the color values for zero alpha pixels. The
format uses subresolution images, recursively embedded into the format
itself, for storing statistical data about the images, such as the used
entropy codes, spatial predictors, color space conversion, and color
table. LZ77, Huffman coding, and a color cache are used for compression
of the bulk data. Decoding speeds faster than PNG have been
demonstrated, as well as 25% denser compression than can be achieved
using today's PNG format.


* TOC placeholder
{:toc}


Nomenclature
------------

ARGB
: A pixel value consisting of alpha, red, green, and blue values.

ARGB image
: A two-dimensional array containing ARGB pixels.

color cache
: A small hash-addressed array to store recently used colors, to be able
  to recall them with shorter codes.

color indexing image
: A one-dimensional image of colors that can be indexed using a small
  integer (up to 256 within WebP lossless).

color transform image
: A two-dimensional subresolution image containing data about
  correlations of color components.

distance mapping
: Changes LZ77 distances to have the smallest values for pixels in 2D
  proximity.

entropy image
: A two-dimensional subresolution image indicating which entropy coding
  should be used in a respective square in the image, i.e., each pixel
  is a meta Huffman code.

Huffman code
: A classic way to do entropy coding where a smaller number of bits are
  used for more frequent codes.

LZ77
: Dictionary-based sliding window compression algorithm that either
  emits symbols or describes them as sequences of past symbols.

meta Huffman code
: A small integer (up to 16 bits) that indexes an element in the meta
  Huffman table.

predictor image
: A two-dimensional subresolution image indicating which spatial
  predictor is used for a particular square in the image.

prefix coding
: A way to entropy code larger integers that codes a few bits of the
  integer using an entropy code and codifies the remaining bits raw.
  This allows for the descriptions of the entropy codes to remain
  relatively small even when the range of symbols is large.

scan-line order
: A processing order of pixels, left-to-right, top-to-bottom, starting
  from the left-hand-top pixel, proceeding to the right. Once a row is
  completed, continue from the left-hand column of the next row.


1 Introduction
--------------

This document describes the compressed data representation of a WebP
lossless image. It is intended as a detailed reference for WebP lossless
encoder and decoder implementation.

In this document, we extensively use C programming language syntax to
describe the bitstream, and assume the existence of a function for
reading bits, `ReadBits(n)`. The bytes are read in the natural order of
the stream containing them, and bits of each byte are read in
least-significant-bit-first order. When multiple bits are read at the
same time, the integer is constructed from the original data in the
original order. The most significant bits of the returned integer are
also the most significant bits of the original data. Thus the statement

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
b = ReadBits(2);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

is equivalent to the two statements below:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
b = ReadBits(1);
b |= ReadBits(1) << 1;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
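
This least-significant-bit-first convention can be sketched as a small
bit reader over a byte buffer; the `BitReader` type and its fields are
our own names, not part of the format:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical bit-reader state: a byte buffer consumed in natural
 * byte order, bits within each byte least-significant-bit first. */
typedef struct {
  const uint8_t *buf;
  size_t bit_pos;  /* absolute bit position from the start of buf */
} BitReader;

/* Read n bits (n <= 24). Bit i of the result is taken from stream
 * position bit_pos + i, so later stream bits become more significant
 * bits of the returned integer, as required by the spec. */
static uint32_t ReadBits(BitReader *br, int n) {
  uint32_t v = 0;
  for (int i = 0; i < n; ++i) {
    uint32_t bit = (br->buf[br->bit_pos >> 3] >> (br->bit_pos & 7)) & 1;
    v |= bit << i;
    br->bit_pos++;
  }
  return v;
}
```

With this reader, `ReadBits(br, 2)` and the two single-bit reads above
produce the same value, as the text requires.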
   121  
   122  We assume that each color component (e.g. alpha, red, blue and green) is
   123  represented using an 8-bit byte. We define the corresponding type as
   124  uint8. A whole ARGB pixel is represented by a type called uint32, an
   125  unsigned integer consisting of 32 bits. In the code showing the behavior
   126  of the transformations, alpha value is codified in bits 31..24, red in
   127  bits 23..16, green in bits 15..8 and blue in bits 7..0, but
   128  implementations of the format are free to use another representation
   129  internally.
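
The channel layout above can be captured by extraction macros; the
`ALPHA`/`RED`/`GREEN`/`BLUE` names match those used in the `Select`
predictor later in this document, while `MakeARGB` is our own helper:

```c
#include <stdint.h>

typedef uint8_t uint8;
typedef uint32_t uint32;

/* Channel extraction for the bit layout used in this document:
 * alpha in bits 31..24, red 23..16, green 15..8, blue 7..0. */
#define ALPHA(p) (((p) >> 24) & 0xff)
#define RED(p)   (((p) >> 16) & 0xff)
#define GREEN(p) (((p) >> 8) & 0xff)
#define BLUE(p)  ((p) & 0xff)

/* Pack four 8-bit channels into one uint32 pixel. */
static uint32 MakeARGB(uint8 a, uint8 r, uint8 g, uint8 b) {
  return ((uint32)a << 24) | ((uint32)r << 16) | ((uint32)g << 8) | b;
}
```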

Broadly, a WebP lossless image contains header data, transform
information and actual image data. Headers contain the width and height
of the image. A WebP lossless image can go through four different types
of transformation before being entropy encoded. The transform
information in the bitstream contains the data required to apply the
respective inverse transforms.


2 RIFF Header
-------------

The beginning of the header has the RIFF container. This consists of the
following 21 bytes:

   1. String "RIFF"
   2. A little-endian 32-bit value of the block length, the whole size
      of the block controlled by the RIFF header. Normally this equals
      the payload size (file size minus 8 bytes: 4 bytes for the 'RIFF'
      identifier and 4 bytes for storing the value itself).
   3. String "WEBP" (RIFF container name).
   4. String "VP8L" (chunk tag for lossless encoded image data).
   5. A little-endian 32-bit value of the number of bytes in the
      lossless stream.
   6. A one-byte signature 0x2f.

The first 28 bits of the bitstream specify the width and height of the
image. Width and height are decoded as 14-bit integers as follows:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
int image_width = ReadBits(14) + 1;
int image_height = ReadBits(14) + 1;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The 14-bit dynamics for image size limit the maximum size of a WebP
lossless image to 16384✕16384 pixels.

The alpha_is_used bit is a hint only, and should not impact decoding.
It should be set to 0 when all alpha values are 255 in the picture, and
1 otherwise.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
int alpha_is_used = ReadBits(1);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The version_number is a 3-bit code that must be discarded by the decoder
at this time. Complying encoders write a 3-bit value 0.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
int version_number = ReadBits(3);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
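
Putting the pieces above together, a sketch of parsing the lossless
stream header might look as follows. The `VP8LHeader` struct, the
`GetBits` helper, and `ParseVP8LHeader` are our own names; the field
order (signature byte, width, height, alpha hint, version) follows the
text above:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical parsed-header record; not part of the format. */
typedef struct {
  int width, height, alpha_is_used, version_number;
} VP8LHeader;

/* Read n bits, least-significant-bit first, advancing *pos. */
static uint32_t GetBits(const uint8_t *buf, size_t *pos, int n) {
  uint32_t v = 0;
  for (int i = 0; i < n; ++i, ++*pos)
    v |= (uint32_t)((buf[*pos >> 3] >> (*pos & 7)) & 1) << i;
  return v;
}

/* Parse the signature byte and the 28 + 1 + 3 header bits that follow
 * it. Returns 0 on success, -1 on a bad signature. */
static int ParseVP8LHeader(const uint8_t *data, VP8LHeader *hdr) {
  if (data[0] != 0x2f) return -1;            /* one-byte signature */
  size_t pos = 8;                            /* skip signature byte */
  hdr->width = (int)GetBits(data, &pos, 14) + 1;
  hdr->height = (int)GetBits(data, &pos, 14) + 1;
  hdr->alpha_is_used = (int)GetBits(data, &pos, 1);
  hdr->version_number = (int)GetBits(data, &pos, 3);
  return 0;  /* version_number is discarded at this time */
}
```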


3 Transformations
-----------------

Transformations are reversible manipulations of the image data that can
reduce the remaining symbolic entropy by modeling spatial and color
correlations. Transformations can make the final compression denser.

An image can go through four types of transformation. A 1 bit indicates
the presence of a transform. Each transform is allowed to be used only
once. The transformations are used only for the main level ARGB image:
the subresolution images have no transforms, not even the 0 bit
indicating the end-of-transforms.

Typically an encoder would use these transforms to reduce the Shannon
entropy in the residual image. Also, the transform data can be decided
based on entropy minimization.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
while (ReadBits(1)) {  // Transform present.
  // Decode transform type.
  enum TransformType transform_type = ReadBits(2);
  // Decode transform data.
  ...
}

// Decode actual image data (Section 4).
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If a transform is present, then the next two bits specify the transform
type. There are four types of transforms.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
enum TransformType {
  PREDICTOR_TRANSFORM             = 0,
  COLOR_TRANSFORM                 = 1,
  SUBTRACT_GREEN                  = 2,
  COLOR_INDEXING_TRANSFORM        = 3,
};
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The transform type is followed by the transform data. Transform data
contains the information required to apply the inverse transform and
depends on the transform type. Next we describe the transform data for
the different types.


### Predictor Transform

The predictor transform can be used to reduce entropy by exploiting the
fact that neighboring pixels are often correlated. In the predictor
transform, the current pixel value is predicted from the pixels already
decoded (in scan-line order) and only the residual value (actual -
predicted) is encoded. The _prediction mode_ determines the type of
prediction to use. We divide the image into squares and all the pixels
in a square use the same prediction mode.

The first 3 bits of prediction data define the block width and height in
number of bits. The number of block columns, `block_xsize`, is used in
indexing two-dimensionally.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
int size_bits = ReadBits(3) + 2;
int block_width = (1 << size_bits);
int block_height = (1 << size_bits);
#define DIV_ROUND_UP(num, den) (((num) + (den) - 1) / (den))
int block_xsize = DIV_ROUND_UP(image_width, 1 << size_bits);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The transform data contains the prediction mode for each block of the
image. All the `block_width * block_height` pixels of a block use the
same prediction mode. The prediction modes are treated as pixels of an
image and encoded using the same techniques described in
[Chapter 4](#image-data).

For a pixel _x, y_, one can compute the respective filter block address
by:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
int block_index = (y >> size_bits) * block_xsize +
                  (x >> size_bits);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
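
As a worked example of this block addressing (the concrete numbers are
ours, for illustration): with `size_bits = 4` the blocks are 16✕16, so
an image of width 100 has `DIV_ROUND_UP(100, 16) = 7` block columns,
and pixel (33, 17) falls into block `(17 >> 4) * 7 + (33 >> 4) = 9`:

```c
/* Block addressing from the text above, as checkable functions. */
#define DIV_ROUND_UP(num, den) (((num) + (den) - 1) / (den))

static int BlockIndex(int x, int y, int size_bits, int block_xsize) {
  return (y >> size_bits) * block_xsize + (x >> size_bits);
}
```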

There are 14 different prediction modes. In each prediction mode, the
current pixel value is predicted from one or more neighboring pixels
whose values are already known.

We choose the neighboring pixels (TL, T, TR, and L) of the current pixel
(P) as follows:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
O    O    O    O    O    O    O    O    O    O    O
O    O    O    O    O    O    O    O    O    O    O
O    O    O    O    TL   T    TR   O    O    O    O
O    O    O    O    L    P    X    X    X    X    X
X    X    X    X    X    X    X    X    X    X    X
X    X    X    X    X    X    X    X    X    X    X
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

where TL means top-left, T top, TR top-right, L left pixel.
At the time of predicting a value for P, all pixels O, TL, T, TR and L
have already been processed, and pixel P and all pixels X are unknown.

Given the above neighboring pixels, the different prediction modes are
defined as follows.

| Mode   | Predicted value of each channel of the current pixel    |
| ------ | ------------------------------------------------------- |
|  0     | 0xff000000 (represents solid black color in ARGB)       |
|  1     | L                                                       |
|  2     | T                                                       |
|  3     | TR                                                      |
|  4     | TL                                                      |
|  5     | Average2(Average2(L, TR), T)                            |
|  6     | Average2(L, TL)                                         |
|  7     | Average2(L, T)                                          |
|  8     | Average2(TL, T)                                         |
|  9     | Average2(T, TR)                                         |
| 10     | Average2(Average2(L, TL), Average2(T, TR))              |
| 11     | Select(L, T, TL)                                        |
| 12     | ClampAddSubtractFull(L, T, TL)                          |
| 13     | ClampAddSubtractHalf(Average2(L, T), TL)                |


`Average2` is defined as follows for each ARGB component:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
uint8 Average2(uint8 a, uint8 b) {
  return (a + b) / 2;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Select predictor is defined as follows:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
uint32 Select(uint32 L, uint32 T, uint32 TL) {
  // L = left pixel, T = top pixel, TL = top left pixel.

  // ARGB component estimates for prediction.
  int pAlpha = ALPHA(L) + ALPHA(T) - ALPHA(TL);
  int pRed = RED(L) + RED(T) - RED(TL);
  int pGreen = GREEN(L) + GREEN(T) - GREEN(TL);
  int pBlue = BLUE(L) + BLUE(T) - BLUE(TL);

  // Manhattan distances to estimates for left and top pixels.
  int pL = abs(pAlpha - ALPHA(L)) + abs(pRed - RED(L)) +
           abs(pGreen - GREEN(L)) + abs(pBlue - BLUE(L));
  int pT = abs(pAlpha - ALPHA(T)) + abs(pRed - RED(T)) +
           abs(pGreen - GREEN(T)) + abs(pBlue - BLUE(T));

  // Return either left or top, the one closer to the prediction.
  if (pL <= pT) {
    return L;
  } else {
    return T;
  }
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The functions `ClampAddSubtractFull` and `ClampAddSubtractHalf` are
performed for each ARGB component as follows:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
// Clamp the input value between 0 and 255.
int Clamp(int a) {
  return (a < 0) ? 0 : (a > 255) ? 255 : a;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
int ClampAddSubtractFull(int a, int b, int c) {
  return Clamp(a + b - c);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
int ClampAddSubtractHalf(int a, int b) {
  return Clamp(a + (a - b) / 2);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
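
Since only the residual (actual - predicted) is encoded, the decoder
reconstructs each channel by adding the prediction back modulo 256. A
minimal sketch, assuming mode 12 for concreteness (the
`ReconstructChannel` helper is our own name):

```c
#include <stdint.h>

/* Clamp the input value between 0 and 255 (from the spec). */
static int Clamp(int a) {
  return (a < 0) ? 0 : (a > 255) ? 255 : a;
}

/* Mode 12 predictor (from the spec). */
static int ClampAddSubtractFull(int a, int b, int c) {
  return Clamp(a + b - c);
}

/* The encoder stored residual = (actual - predicted) mod 256, so the
 * decoder recovers actual = (residual + predicted) mod 256. */
static uint8_t ReconstructChannel(uint8_t residual, uint8_t predicted) {
  return (uint8_t)((residual + predicted) & 0xff);
}
```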

There are special handling rules for some border pixels. If there is a
predictor transform, then regardless of the mode \[0..13\] for these
pixels, the predicted value for the left-topmost pixel of the image is
0xff000000, the L-pixel for all pixels on the top row, and the T-pixel
for all pixels on the leftmost column.

Addressing the TR-pixel for pixels on the rightmost column is
exceptional. The pixels on the rightmost column are predicted by using
the modes \[0..13\] just like pixels not on the border, but by using the
leftmost pixel on the same row as the current TR-pixel. The TR-pixel
offset in memory is the same for border and non-border pixels.


### Color Transform

The goal of the color transform is to decorrelate the R, G and B values
of each pixel. The color transform keeps the green (G) value as it is,
transforms red (R) based on green, and transforms blue (B) based on
green and then based on red.

As is the case for the predictor transform, first the image is divided
into blocks and the same transform mode is used for all the pixels in a
block. For each block there are three types of color transform elements.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
typedef struct {
  uint8 green_to_red;
  uint8 green_to_blue;
  uint8 red_to_blue;
} ColorTransformElement;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The actual color transformation is done by defining a color transform
delta. The color transform delta depends on the `ColorTransformElement`,
which is the same for all the pixels in a particular block. The delta is
added during the color transform. The inverse color transform is then
just subtracting those deltas.

The color transform function is defined as follows:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
void ColorTransform(uint8 red, uint8 blue, uint8 green,
                    ColorTransformElement *trans,
                    uint8 *new_red, uint8 *new_blue) {
  // Transformed values of red and blue components
  uint32 tmp_red = red;
  uint32 tmp_blue = blue;

  // Applying transform is just adding the transform deltas
  tmp_red  += ColorTransformDelta(trans->green_to_red, green);
  tmp_blue += ColorTransformDelta(trans->green_to_blue, green);
  tmp_blue += ColorTransformDelta(trans->red_to_blue, red);

  *new_red = tmp_red & 0xff;
  *new_blue = tmp_blue & 0xff;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

`ColorTransformDelta` is computed using a signed 8-bit integer
representing a 3.5-fixed-point number, and a signed 8-bit RGB color
channel (c) \[-128..127\], and is defined as follows:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
int8 ColorTransformDelta(int8 t, int8 c) {
  return (t * c) >> 5;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The multiplication is to be done using more precision (with at least
16-bit dynamics). The sign extension property of the shift operation
does not matter here: only the lowest 8 bits are used from the result,
and there the sign extension shifting and unsigned shifting are
consistent with each other.
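
The precision remark can be demonstrated numerically (the concrete
values below are our own illustration): computing the product in 16-bit
signed precision with an arithmetic shift, and computing it on the
masked unsigned representation with a logical shift, agree in the low
8 bits:

```c
#include <stdint.h>

/* Delta computed with at least 16-bit signed precision, as the
 * spec requires; the cast back to int8 keeps the low 8 bits. */
static int8_t ColorTransformDelta(int8_t t, int8_t c) {
  return (int8_t)(((int)t * (int)c) >> 5);
}
```

For example, t = -128 and c = 127 give (-16256) >> 5 = -508, whose low
byte is 4; the same low byte falls out of the unsigned computation.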

Now we describe the contents of color transform data so that decoding
can apply the inverse color transform and recover the original red and
blue values. The first 3 bits of the color transform data contain the
width and height of the image block in number of bits, just like the
predictor transform:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
int size_bits = ReadBits(3) + 2;
int block_width = 1 << size_bits;
int block_height = 1 << size_bits;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The remaining part of the color transform data contains
`ColorTransformElement` instances corresponding to each block of the
image. `ColorTransformElement` instances are treated as pixels of an
image and encoded using the methods described in
[Chapter 4](#image-data).

During decoding, `ColorTransformElement` instances of the blocks are
decoded and the inverse color transform is applied on the ARGB values of
the pixels. As mentioned earlier, that inverse color transform is just
subtracting `ColorTransformElement` values from the red and blue
channels.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
void InverseTransform(uint8 red, uint8 green, uint8 blue,
                      ColorTransformElement *p,
                      uint8 *new_red, uint8 *new_blue) {
  // Applying the inverse transform is just subtracting the
  // color transform deltas
  red  -= ColorTransformDelta(p->green_to_red, green);
  blue -= ColorTransformDelta(p->green_to_blue, green);
  blue -= ColorTransformDelta(p->red_to_blue, red & 0xff);

  *new_red = red & 0xff;
  *new_blue = blue & 0xff;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
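
That the inverse transform exactly undoes the forward transform can be
checked with a self-contained round-trip sketch (function names
`Forward` and `Inverse` are ours); note that the inverse restores red
first, so its `red_to_blue` delta sees the same red value the forward
transform used:

```c
#include <stdint.h>

typedef struct {
  uint8_t green_to_red;
  uint8_t green_to_blue;
  uint8_t red_to_blue;
} ColorTransformElement;

static int ColorTransformDelta(int8_t t, int8_t c) {
  return ((int)t * (int)c) >> 5;
}

/* Forward transform: add the deltas, keep the low 8 bits. */
static void Forward(uint8_t r, uint8_t g, uint8_t b,
                    const ColorTransformElement *t,
                    uint8_t *nr, uint8_t *nb) {
  int red = r + ColorTransformDelta((int8_t)t->green_to_red, (int8_t)g);
  int blue = b + ColorTransformDelta((int8_t)t->green_to_blue, (int8_t)g)
               + ColorTransformDelta((int8_t)t->red_to_blue, (int8_t)r);
  *nr = (uint8_t)(red & 0xff);
  *nb = (uint8_t)(blue & 0xff);
}

/* Inverse transform: restore red first, then use it for blue. */
static void Inverse(uint8_t r, uint8_t g, uint8_t b,
                    const ColorTransformElement *t,
                    uint8_t *nr, uint8_t *nb) {
  int red = (r - ColorTransformDelta((int8_t)t->green_to_red, (int8_t)g))
            & 0xff;
  int blue = b - ColorTransformDelta((int8_t)t->green_to_blue, (int8_t)g)
               - ColorTransformDelta((int8_t)t->red_to_blue, (int8_t)red);
  *nr = (uint8_t)red;
  *nb = (uint8_t)(blue & 0xff);
}
```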


### Subtract Green Transform

The subtract green transform subtracts green values from red and blue
values of each pixel. When this transform is present, the decoder needs
to add the green value to both red and blue. There is no data associated
with this transform. The decoder applies the inverse transform as
follows:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
void AddGreenToBlueAndRed(uint8 green, uint8 *red, uint8 *blue) {
  *red  = (*red  + green) & 0xff;
  *blue = (*blue + green) & 0xff;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This transform is redundant as it can be modeled using the color
transform, but it is still often useful. Since it can extend the
dynamics of the color transform and there is no additional data here,
the subtract green transform can be coded using fewer bits than a
full-blown color transform.
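
The encoder-side counterpart (our own sketch, not part of the decoder)
simply subtracts the green value modulo 256, and the round trip is
exact:

```c
#include <stdint.h>

/* Encoder side: subtract green from red and blue, modulo 256. */
static void SubtractGreenFromBlueAndRed(uint8_t green,
                                        uint8_t *red, uint8_t *blue) {
  *red  = (uint8_t)((*red  - green) & 0xff);
  *blue = (uint8_t)((*blue - green) & 0xff);
}

/* Decoder side, as in the spec: add green back. */
static void AddGreenToBlueAndRed(uint8_t green,
                                 uint8_t *red, uint8_t *blue) {
  *red  = (uint8_t)((*red  + green) & 0xff);
  *blue = (uint8_t)((*blue + green) & 0xff);
}
```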


### Color Indexing Transform

If there are not many unique pixel values, it may be more efficient to
create a color index array and replace the pixel values by the array's
indices. The color indexing transform achieves this. (In the context of
WebP lossless, we specifically do not call this a palette transform
because a similar but more dynamic concept exists in WebP lossless
encoding: color cache.)

The color indexing transform checks for the number of unique ARGB values
in the image. If that number is below a threshold (256), it creates an
array of those ARGB values, which is then used to replace the pixel
values with the corresponding index: the green channel of each pixel is
replaced with the index; all alpha values are set to 255; all red and
blue values are set to 0.

The transform data contains the color table size and the entries in the
color table. The decoder reads the color indexing transform data as
follows:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
// 8 bit value for color table size
int color_table_size = ReadBits(8) + 1;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The color table is stored using the image storage format itself. The
color table can be obtained by reading an image, without the RIFF
header, image size, and transforms, assuming a height of one pixel and
a width of `color_table_size`. The color table is always
subtraction-coded to reduce image entropy. The deltas of palette colors
typically contain much less entropy than the colors themselves, leading
to significant savings for smaller images. In decoding, every final
color in the color table is obtained by adding the previous color's
component values to the decoded delta, for each ARGB component
separately, and storing the least significant 8 bits of the result.
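
The subtraction-coding can be undone with a cumulative per-channel sum;
a minimal sketch (the function name is ours):

```c
#include <stdint.h>

/* Undo the subtraction-coding of the color table in place: each final
 * color is the per-channel sum (mod 256) of the previous final color
 * and the decoded delta. */
static void InverseDeltaCodeColorTable(uint32_t *table, int size) {
  for (int i = 1; i < size; ++i) {
    uint32_t prev = table[i - 1], delta = table[i], color = 0;
    for (int shift = 0; shift < 32; shift += 8) {
      uint32_t sum = (((prev >> shift) & 0xff) +
                      ((delta >> shift) & 0xff)) & 0xff;
      color |= sum << shift;
    }
    table[i] = color;
  }
}
```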

The inverse transform for the image is simply replacing the pixel values
(which are indices to the color table) with the actual color table
values. The indexing is done based on the green component of the ARGB
color.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
// Inverse transform
argb = color_table[GREEN(argb)];
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When the color table is small (equal to or less than 16 colors), several
pixels are bundled into a single pixel. The pixel bundling packs several
(2, 4, or 8) pixels into a single pixel, reducing the image width
respectively. Pixel bundling allows for a more efficient joint
distribution entropy coding of neighboring pixels, and gives some
arithmetic coding-like benefits to the entropy code, but it can only be
used when there are a small number of unique values.

`color_table_size` specifies how many pixels are combined together:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
int width_bits;
if (color_table_size <= 2) {
  width_bits = 3;
} else if (color_table_size <= 4) {
  width_bits = 2;
} else if (color_table_size <= 16) {
  width_bits = 1;
} else {
  width_bits = 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

`width_bits` has a value of 0, 1, 2 or 3. A value of 0 indicates no
pixel bundling to be done for the image. A value of 1 indicates that two
pixels are combined together, and each pixel has a range of \[0..15\]. A
value of 2 indicates that four pixels are combined together, and each
pixel has a range of \[0..3\]. A value of 3 indicates that eight pixels
are combined together and each pixel has a range of \[0..1\], i.e., a
binary value.

The values are packed into the green component as follows:

  * `width_bits` = 1: for every x value where x ≡ 0 (mod 2), a green
    value at x is positioned into the 4 least-significant bits of the
    green value at x / 2, a green value at x + 1 is positioned into the
    4 most-significant bits of the green value at x / 2.
  * `width_bits` = 2: for every x value where x ≡ 0 (mod 4), a green
    value at x is positioned into the 2 least-significant bits of the
    green value at x / 4, green values at x + 1 to x + 3 in order to the
    more significant bits of the green value at x / 4.
  * `width_bits` = 3: for every x value where x ≡ 0 (mod 8), a green
    value at x is positioned into the least-significant bit of the green
    value at x / 8, green values at x + 1 to x + 7 in order to the more
    significant bits of the green value at x / 8.
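
The unpacking rule above can be sketched as a single lookup (function
and parameter names are ours): earlier pixels occupy less significant
bits of the packed green value:

```c
#include <stdint.h>

/* Extract the color-table index of original pixel x from a row of
 * bundled green values, for width_bits in {1, 2, 3}. */
static uint8_t GetBundledIndex(const uint8_t *green_row, int x,
                               int width_bits) {
  int pixels_per_value = 1 << width_bits;   /* 2, 4 or 8 */
  int bits_per_pixel = 8 >> width_bits;     /* 4, 2 or 1 */
  uint8_t packed = green_row[x / pixels_per_value];
  int shift = (x % pixels_per_value) * bits_per_pixel;
  return (uint8_t)((packed >> shift) & ((1 << bits_per_pixel) - 1));
}
```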


4 Image Data
------------

Image data is an array of pixel values in scan-line order.

### 4.1 Roles of Image Data

We use image data in five different roles:

  1. ARGB image: Stores the actual pixels of the image.
  1. Entropy image: Stores the
     [meta Huffman codes](#decoding-of-meta-huffman-codes). The red and green
     components of a pixel define the meta Huffman code used in a particular
     block of the ARGB image.
  1. Predictor image: Stores the metadata for [Predictor
     Transform](#predictor-transform). The green component of a pixel defines
     which of the 14 predictors is used within a particular block of the
     ARGB image.
  1. Color transform image. It is created by `ColorTransformElement` values
     (defined in [Color Transform](#color-transform)) for different blocks of
     the image. Each `ColorTransformElement` `'cte'` is treated as a pixel whose
     alpha component is `255`, red component is `cte.red_to_blue`, green
     component is `cte.green_to_blue` and blue component is `cte.green_to_red`.
  1. Color indexing image: An array of size `color_table_size` (up to 256
     ARGB values) storing the metadata for the
     [Color Indexing Transform](#color-indexing-transform). This is stored as an
     image of width `color_table_size` and height `1`.

### 4.2 Encoding of Image Data

The encoding of image data is independent of its role.

The image is first divided into a set of fixed-size blocks (typically 16x16
blocks). Each of these blocks is modeled using its own entropy codes. Also,
several blocks may share the same entropy codes.

**Rationale:** Storing an entropy code incurs a cost. This cost can be minimized
if statistically similar blocks share an entropy code, thereby storing that code
only once. For example, an encoder can find similar blocks by clustering them
using their statistical properties, or by repeatedly joining a pair of randomly
selected clusters when it reduces the overall amount of bits needed to encode
the image.

Each pixel is encoded using one of the three possible methods:

  1. Huffman coded literal: each channel (green, red, blue and alpha) is
     entropy-coded independently;
  2. LZ77 backward reference: a sequence of pixels are copied from elsewhere
     in the image; or
  3. Color cache code: using a short multiplicative hash code (color cache
     index) of a recently seen color.

The following sub-sections describe each of these in detail.
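
For orientation, the color cache index mentioned in method 3 is a
multiplicative hash of the full ARGB value; the constant below is the
one used by the format, and `cache_bits` selects the cache size
(`1 << cache_bits` entries):

```c
#include <stdint.h>

/* Multiplicative hash mapping an ARGB value to its color cache slot.
 * The multiplication wraps modulo 2^32; the top cache_bits bits of
 * the product form the index. */
static uint32_t ColorCacheIndex(uint32_t argb, int cache_bits) {
  return (0x1e35a7bdu * argb) >> (32 - cache_bits);
}
```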

#### 4.2.1 Huffman Coded Literals

The pixel is stored as Huffman coded values of green, red, blue and alpha (in
that order). See [this section](#decoding-entropy-coded-image-data) for details.

#### 4.2.2 LZ77 Backward Reference

Backward references are tuples of _length_ and _distance code_:

  * Length indicates how many pixels in scan-line order are to be copied.
  * Distance code is a number indicating the position of a previously seen
    pixel, from which the pixels are to be copied. The exact mapping is
    described [below](#distance-mapping).

The length and distance values are stored using **LZ77 prefix coding**.

LZ77 prefix coding divides large integer values into two parts: the _prefix
code_ and the _extra bits_: the prefix code is stored using an entropy code,
while the extra bits are stored as they are (without an entropy code).

**Rationale**: This approach reduces the storage requirement for the entropy
code. Also, large values are usually rare, and so extra bits would be used for
very few values in the image. Thus, this approach results in a better
compression overall.

The following table denotes the prefix codes and extra bits used for storing
different ranges of values.
   673  
   674  Note: The maximum backward reference length is limited to 4096. Hence, only the
   675  first 24 prefix codes (with the respective extra bits) are meaningful for length
   676  values. For distance values, however, all the 40 prefix codes are valid.
   677  
   678  | Value range     | Prefix code | Extra bits |
   679  | --------------- | ----------- | ---------- |
   680  | 1               | 0           | 0          |
   681  | 2               | 1           | 0          |
   682  | 3               | 2           | 0          |
   683  | 4               | 3           | 0          |
   684  | 5..6            | 4           | 1          |
   685  | 7..8            | 5           | 1          |
   686  | 9..12           | 6           | 2          |
   687  | 13..16          | 7           | 2          |
   688  | ...             | ...         | ...        |
   689  | 3072..4096      | 23          | 10         |
   690  | ...             | ...         | ...        |
   691  | 524289..786432  | 38          | 18         |
   692  | 786433..1048576 | 39          | 18         |
   693  
   694  The pseudocode to obtain a (length or distance) value from the prefix code is
   695  as follows:
   696  
   697  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   698  if (prefix_code < 4) {
   699    return prefix_code + 1;
   700  }
   701  int extra_bits = (prefix_code - 2) >> 1;
   702  int offset = (2 + (prefix_code & 1)) << extra_bits;
   703  return offset + ReadBits(extra_bits) + 1;
   704  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   705  
   706  **Distance Mapping:**
   707  {:#distance-mapping}
   708  
   709  As noted previously, distance code is a number indicating the position of a
   710  previously seen pixel, from which the pixels are to be copied. This sub-section
   711  defines the mapping between a distance code and the position of a previous
   712  pixel.
   713  
Distance codes larger than 120 denote the pixel distance in scan-line order,
offset by 120.
   716  
   717  The smallest distance codes \[1..120\] are special, and are reserved for a close
   718  neighborhood of the current pixel. This neighborhood consists of 120 pixels:
   719  
   720    * Pixels that are 1 to 7 rows above the current pixel, and are up to 8 columns
   721      to the left or up to 7 columns to the right of the current pixel. \[Total
   722      such pixels = `7 * (8 + 1 + 7) = 112`\].
  * Pixels that are in the same row as the current pixel, and are up to 8
    columns to the left of the current pixel. \[`8` such pixels\].
   725  
   726  The mapping between distance code `i` and the neighboring pixel offset
   727  `(xi, yi)` is as follows:
   728  
   729  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   730  (0, 1),  (1, 0),  (1, 1),  (-1, 1), (0, 2),  (2, 0),  (1, 2),  (-1, 2),
   731  (2, 1),  (-2, 1), (2, 2),  (-2, 2), (0, 3),  (3, 0),  (1, 3),  (-1, 3),
   732  (3, 1),  (-3, 1), (2, 3),  (-2, 3), (3, 2),  (-3, 2), (0, 4),  (4, 0),
   733  (1, 4),  (-1, 4), (4, 1),  (-4, 1), (3, 3),  (-3, 3), (2, 4),  (-2, 4),
   734  (4, 2),  (-4, 2), (0, 5),  (3, 4),  (-3, 4), (4, 3),  (-4, 3), (5, 0),
   735  (1, 5),  (-1, 5), (5, 1),  (-5, 1), (2, 5),  (-2, 5), (5, 2),  (-5, 2),
   736  (4, 4),  (-4, 4), (3, 5),  (-3, 5), (5, 3),  (-5, 3), (0, 6),  (6, 0),
   737  (1, 6),  (-1, 6), (6, 1),  (-6, 1), (2, 6),  (-2, 6), (6, 2),  (-6, 2),
   738  (4, 5),  (-4, 5), (5, 4),  (-5, 4), (3, 6),  (-3, 6), (6, 3),  (-6, 3),
   739  (0, 7),  (7, 0),  (1, 7),  (-1, 7), (5, 5),  (-5, 5), (7, 1),  (-7, 1),
   740  (4, 6),  (-4, 6), (6, 4),  (-6, 4), (2, 7),  (-2, 7), (7, 2),  (-7, 2),
   741  (3, 7),  (-3, 7), (7, 3),  (-7, 3), (5, 6),  (-5, 6), (6, 5),  (-6, 5),
   742  (8, 0),  (4, 7),  (-4, 7), (7, 4),  (-7, 4), (8, 1),  (8, 2),  (6, 6),
   743  (-6, 6), (8, 3),  (5, 7),  (-5, 7), (7, 5),  (-7, 5), (8, 4),  (6, 7),
   744  (-6, 7), (7, 6),  (-7, 6), (8, 5),  (7, 7),  (-7, 7), (8, 6),  (8, 7)
   745  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   746  
For example, distance code `1` indicates an offset of `(0, 1)` for the
neighboring pixel, that is, the pixel above the current pixel (0-pixel
difference in the X-direction and 1-pixel difference in the Y-direction).
Similarly, distance code `3` indicates an offset of `(1, 1)`, that is, the
pixel to the top-left of the current pixel.
   751  
The decoder can convert a distance code `i` to a scan-line order distance
`dist` as follows:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~
(xi, yi) = distance_map[i - 1]
dist = xi + yi * xsize
if (dist < 1) {
  dist = 1
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

where `distance_map` is the (zero-indexed) mapping noted above and `xsize` is
the width of the image in pixels.
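The conversion above can be sketched as follows. For brevity, only the first
eight entries of the 120-entry neighborhood map are included, and
`DistCodeToScanLineDist` is a hypothetical helper name.

```c
#include <assert.h>

/* Converts a distance code to a scan-line-order distance. Only the first
   eight (xi, yi) entries of the neighborhood map are included here; a real
   decoder would carry all 120. */
static int DistCodeToScanLineDist(int dist_code, int xsize) {
  static const int map[8][2] = {
    {0, 1}, {1, 0}, {1, 1}, {-1, 1}, {0, 2}, {2, 0}, {1, 2}, {-1, 2}
  };
  if (dist_code > 120) {
    return dist_code - 120;  /* plain scan-line distance, offset by 120 */
  }
  const int xi = map[dist_code - 1][0];
  const int yi = map[dist_code - 1][1];
  const int dist = xi + yi * xsize;
  return (dist < 1) ? 1 : dist;
}
```

With `xsize = 100`, code 1 maps to the pixel directly above (distance 100) and
code 3 to the top-left pixel (distance 101).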
   765  
   766  
   767  #### 4.2.3 Color Cache Coding
   768  
The color cache stores a set of colors that have been recently used in the
image.

**Rationale:** This way, recently used colors can sometimes be referred to
more efficiently than emitting them using the other two methods (described in
[4.2.1](#huffman-coded-literals) and [4.2.2](#lz77-backward-reference)).
   774  
   775  Color cache codes are stored as follows. First, there is a 1-bit value that
   776  indicates if the color cache is used. If this bit is 0, no color cache codes
   777  exist, and they are not transmitted in the Huffman code that decodes the green
   778  symbols and the length prefix codes. However, if this bit is 1, the color cache
   779  size is read next:
   780  
   781  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   782  int color_cache_code_bits = ReadBits(4);
   783  int color_cache_size = 1 << color_cache_code_bits;
   784  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   785  
`color_cache_code_bits` defines the size of the color cache as
`1 << color_cache_code_bits`. The range of allowed values for
`color_cache_code_bits` is \[1..11\]. Compliant decoders must indicate a
corrupted bitstream for other values.
   790  
   791  A color cache is an array of size `color_cache_size`. Each entry
   792  stores one ARGB color. Colors are looked up by indexing them by
   793  (0x1e35a7bd * `color`) >> (32 - `color_cache_code_bits`). Only one
   794  lookup is done in a color cache; there is no conflict resolution.
   795  
At the beginning of decoding or encoding of an image, all entries in the
color cache are set to zero. A color cache code is converted to this color at
decoding time. The state of the color cache is maintained by inserting every
pixel, whether produced by backward referencing or as a literal, into the
cache in the order it appears in the stream.
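The hash-based lookup described above can be sketched as follows;
`CacheIndex` is a hypothetical helper name, while the multiplier and shift
come from the format. The multiplication is performed modulo 2^32.

```c
#include <assert.h>
#include <stdint.h>

/* Maps an ARGB color to its slot in the color cache. Unsigned 32-bit
   arithmetic wraps, giving the multiplicative hash described above. */
static int CacheIndex(uint32_t argb, int color_cache_code_bits) {
  return (int)((0x1e35a7bdU * argb) >> (32 - color_cache_code_bits));
}
```

Since there is no conflict resolution, two colors hashing to the same slot
simply overwrite each other; an encoder emits a cache code only when the slot
currently holds the intended color.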
   801  
   802  
   803  5 Entropy Code
   804  --------------
   805  
   806  ### 5.1 Overview
   807  
   808  Most of the data is coded using [canonical Huffman code][canonical_huff]. Hence,
   809  the codes are transmitted by sending the _Huffman code lengths_, as opposed to
   810  the actual _Huffman codes_.
   811  
   812  In particular, the format uses **spatially-variant Huffman coding**. In other
   813  words, different blocks of the image can potentially use different entropy
   814  codes.
   815  
**Rationale**: Different areas of the image may have different
characteristics. Allowing them to use different entropy codes provides more
flexibility and potentially better compression.
   818  
   819  ### 5.2 Details
   820  
   821  The encoded image data consists of two parts:
   822  
   823    1. Meta Huffman codes
   824    1. Entropy-coded image data
   825  
   826  #### 5.2.1 Decoding of Meta Huffman Codes
   827  
   828  As noted earlier, the format allows the use of different Huffman codes for
   829  different blocks of the image. _Meta Huffman codes_ are indexes identifying
   830  which Huffman codes to use in different parts of the image.
   831  
   832  Meta Huffman codes may be used _only_ when the image is being used in the
   833  [role](#roles-of-image-data) of an _ARGB image_.
   834  
   835  There are two possibilities for the meta Huffman codes, indicated by a 1-bit
   836  value:
   837  
   838    * If this bit is zero, there is only one meta Huffman code used everywhere in
   839      the image. No more data is stored.
   840    * If this bit is one, the image uses multiple meta Huffman codes. These meta
   841      Huffman codes are stored as an _entropy image_ (described below).
   842  
   843  **Entropy image:**
   844  
   845  The entropy image defines which Huffman codes are used in different parts of the
   846  image, as described below.
   847  
The first 3 bits contain the `huffman_bits` value. The dimensions of the
entropy image are derived from `huffman_bits`:
   850  
   851  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   852  int huffman_bits = ReadBits(3) + 2;
   853  int huffman_xsize = DIV_ROUND_UP(xsize, 1 << huffman_bits);
   854  int huffman_ysize = DIV_ROUND_UP(ysize, 1 << huffman_bits);
   855  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   856  
   857  where `DIV_ROUND_UP` is as defined [earlier](#predictor-transform).
   858  
The next bits contain an entropy image of width `huffman_xsize` and height
`huffman_ysize`.
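As a worked example of the computation above, consider a hypothetical 512x300
image with `huffman_bits = 4`, i.e., 16x16 blocks:

```c
#include <assert.h>

/* DIV_ROUND_UP as defined in the predictor transform section. */
#define DIV_ROUND_UP(num, den) (((num) + (den) - 1) / (den))

/* Number of entropy-image cells along one dimension of the ARGB image. */
static int HuffmanImageSize(int size, int huffman_bits) {
  return DIV_ROUND_UP(size, 1 << huffman_bits);
}
```

A 512x300 image with `huffman_bits = 4` yields a 32x19 entropy image; the
last row of blocks is only partially covered.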
   861  
   862  **Interpretation of Meta Huffman Codes:**
   863  
   864  For any given pixel (x, y), there is a set of five Huffman codes associated with
   865  it. These codes are (in bitstream order):
   866  
  * **Huffman code #1**: used for the green channel, backward-reference length,
    and color cache.
  * **Huffman codes #2, #3, and #4**: used for the red, blue, and alpha
    channels, respectively.
  * **Huffman code #5**: used for the backward-reference distance.
   872  
   873  From here on, we refer to this set as a **Huffman code group**.
   874  
   875  The number of Huffman code groups in the ARGB image can be obtained by finding
   876  the _largest meta Huffman code_ from the entropy image:
   877  
   878  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   879  int num_huff_groups = max(entropy image) + 1;
   880  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

where `max(entropy image)` indicates the largest Huffman code stored in the
entropy image.
   883  
As each Huffman code group contains five Huffman codes, the total number of
Huffman codes is:
   886  
   887  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   888  int num_huff_codes = 5 * num_huff_groups;
   889  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   890  
   891  Given a pixel (x, y) in the ARGB image, we can obtain the corresponding Huffman
   892  codes to be used as follows:
   893  
   894  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   895  int position = (y >> huffman_bits) * huffman_xsize + (x >> huffman_bits);
int meta_huff_code = (entropy_image[position] >> 8) & 0xffff;
   897  HuffmanCodeGroup huff_group = huffman_code_groups[meta_huff_code];
   898  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   899  
where we have assumed the existence of a `HuffmanCodeGroup` structure, which
represents a set of five Huffman codes. Also, `huffman_code_groups` is an
array of `HuffmanCodeGroup` (of size `num_huff_groups`).
   903  
   904  The decoder then uses Huffman code group `huff_group` to decode the pixel
   905  (x, y) as explained in the [next section](#decoding-entropy-coded-image-data).
   906  
   907  #### 5.2.2 Decoding Entropy-coded Image Data
   908  
   909  For the current position (x, y) in the image, the decoder first identifies the
   910  corresponding Huffman code group (as explained in the last section). Given the
   911  Huffman code group, the pixel is read and decoded as follows:
   912  
Read the next symbol S from the bitstream using Huffman code #1. \[See
[next section](#decoding-the-code-lengths) for details on decoding the Huffman
code lengths\]. Note that S is any integer in the range `0` to
`(256 + 24 + ` [`color_cache_size`](#color-cache-code) `- 1)`.
   917  
   918  The interpretation of S depends on its value:
   919  
   920    1. if S < 256
   921       1. Use S as the green component
   922       1. Read red from the bitstream using Huffman code #2
   923       1. Read blue from the bitstream using Huffman code #3
   924       1. Read alpha from the bitstream using Huffman code #4
  1. if S >= 256 and S < 256 + 24
   926       1. Use S - 256 as a length prefix code
   927       1. Read extra bits for length from the bitstream
   928       1. Determine backward-reference length L from length prefix code and the
   929          extra bits read.
   930       1. Read distance prefix code from the bitstream using Huffman code #5
   931       1. Read extra bits for distance from the bitstream
   932       1. Determine backward-reference distance D from distance prefix code and
   933          the extra bits read.
     1. Copy L pixels (in scan-line order), starting from the pixel that is D
        pixels before the current position.
   936    1. if S >= 256 + 24
   937       1. Use S - (256 + 24) as the index into the color cache.
   938       1. Get ARGB color from the color cache at that index.
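The three-way dispatch above can be sketched as a classification of S (the
branch bodies need bitstream access and are omitted); `SymbolClass` and
`ClassifySymbol` are hypothetical names, not part of the format.

```c
#include <assert.h>

typedef enum { LITERAL, BACKWARD_REF, CACHE_INDEX } SymbolClass;

/* S is already known to lie in [0, 256 + 24 + color_cache_size - 1]. */
static SymbolClass ClassifySymbol(int S) {
  if (S < 256) return LITERAL;           /* S is the green component */
  if (S < 256 + 24) return BACKWARD_REF; /* S - 256 is a length prefix code */
  return CACHE_INDEX;                    /* S - 280 indexes the color cache */
}
```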
   939  
   940  
   941  **Decoding the Code Lengths:**
   942  {:#decoding-the-code-lengths}
   943  
This section describes the details of decoding the Huffman code lengths,
which are needed to read a symbol from the bitstream.
   946  
   947  The Huffman code lengths can be coded in two ways. The method used is specified
   948  by a 1-bit value.
   949  
   950    * If this bit is 1, it is a _simple code length code_, and
   951    * If this bit is 0, it is a _normal code length code_.
   952  
   953  **(i) Simple Code Length Code:**
   954  
This variant is used in the special case when only 1 or 2 Huffman symbols are
present, with symbol values in the range \[0, 255\]. All other Huffman code
lengths are implicitly zeros.

The first bit indicates the number of symbols:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
int num_symbols = ReadBits(1) + 1;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The first symbol is stored either using a 1-bit code for values of 0 and 1,
or using an 8-bit code for values in the range \[0, 255\]. The second symbol,
when present, is always coded as an 8-bit value.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
int is_first_8bits = ReadBits(1);
symbols[0] = ReadBits(1 + 7 * is_first_8bits);
if (num_symbols == 2) {
  symbols[1] = ReadBits(8);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   976  
   977  **Note:** Another special case is when _all_ Huffman code lengths are _zeros_
   978  (an empty Huffman code). For example, a Huffman code for distance can be empty
   979  if there are no backward references. Similarly, Huffman codes for alpha, red,
   980  and blue can be empty if all pixels within the same meta Huffman code are
produced using the color cache. However, this case doesn't need special
handling, as empty Huffman codes can be coded as those containing a single
symbol `0`.
   984  
   985  **(ii) Normal Code Length Code:**
   986  
   987  The code lengths of a Huffman code are read as follows: `num_code_lengths`
   988  specifies the number of code lengths; the rest of the code lengths
   989  (according to the order in `kCodeLengthCodeOrder`) are zeros.
   990  
   991  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   992  int kCodeLengthCodes = 19;
   993  int kCodeLengthCodeOrder[kCodeLengthCodes] = {
   994    17, 18, 0, 1, 2, 3, 4, 5, 16, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
   995  };
   996  int code_lengths[kCodeLengthCodes] = { 0 };  // All zeros.
   997  int num_code_lengths = 4 + ReadBits(4);
   998  for (i = 0; i < num_code_lengths; ++i) {
   999    code_lengths[kCodeLengthCodeOrder[i]] = ReadBits(3);
  1000  }
  1001  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  1002  
  1003    * Code length code \[0..15\] indicates literal code lengths.
  1004      * Value 0 means no symbols have been coded.
  1005      * Values \[1..15\] indicate the bit length of the respective code.
  1006    * Code 16 repeats the previous non-zero value \[3..6\] times, i.e.,
  1007      3 + `ReadBits(2)` times.  If code 16 is used before a non-zero
  1008      value has been emitted, a value of 8 is repeated.
  * Code 17 emits a streak of zeros of length \[3..10\], i.e.,
    3 + `ReadBits(3)` times.
  1011    * Code 18 emits a streak of zeros of length \[11..138\], i.e.,
  1012      11 + `ReadBits(7)` times.
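The repeat rules above can be sketched as an expansion into a flat
code-length array. For clarity, each code's extra-bits value is supplied
alongside it rather than read from the bitstream; `ExpandCodeLengths` is a
hypothetical helper name.

```c
#include <assert.h>

/* Expands a sequence of code-length codes (with their extra-bits values)
   into literal code lengths, per the rules for codes 16, 17, and 18. */
static int ExpandCodeLengths(const int* codes, const int* extras, int n,
                             int* out) {
  int pos = 0;
  int prev = 8;  /* default if code 16 appears before any non-zero length */
  for (int i = 0; i < n; ++i) {
    if (codes[i] < 16) {
      out[pos++] = codes[i];
      if (codes[i] != 0) prev = codes[i];
    } else if (codes[i] == 16) {
      for (int j = 0; j < 3 + extras[i]; ++j) out[pos++] = prev;  /* 3..6  */
    } else if (codes[i] == 17) {
      for (int j = 0; j < 3 + extras[i]; ++j) out[pos++] = 0;     /* 3..10 */
    } else {  /* code 18 */
      for (int j = 0; j < 11 + extras[i]; ++j) out[pos++] = 0;    /* 11..138 */
    }
  }
  return pos;  /* number of code lengths produced */
}
```

For example, the sequence `5`, `16` (extra bits `1`), `17` (extra bits `0`)
expands to one length of 5, four repeats of 5, and three zeros.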
  1013  
  1014  
  1015  6 Overall Structure of the Format
  1016  ---------------------------------
  1017  
  1018  Below is a view into the format in Backus-Naur form. It does not cover
  1019  all details. End-of-image (EOI) is only implicitly coded into the number
  1020  of pixels (xsize * ysize).
  1021  
  1022  
  1023  #### Basic Structure
  1024  
  1025  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  1026  <format> ::= <RIFF header><image size><image stream>
  1027  <image stream> ::= <optional-transform><spatially-coded image>
  1028  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  1029  
  1030  
  1031  #### Structure of Transforms
  1032  
  1033  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  1034  <optional-transform> ::= (1-bit value 1; <transform> <optional-transform>) |
  1035                           1-bit value 0
  1036  <transform> ::= <predictor-tx> | <color-tx> | <subtract-green-tx> |
  1037                  <color-indexing-tx>
  1038  <predictor-tx> ::= 2-bit value 0; <predictor image>
  1039  <predictor image> ::= 3-bit sub-pixel code ; <entropy-coded image>
  1040  <color-tx> ::= 2-bit value 1; <color image>
  1041  <color image> ::= 3-bit sub-pixel code ; <entropy-coded image>
  1042  <subtract-green-tx> ::= 2-bit value 2
  1043  <color-indexing-tx> ::= 2-bit value 3; <color-indexing image>
  1044  <color-indexing image> ::= 8-bit color count; <entropy-coded image>
  1045  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  1046  
  1047  
  1048  #### Structure of the Image Data
  1049  
  1050  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  1051  <spatially-coded image> ::= <meta huffman><entropy-coded image>
  1052  <entropy-coded image> ::= <color cache info><huffman codes><lz77-coded image>
  1053  <meta huffman> ::= 1-bit value 0 |
  1054                     (1-bit value 1; <entropy image>)
  1055  <entropy image> ::= 3-bit subsample value; <entropy-coded image>
<color cache info> ::= 1-bit value 0 |
  1057                         (1-bit value 1; 4-bit value for color cache size)
  1058  <huffman codes> ::= <huffman code group> | <huffman code group><huffman codes>
  1059  <huffman code group> ::= <huffman code><huffman code><huffman code>
  1060                           <huffman code><huffman code>
  1061                           See "Interpretation of Meta Huffman codes" to
  1062                           understand what each of these five Huffman codes are
  1063                           for.
  1064  <huffman code> ::= <simple huffman code> | <normal huffman code>
  1065  <simple huffman code> ::= see "Simple code length code" for details
  1066  <normal huffman code> ::= <code length code>; encoded code lengths
  1067  <code length code> ::= see section "Normal code length code"
  1068  <lz77-coded image> ::= ((<argb-pixel> | <lz77-copy> | <color-cache-code>)
  1069                         <lz77-coded image>) | ""
  1070  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  1071  
  1072  A possible example sequence:
  1073  
  1074  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  1075  <RIFF header><image size>1-bit value 1<subtract-green-tx>
  1076  1-bit value 1<predictor-tx>1-bit value 0<meta huffman>
  1077  <color cache info><huffman codes>
  1078  <lz77-coded image>
  1079  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  1080  
  1081  [canonical_huff]: http://en.wikipedia.org/wiki/Canonical_Huffman_code