github.com/Konstantin8105/c4go@v0.0.0-20240505174241-768bb1c65a51/tests/raylib/external/stb_image.h (about)

     1  /* stb_image - v2.27 - public domain image loader - http://nothings.org/stb
     2                                    no warranty implied; use at your own risk
     3  
     4     Do this:
     5        #define STB_IMAGE_IMPLEMENTATION
     6     before you include this file in *one* C or C++ file to create the implementation.
     7  
     8     // i.e. it should look like this:
     9     #include ...
    10     #include ...
    11     #include ...
    12     #define STB_IMAGE_IMPLEMENTATION
    13     #include "stb_image.h"
    14  
    15     You can #define STBI_ASSERT(x) before the #include to avoid using assert.h.
    16     And #define STBI_MALLOC, STBI_REALLOC, and STBI_FREE to avoid using malloc,realloc,free
    17  
    18  
    19     QUICK NOTES:
    20        Primarily of interest to game developers and other people who can
    21            avoid problematic images and only need the trivial interface
    22  
    23        JPEG baseline & progressive (12 bpc/arithmetic not supported, same as stock IJG lib)
    24        PNG 1/2/4/8/16-bit-per-channel
    25  
    26        TGA (not sure what subset, if a subset)
    27        BMP non-1bpp, non-RLE
    28        PSD (composited view only, no extra channels, 8/16 bit-per-channel)
    29  
    30        GIF (*comp always reports as 4-channel)
    31        HDR (radiance rgbE format)
    32        PIC (Softimage PIC)
    33        PNM (PPM and PGM binary only)
    34  
    35        Animated GIF still needs a proper API, but here's one way to do it:
    36            http://gist.github.com/urraka/685d9a6340b26b830d49
    37  
    38        - decode from memory or through FILE (define STBI_NO_STDIO to remove code)
    39        - decode from arbitrary I/O callbacks
    40        - SIMD acceleration on x86/x64 (SSE2) and ARM (NEON)
    41  
    42     Full documentation under "DOCUMENTATION" below.
    43  
    44  
    45  LICENSE
    46  
    47    See end of file for license information.
    48  
    49  RECENT REVISION HISTORY:
    50  
    51        2.27  (2021-07-11) document stbi_info better, 16-bit PNM support, bug fixes
    52        2.26  (2020-07-13) many minor fixes
    53        2.25  (2020-02-02) fix warnings
    54        2.24  (2020-02-02) fix warnings; thread-local failure_reason and flip_vertically
    55        2.23  (2019-08-11) fix clang static analysis warning
    56        2.22  (2019-03-04) gif fixes, fix warnings
    57        2.21  (2019-02-25) fix typo in comment
    58        2.20  (2019-02-07) support utf8 filenames in Windows; fix warnings and platform ifdefs
    59        2.19  (2018-02-11) fix warning
    60        2.18  (2018-01-30) fix warnings
    61        2.17  (2018-01-29) bugfix, 1-bit BMP, 16-bitness query, fix warnings
    62        2.16  (2017-07-23) all functions have 16-bit variants; optimizations; bugfixes
    63        2.15  (2017-03-18) fix png-1,2,4; all Imagenet JPGs; no runtime SSE detection on GCC
    64        2.14  (2017-03-03) remove deprecated STBI_JPEG_OLD; fixes for Imagenet JPGs
    65        2.13  (2016-12-04) experimental 16-bit API, only for PNG so far; fixes
    66        2.12  (2016-04-02) fix typo in 2.11 PSD fix that caused crashes
    67        2.11  (2016-04-02) 16-bit PNGS; enable SSE2 in non-gcc x64
    68                           RGB-format JPEG; remove white matting in PSD;
    69                           allocate large structures on the stack;
    70                           correct channel count for PNG & BMP
    71        2.10  (2016-01-22) avoid warning introduced in 2.09
    72        2.09  (2016-01-16) 16-bit TGA; comments in PNM files; STBI_REALLOC_SIZED
    73  
    74     See end of file for full revision history.
    75  
    76  
    77   ============================    Contributors    =========================
    78  
    79   Image formats                          Extensions, features
    80      Sean Barrett (jpeg, png, bmp)          Jetro Lauha (stbi_info)
    81      Nicolas Schulz (hdr, psd)              Martin "SpartanJ" Golini (stbi_info)
    82      Jonathan Dummer (tga)                  James "moose2000" Brown (iPhone PNG)
    83      Jean-Marc Lienher (gif)                Ben "Disch" Wenger (io callbacks)
    84      Tom Seddon (pic)                       Omar Cornut (1/2/4-bit PNG)
    85      Thatcher Ulrich (psd)                  Nicolas Guillemot (vertical flip)
    86      Ken Miller (pgm, ppm)                  Richard Mitton (16-bit PSD)
    87      github:urraka (animated gif)           Junggon Kim (PNM comments)
    88      Christopher Forseth (animated gif)     Daniel Gibson (16-bit TGA)
    89                                             socks-the-fox (16-bit PNG)
    90                                             Jeremy Sawicki (handle all ImageNet JPGs)
    91   Optimizations & bugfixes                  Mikhail Morozov (1-bit BMP)
    92      Fabian "ryg" Giesen                    Anael Seghezzi (is-16-bit query)
    93      Arseny Kapoulkine                      Simon Breuss (16-bit PNM)
    94      John-Mark Allen
    95      Carmelo J Fdez-Aguera
    96  
    97   Bug & warning fixes
    98      Marc LeBlanc            David Woo          Guillaume George     Martins Mozeiko
    99      Christpher Lloyd        Jerry Jansson      Joseph Thomson       Blazej Dariusz Roszkowski
   100      Phil Jordan                                Dave Moore           Roy Eltham
   101      Hayaki Saito            Nathan Reed        Won Chun
   102      Luke Graham             Johan Duparc       Nick Verigakis       the Horde3D community
   103      Thomas Ruf              Ronny Chevalier                         github:rlyeh
   104      Janez Zemva             John Bartholomew   Michal Cichon        github:romigrou
   105      Jonathan Blow           Ken Hamada         Tero Hanninen        github:svdijk
   106      Eugene Golushkov        Laurent Gomila     Cort Stratton        github:snagar
   107      Aruelien Pocheville     Sergio Gonzalez    Thibault Reuille     github:Zelex
   108      Cass Everitt            Ryamond Barbiero                        github:grim210
   109      Paul Du Bois            Engin Manap        Aldo Culquicondor    github:sammyhw
   110      Philipp Wiesemann       Dale Weiler        Oriol Ferrer Mesia   github:phprus
   111      Josh Tobin                                 Matthew Gregan       github:poppolopoppo
   112      Julian Raschke          Gregory Mullen     Christian Floisand   github:darealshinji
   113      Baldur Karlsson         Kevin Schmidt      JR Smith             github:Michaelangel007
   114                              Brad Weinberger    Matvey Cherevko      github:mosra
   115      Luca Sas                Alexander Veselov  Zack Middleton       [reserved]
   116      Ryan C. Gordon          [reserved]                              [reserved]
   117                       DO NOT ADD YOUR NAME HERE
   118  
   119                       Jacko Dirks
   120  
   121    To add your name to the credits, pick a random blank space in the middle and fill it.
   122    80% of merge conflicts on stb PRs are due to people adding their name at the end
   123    of the credits.
   124  */
   125  
   126  #ifndef STBI_INCLUDE_STB_IMAGE_H
   127  #define STBI_INCLUDE_STB_IMAGE_H
   128  
   129  // DOCUMENTATION
   130  //
   131  // Limitations:
   132  //    - no 12-bit-per-channel JPEG
   133  //    - no JPEGs with arithmetic coding
   134  //    - GIF always returns *comp=4
   135  //
   136  // Basic usage (see HDR discussion below for HDR usage):
   137  //    int x,y,n;
   138  //    unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
   139  //    // ... process data if not NULL ...
   140  //    // ... x = width, y = height, n = # 8-bit components per pixel ...
   141  //    // ... replace '0' with '1'..'4' to force that many components per pixel
   142  //    // ... but 'n' will always be the number that it would have been if you said 0
   143  //    stbi_image_free(data)
   144  //
   145  // Standard parameters:
   146  //    int *x                 -- outputs image width in pixels
   147  //    int *y                 -- outputs image height in pixels
   148  //    int *channels_in_file  -- outputs # of image components in image file
   149  //    int desired_channels   -- if non-zero, # of image components requested in result
   150  //
   151  // The return value from an image loader is an 'unsigned char *' which points
   152  // to the pixel data, or NULL on an allocation failure or if the image is
   153  // corrupt or invalid. The pixel data consists of *y scanlines of *x pixels,
   154  // with each pixel consisting of N interleaved 8-bit components; the first
   155  // pixel pointed to is top-left-most in the image. There is no padding between
   156  // image scanlines or between pixels, regardless of format. The number of
   157  // components N is 'desired_channels' if desired_channels is non-zero, or
   158  // *channels_in_file otherwise. If desired_channels is non-zero,
   159  // *channels_in_file has the number of components that _would_ have been
   160  // output otherwise. E.g. if you set desired_channels to 4, you will always
   161  // get RGBA output, but you can check *channels_in_file to see if it's trivially
   162  // opaque because e.g. there were only 3 channels in the source image.
   163  //
   164  // An output image with N components has the following components interleaved
   165  // in this order in each pixel:
   166  //
   167  //     N=#comp     components
   168  //       1           grey
   169  //       2           grey, alpha
   170  //       3           red, green, blue
   171  //       4           red, green, blue, alpha
   172  //
   173  // If image loading fails for any reason, the return value will be NULL,
   174  // and *x, *y, *channels_in_file will be unchanged. The function
   175  // stbi_failure_reason() can be queried for an extremely brief, end-user
   176  // unfriendly explanation of why the load failed. Define STBI_NO_FAILURE_STRINGS
   177  // to avoid compiling these strings at all, and STBI_FAILURE_USERMSG to get slightly
   178  // more user-friendly ones.
   179  //
   180  // Paletted PNG, BMP, GIF, and PIC images are automatically depalettized.
   181  //
   182  // To query the width, height and component count of an image without having to
   183  // decode the full file, you can use the stbi_info family of functions:
   184  //
   185  //   int x,y,n,ok;
   186  //   ok = stbi_info(filename, &x, &y, &n);
   187  //   // returns ok=1 and sets x, y, n if image is a supported format,
   188  //   // 0 otherwise.
   189  //
   190  // Note that stb_image pervasively uses ints in its public API for sizes,
   191  // including sizes of memory buffers. This is now part of the API and thus
   192  // hard to change without causing breakage. As a result, the various image
   193  // loaders all have certain limits on image size; these differ somewhat
   194  // by format but generally boil down to either just under 2GB or just under
   195  // 1GB. When the decoded image would be larger than this, stb_image decoding
   196  // will fail.
   197  //
   198  // Additionally, stb_image will reject image files that have any of their
   199  // dimensions set to a larger value than the configurable STBI_MAX_DIMENSIONS,
   200  // which defaults to 2**24 = 16777216 pixels. Due to the above memory limit,
   201  // the only way to have an image with such dimensions load correctly
   202  // is for it to have a rather extreme aspect ratio. Either way, the
   203  // assumption here is that such larger images are likely to be malformed
   204  // or malicious. If you do need to load an image with individual dimensions
   205  // larger than that, and it still fits in the overall size limit, you can
   206  // #define STBI_MAX_DIMENSIONS on your own to be something larger.
   207  //
   208  // ===========================================================================
   209  //
   210  // UNICODE:
   211  //
   212  //   If compiling for Windows and you wish to use Unicode filenames, compile
   213  //   with
   214  //       #define STBI_WINDOWS_UTF8
   215  //   and pass utf8-encoded filenames. Call stbi_convert_wchar_to_utf8 to convert
   216  //   Windows wchar_t filenames to utf8.
   217  //
   218  // ===========================================================================
   219  //
   220  // Philosophy
   221  //
   222  // stb libraries are designed with the following priorities:
   223  //
   224  //    1. easy to use
   225  //    2. easy to maintain
   226  //    3. good performance
   227  //
   228  // Sometimes I let "good performance" creep up in priority over "easy to maintain",
   229  // and for best performance I may provide less-easy-to-use APIs that give higher
   230  // performance, in addition to the easy-to-use ones. Nevertheless, it's important
   231  // to keep in mind that from the standpoint of you, a client of this library,
   232  // all you care about is #1 and #3, and stb libraries DO NOT emphasize #3 above all.
   233  //
   234  // Some secondary priorities arise directly from the first two, some of which
   235  // provide more explicit reasons why performance can't be emphasized.
   236  //
   237  //    - Portable ("ease of use")
   238  //    - Small source code footprint ("easy to maintain")
   239  //    - No dependencies ("ease of use")
   240  //
   241  // ===========================================================================
   242  //
   243  // I/O callbacks
   244  //
   245  // I/O callbacks allow you to read from arbitrary sources, like packaged
   246  // files or some other source. Data read from callbacks are processed
   247  // through a small internal buffer (currently 128 bytes) to try to reduce
   248  // overhead.
   249  //
   250  // The three functions you must define are "read" (reads some bytes of data),
   251  // "skip" (skips some bytes of data), "eof" (reports if the stream is at the end).
   252  //
   253  // ===========================================================================
   254  //
   255  // SIMD support
   256  //
   257  // The JPEG decoder will try to automatically use SIMD kernels on x86 when
   258  // supported by the compiler. For ARM Neon support, you must explicitly
   259  // request it.
   260  //
   261  // (The old do-it-yourself SIMD API is no longer supported in the current
   262  // code.)
   263  //
   264  // On x86, SSE2 will automatically be used when available based on a run-time
   265  // test; if not, the generic C versions are used as a fall-back. On ARM targets,
   266  // the typical path is to have separate builds for NEON and non-NEON devices
   267  // (at least this is true for iOS and Android). Therefore, the NEON support is
   268  // toggled by a build flag: define STBI_NEON to get NEON loops.
   269  //
   270  // If for some reason you do not want to use any of SIMD code, or if
   271  // you have issues compiling it, you can disable it entirely by
   272  // defining STBI_NO_SIMD.
   273  //
   274  // ===========================================================================
   275  //
   276  // HDR image support   (disable by defining STBI_NO_HDR)
   277  //
   278  // stb_image supports loading HDR images in general, and currently the Radiance
   279  // .HDR file format specifically. You can still load any file through the existing
   280  // interface; if you attempt to load an HDR file, it will be automatically remapped
   281  // to LDR, assuming gamma 2.2 and an arbitrary scale factor defaulting to 1;
   282  // both of these constants can be reconfigured through this interface:
   283  //
   284  //     stbi_hdr_to_ldr_gamma(2.2f);
   285  //     stbi_hdr_to_ldr_scale(1.0f);
   286  //
   287  // (note, do not use _inverse_ constants; stbi_image will invert them
   288  // appropriately).
   289  //
   290  // Additionally, there is a new, parallel interface for loading files as
   291  // (linear) floats to preserve the full dynamic range:
   292  //
   293  //    float *data = stbi_loadf(filename, &x, &y, &n, 0);
   294  //
   295  // If you load LDR images through this interface, those images will
   296  // be promoted to floating point values, run through the inverse of
   297  // constants corresponding to the above:
   298  //
   299  //     stbi_ldr_to_hdr_scale(1.0f);
   300  //     stbi_ldr_to_hdr_gamma(2.2f);
   301  //
   302  // Finally, given a filename (or an open file or memory block--see header
   303  // file for details) containing image data, you can query for the "most
   304  // appropriate" interface to use (that is, whether the image is HDR or
   305  // not), using:
   306  //
   307  //     stbi_is_hdr(char *filename);
   308  //
   309  // ===========================================================================
   310  //
   311  // iPhone PNG support:
   312  //
   313  // We optionally support converting iPhone-formatted PNGs (which store
   314  // premultiplied BGRA) back to RGB, even though they're internally encoded
   315  // differently. To enable this conversion, call
   316  // stbi_convert_iphone_png_to_rgb(1).
   317  //
   318  // Call stbi_set_unpremultiply_on_load(1) as well to force a divide per
   319  // pixel to remove any premultiplied alpha *only* if the image file explicitly
   320  // says there's premultiplied data (currently only happens in iPhone images,
   321  // and only if iPhone convert-to-rgb processing is on).
   322  //
   323  // ===========================================================================
   324  //
   325  // ADDITIONAL CONFIGURATION
   326  //
   327  //  - You can suppress implementation of any of the decoders to reduce
   328  //    your code footprint by #defining one or more of the following
   329  //    symbols before creating the implementation.
   330  //
   331  //        STBI_NO_JPEG
   332  //        STBI_NO_PNG
   333  //        STBI_NO_BMP
   334  //        STBI_NO_PSD
   335  //        STBI_NO_TGA
   336  //        STBI_NO_GIF
   337  //        STBI_NO_HDR
   338  //        STBI_NO_PIC
   339  //        STBI_NO_PNM   (.ppm and .pgm)
   340  //
   341  //  - You can request *only* certain decoders and suppress all other ones
   342  //    (this will be more forward-compatible, as addition of new decoders
   343  //    doesn't require you to disable them explicitly):
   344  //
   345  //        STBI_ONLY_JPEG
   346  //        STBI_ONLY_PNG
   347  //        STBI_ONLY_BMP
   348  //        STBI_ONLY_PSD
   349  //        STBI_ONLY_TGA
   350  //        STBI_ONLY_GIF
   351  //        STBI_ONLY_HDR
   352  //        STBI_ONLY_PIC
   353  //        STBI_ONLY_PNM   (.ppm and .pgm)
   354  //
   355  //   - If you use STBI_NO_PNG (or _ONLY_ without PNG), and you still
   356  //     want the zlib decoder to be available, #define STBI_SUPPORT_ZLIB
   357  //
   358  //  - If you define STBI_MAX_DIMENSIONS, stb_image will reject images greater
   359  //    than that size (in either width or height) without further processing.
   360  //    This is to let programs in the wild set an upper bound to prevent
   361  //    denial-of-service attacks on untrusted data, as one could generate a
   362  //    valid image of gigantic dimensions and force stb_image to allocate a
   363  //    huge block of memory and spend disproportionate time decoding it. By
   364  //    default this is set to (1 << 24), which is 16777216, but that's still
   365  //    very big.
   366  
   367  #ifndef STBI_NO_STDIO
   368  #include <stdio.h>
   369  #endif // STBI_NO_STDIO
   370  
   371  #define STBI_VERSION 1
   372  
   373  enum
   374  {
   375     STBI_default = 0, // only used for desired_channels
   376  
   377     STBI_grey       = 1,
   378     STBI_grey_alpha = 2,
   379     STBI_rgb        = 3,
   380     STBI_rgb_alpha  = 4
   381  };
   382  
   383  #include <stdlib.h>
   384  typedef unsigned char stbi_uc;
   385  typedef unsigned short stbi_us;
   386  
   387  #ifdef __cplusplus
   388  extern "C" {
   389  #endif
   390  
   391  #ifndef STBIDEF
   392  #ifdef STB_IMAGE_STATIC
   393  #define STBIDEF static
   394  #else
   395  #define STBIDEF extern
   396  #endif
   397  #endif
   398  
   399  //////////////////////////////////////////////////////////////////////////////
   400  //
   401  // PRIMARY API - works on images of any type
   402  //
   403  
   404  //
   405  // load image by filename, open file, or memory buffer
   406  //
   407  
   408  typedef struct
   409  {
   410     int      (*read)  (void *user,char *data,int size);   // fill 'data' with 'size' bytes.  return number of bytes actually read
   411     void     (*skip)  (void *user,int n);                 // skip the next 'n' bytes, or 'unget' the last -n bytes if negative
   412     int      (*eof)   (void *user);                       // returns nonzero if we are at end of file/data
   413  } stbi_io_callbacks;
   414  
   415  ////////////////////////////////////
   416  //
   417  // 8-bits-per-channel interface
   418  //
   419  
   420  STBIDEF stbi_uc *stbi_load_from_memory   (stbi_uc           const *buffer, int len   , int *x, int *y, int *channels_in_file, int desired_channels);
   421  STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk  , void *user, int *x, int *y, int *channels_in_file, int desired_channels);
   422  
   423  #ifndef STBI_NO_STDIO
   424  STBIDEF stbi_uc *stbi_load            (char const *filename, int *x, int *y, int *channels_in_file, int desired_channels);
   425  STBIDEF stbi_uc *stbi_load_from_file  (FILE *f, int *x, int *y, int *channels_in_file, int desired_channels);
   426  // for stbi_load_from_file, file pointer is left pointing immediately after image
   427  #endif
   428  
   429  #ifndef STBI_NO_GIF
   430  STBIDEF stbi_uc *stbi_load_gif_from_memory(stbi_uc const *buffer, int len, int **delays, int *x, int *y, int *z, int *comp, int req_comp);
   431  #endif
   432  
   433  #ifdef STBI_WINDOWS_UTF8
   434  STBIDEF int stbi_convert_wchar_to_utf8(char *buffer, size_t bufferlen, const wchar_t* input);
   435  #endif
   436  
   437  ////////////////////////////////////
   438  //
   439  // 16-bits-per-channel interface
   440  //
   441  
   442  STBIDEF stbi_us *stbi_load_16_from_memory   (stbi_uc const *buffer, int len, int *x, int *y, int *channels_in_file, int desired_channels);
   443  STBIDEF stbi_us *stbi_load_16_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *channels_in_file, int desired_channels);
   444  
   445  #ifndef STBI_NO_STDIO
   446  STBIDEF stbi_us *stbi_load_16          (char const *filename, int *x, int *y, int *channels_in_file, int desired_channels);
   447  STBIDEF stbi_us *stbi_load_from_file_16(FILE *f, int *x, int *y, int *channels_in_file, int desired_channels);
   448  #endif
   449  
   450  ////////////////////////////////////
   451  //
   452  // float-per-channel interface
   453  //
   454  #ifndef STBI_NO_LINEAR
   455     STBIDEF float *stbi_loadf_from_memory     (stbi_uc const *buffer, int len, int *x, int *y, int *channels_in_file, int desired_channels);
   456     STBIDEF float *stbi_loadf_from_callbacks  (stbi_io_callbacks const *clbk, void *user, int *x, int *y,  int *channels_in_file, int desired_channels);
   457  
   458     #ifndef STBI_NO_STDIO
   459     STBIDEF float *stbi_loadf            (char const *filename, int *x, int *y, int *channels_in_file, int desired_channels);
   460     STBIDEF float *stbi_loadf_from_file  (FILE *f, int *x, int *y, int *channels_in_file, int desired_channels);
   461     #endif
   462  #endif
   463  
   464  #ifndef STBI_NO_HDR
   465     STBIDEF void   stbi_hdr_to_ldr_gamma(float gamma);
   466     STBIDEF void   stbi_hdr_to_ldr_scale(float scale);
   467  #endif // STBI_NO_HDR
   468  
   469  #ifndef STBI_NO_LINEAR
   470     STBIDEF void   stbi_ldr_to_hdr_gamma(float gamma);
   471     STBIDEF void   stbi_ldr_to_hdr_scale(float scale);
   472  #endif // STBI_NO_LINEAR
   473  
   474  // stbi_is_hdr is always defined, but always returns false if STBI_NO_HDR
   475  STBIDEF int    stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user);
   476  STBIDEF int    stbi_is_hdr_from_memory(stbi_uc const *buffer, int len);
   477  #ifndef STBI_NO_STDIO
   478  STBIDEF int      stbi_is_hdr          (char const *filename);
   479  STBIDEF int      stbi_is_hdr_from_file(FILE *f);
   480  #endif // STBI_NO_STDIO
   481  
   482  
   483  // get a VERY brief reason for failure
   484  // on most compilers (and ALL modern mainstream compilers) this is threadsafe
   485  STBIDEF const char *stbi_failure_reason  (void);
   486  
   487  // free the loaded image -- this is just free()
   488  STBIDEF void     stbi_image_free      (void *retval_from_stbi_load);
   489  
   490  // get image dimensions & components without fully decoding
   491  STBIDEF int      stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
   492  STBIDEF int      stbi_info_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp);
   493  STBIDEF int      stbi_is_16_bit_from_memory(stbi_uc const *buffer, int len);
   494  STBIDEF int      stbi_is_16_bit_from_callbacks(stbi_io_callbacks const *clbk, void *user);
   495  
   496  #ifndef STBI_NO_STDIO
   497  STBIDEF int      stbi_info               (char const *filename,     int *x, int *y, int *comp);
   498  STBIDEF int      stbi_info_from_file     (FILE *f,                  int *x, int *y, int *comp);
   499  STBIDEF int      stbi_is_16_bit          (char const *filename);
   500  STBIDEF int      stbi_is_16_bit_from_file(FILE *f);
   501  #endif
   502  
   503  
   504  
   505  // for image formats that explicitly notate that they have premultiplied alpha,
   506  // we just return the colors as stored in the file. set this flag to force
   507  // unpremultiplication. results are undefined if the unpremultiply overflow.
   508  STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply);
   509  
   510  // indicate whether we should process iphone images back to canonical format,
   511  // or just pass them through "as-is"
   512  STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert);
   513  
   514  // flip the image vertically, so the first pixel in the output array is the bottom left
   515  STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip);
   516  
   517  // as above, but only applies to images loaded on the thread that calls the function
   518  // this function is only available if your compiler supports thread-local variables;
   519  // calling it will fail to link if your compiler doesn't
   520  STBIDEF void stbi_set_unpremultiply_on_load_thread(int flag_true_if_should_unpremultiply);
   521  STBIDEF void stbi_convert_iphone_png_to_rgb_thread(int flag_true_if_should_convert);
   522  STBIDEF void stbi_set_flip_vertically_on_load_thread(int flag_true_if_should_flip);
   523  
   524  // ZLIB client - used by PNG, available for other purposes
   525  
   526  STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen);
   527  STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header);
   528  STBIDEF char *stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen);
   529  STBIDEF int   stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
   530  
   531  STBIDEF char *stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen);
   532  STBIDEF int   stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
   533  
   534  
   535  #ifdef __cplusplus
   536  }
   537  #endif
   538  
   539  //
   540  //
   541  ////   end header file   /////////////////////////////////////////////////////
   542  #endif // STBI_INCLUDE_STB_IMAGE_H
   543  
   544  #ifdef STB_IMAGE_IMPLEMENTATION
   545  
   546  #if defined(STBI_ONLY_JPEG) || defined(STBI_ONLY_PNG) || defined(STBI_ONLY_BMP) \
   547    || defined(STBI_ONLY_TGA) || defined(STBI_ONLY_GIF) || defined(STBI_ONLY_PSD) \
   548    || defined(STBI_ONLY_HDR) || defined(STBI_ONLY_PIC) || defined(STBI_ONLY_PNM) \
   549    || defined(STBI_ONLY_ZLIB)
   550     #ifndef STBI_ONLY_JPEG
   551     #define STBI_NO_JPEG
   552     #endif
   553     #ifndef STBI_ONLY_PNG
   554     #define STBI_NO_PNG
   555     #endif
   556     #ifndef STBI_ONLY_BMP
   557     #define STBI_NO_BMP
   558     #endif
   559     #ifndef STBI_ONLY_PSD
   560     #define STBI_NO_PSD
   561     #endif
   562     #ifndef STBI_ONLY_TGA
   563     #define STBI_NO_TGA
   564     #endif
   565     #ifndef STBI_ONLY_GIF
   566     #define STBI_NO_GIF
   567     #endif
   568     #ifndef STBI_ONLY_HDR
   569     #define STBI_NO_HDR
   570     #endif
   571     #ifndef STBI_ONLY_PIC
   572     #define STBI_NO_PIC
   573     #endif
   574     #ifndef STBI_ONLY_PNM
   575     #define STBI_NO_PNM
   576     #endif
   577  #endif
   578  
   579  #if defined(STBI_NO_PNG) && !defined(STBI_SUPPORT_ZLIB) && !defined(STBI_NO_ZLIB)
   580  #define STBI_NO_ZLIB
   581  #endif
   582  
   583  
   584  #include <stdarg.h>
   585  #include <stddef.h> // ptrdiff_t on osx
   586  #include <stdlib.h>
   587  #include <string.h>
   588  #include <limits.h>
   589  
   590  #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
   591  #include <math.h>  // ldexp, pow
   592  #endif
   593  
   594  #ifndef STBI_NO_STDIO
   595  #include <stdio.h>
   596  #endif
   597  
   598  #ifndef STBI_ASSERT
   599  #include <assert.h>
   600  #define STBI_ASSERT(x) assert(x)
   601  #endif
   602  
   603  #ifdef __cplusplus
   604  #define STBI_EXTERN extern "C"
   605  #else
   606  #define STBI_EXTERN extern
   607  #endif
   608  
   609  
   610  #ifndef _MSC_VER
   611     #ifdef __cplusplus
   612     #define stbi_inline inline
   613     #else
   614     #define stbi_inline
   615     #endif
   616  #else
   617     #define stbi_inline __forceinline
   618  #endif
   619  
   620  #ifndef STBI_NO_THREAD_LOCALS
   621     #if defined(__cplusplus) &&  __cplusplus >= 201103L
   622        #define STBI_THREAD_LOCAL       thread_local
   623     #elif defined(__GNUC__) && __GNUC__ < 5
   624        #define STBI_THREAD_LOCAL       __thread
   625     #elif defined(_MSC_VER)
   626        #define STBI_THREAD_LOCAL       __declspec(thread)
   627     #elif defined (__STDC_VERSION__) && __STDC_VERSION__ >= 201112L && !defined(__STDC_NO_THREADS__)
   628        #define STBI_THREAD_LOCAL       _Thread_local
   629     #endif
   630  
   631     #ifndef STBI_THREAD_LOCAL
   632        #if defined(__GNUC__)
   633          #define STBI_THREAD_LOCAL       __thread
   634        #endif
   635     #endif
   636  #endif
   637  
   638  #ifdef _MSC_VER
   639  typedef unsigned short stbi__uint16;
   640  typedef   signed short stbi__int16;
   641  typedef unsigned int   stbi__uint32;
   642  typedef   signed int   stbi__int32;
   643  #else
   644  #include <stdint.h>
   645  typedef uint16_t stbi__uint16;
   646  typedef int16_t  stbi__int16;
   647  typedef uint32_t stbi__uint32;
   648  typedef int32_t  stbi__int32;
   649  #endif
   650  
   651  // should produce compiler error if size is wrong
   652  typedef unsigned char validate_uint32[sizeof(stbi__uint32)==4 ? 1 : -1];
   653  
   654  #ifdef _MSC_VER
   655  #define STBI_NOTUSED(v)  (void)(v)
   656  #else
   657  #define STBI_NOTUSED(v)  (void)sizeof(v)
   658  #endif
   659  
   660  #ifdef _MSC_VER
   661  #define STBI_HAS_LROTL
   662  #endif
   663  
   664  #ifdef STBI_HAS_LROTL
   665     #define stbi_lrot(x,y)  _lrotl(x,y)
   666  #else
   667     #define stbi_lrot(x,y)  (((x) << (y)) | ((x) >> (-(y) & 31)))
   668  #endif
   669  
   670  #if defined(STBI_MALLOC) && defined(STBI_FREE) && (defined(STBI_REALLOC) || defined(STBI_REALLOC_SIZED))
   671  // ok
   672  #elif !defined(STBI_MALLOC) && !defined(STBI_FREE) && !defined(STBI_REALLOC) && !defined(STBI_REALLOC_SIZED)
   673  // ok
   674  #else
   675  #error "Must define all or none of STBI_MALLOC, STBI_FREE, and STBI_REALLOC (or STBI_REALLOC_SIZED)."
   676  #endif
   677  
   678  #ifndef STBI_MALLOC
   679  #define STBI_MALLOC(sz)           malloc(sz)
   680  #define STBI_REALLOC(p,newsz)     realloc(p,newsz)
   681  #define STBI_FREE(p)              free(p)
   682  #endif
   683  
   684  #ifndef STBI_REALLOC_SIZED
   685  #define STBI_REALLOC_SIZED(p,oldsz,newsz) STBI_REALLOC(p,newsz)
   686  #endif
   687  
   688  // x86/x64 detection
   689  #if defined(__x86_64__) || defined(_M_X64)
   690  #define STBI__X64_TARGET
   691  #elif defined(__i386) || defined(_M_IX86)
   692  #define STBI__X86_TARGET
   693  #endif
   694  
   695  #if defined(__GNUC__) && defined(STBI__X86_TARGET) && !defined(__SSE2__) && !defined(STBI_NO_SIMD)
   696  // gcc doesn't support sse2 intrinsics unless you compile with -msse2,
   697  // which in turn means it gets to use SSE2 everywhere. This is unfortunate,
   698  // but previous attempts to provide the SSE2 functions with runtime
   699  // detection caused numerous issues. The way architecture extensions are
   700  // exposed in GCC/Clang is, sadly, not really suited for one-file libs.
   701  // New behavior: if compiled with -msse2, we use SSE2 without any
   702  // detection; if not, we don't use it at all.
   703  #define STBI_NO_SIMD
   704  #endif
   705  
   706  #if defined(__MINGW32__) && defined(STBI__X86_TARGET) && !defined(STBI_MINGW_ENABLE_SSE2) && !defined(STBI_NO_SIMD)
   707  // Note that __MINGW32__ doesn't actually mean 32-bit, so we have to avoid STBI__X64_TARGET
   708  //
   709  // 32-bit MinGW wants ESP to be 16-byte aligned, but this is not in the
   710  // Windows ABI and VC++ as well as Windows DLLs don't maintain that invariant.
   711  // As a result, enabling SSE2 on 32-bit MinGW is dangerous when not
   712  // simultaneously enabling "-mstackrealign".
   713  //
   714  // See https://github.com/nothings/stb/issues/81 for more information.
   715  //
   716  // So default to no SSE2 on 32-bit MinGW. If you've read this far and added
   717  // -mstackrealign to your build settings, feel free to #define STBI_MINGW_ENABLE_SSE2.
   718  #define STBI_NO_SIMD
   719  #endif
   720  
   721  #if !defined(STBI_NO_SIMD) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET))
   722  #define STBI_SSE2
   723  #include <emmintrin.h>
   724  
   725  #ifdef _MSC_VER
   726  
   727  #if _MSC_VER >= 1400  // not VC6
   728  #include <intrin.h> // __cpuid
   729  static int stbi__cpuid3(void)
   730  {
   731     int info[4];
   732     __cpuid(info,1);
   733     return info[3];
   734  }
   735  #else
   736  static int stbi__cpuid3(void)
   737  {
   738     int res;
   739     __asm {
   740        mov  eax,1
   741        cpuid
   742        mov  res,edx
   743     }
   744     return res;
   745  }
   746  #endif
   747  
   748  #define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name
   749  
   750  #if !defined(STBI_NO_JPEG) && defined(STBI_SSE2)
   751  static int stbi__sse2_available(void)
   752  {
   753     int info3 = stbi__cpuid3();
   754     return ((info3 >> 26) & 1) != 0;
   755  }
   756  #endif
   757  
   758  #else // assume GCC-style if not VC++
   759  #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
   760  
   761  #if !defined(STBI_NO_JPEG) && defined(STBI_SSE2)
   762  static int stbi__sse2_available(void)
   763  {
   764     // If we're even attempting to compile this on GCC/Clang, that means
   765     // -msse2 is on, which means the compiler is allowed to use SSE2
   766     // instructions at will, and so are we.
   767     return 1;
   768  }
   769  #endif
   770  
   771  #endif
   772  #endif
   773  
   774  // ARM NEON
   775  #if defined(STBI_NO_SIMD) && defined(STBI_NEON)
   776  #undef STBI_NEON
   777  #endif
   778  
   779  #ifdef STBI_NEON
   780  #include <arm_neon.h>
   781  #ifdef _MSC_VER
   782  #define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name
   783  #else
   784  #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
   785  #endif
   786  #endif
   787  
   788  #ifndef STBI_SIMD_ALIGN
   789  #define STBI_SIMD_ALIGN(type, name) type name
   790  #endif
   791  
   792  #ifndef STBI_MAX_DIMENSIONS
   793  #define STBI_MAX_DIMENSIONS (1 << 24)
   794  #endif
   795  
   796  ///////////////////////////////////////////////
   797  //
   798  //  stbi__context struct and start_xxx functions
   799  
   800  // stbi__context structure is our basic context used by all images, so it
   801  // contains all the IO context, plus some basic image information
   802  typedef struct
   803  {
   804     stbi__uint32 img_x, img_y;
   805     int img_n, img_out_n;
   806  
   807     stbi_io_callbacks io;
   808     void *io_user_data;
   809  
   810     int read_from_callbacks;
   811     int buflen;
   812     stbi_uc buffer_start[128];
   813     int callback_already_read;
   814  
   815     stbi_uc *img_buffer, *img_buffer_end;
   816     stbi_uc *img_buffer_original, *img_buffer_original_end;
   817  } stbi__context;
   818  
   819  
   820  static void stbi__refill_buffer(stbi__context *s);
   821  
   822  // initialize a memory-decode context
   823  static void stbi__start_mem(stbi__context *s, stbi_uc const *buffer, int len)
   824  {
   825     s->io.read = NULL;
   826     s->read_from_callbacks = 0;
   827     s->callback_already_read = 0;
   828     s->img_buffer = s->img_buffer_original = (stbi_uc *) buffer;
   829     s->img_buffer_end = s->img_buffer_original_end = (stbi_uc *) buffer+len;
   830  }
   831  
   832  // initialize a callback-based context
   833  static void stbi__start_callbacks(stbi__context *s, stbi_io_callbacks *c, void *user)
   834  {
   835     s->io = *c;
   836     s->io_user_data = user;
   837     s->buflen = sizeof(s->buffer_start);
   838     s->read_from_callbacks = 1;
   839     s->callback_already_read = 0;
   840     s->img_buffer = s->img_buffer_original = s->buffer_start;
   841     stbi__refill_buffer(s);
   842     s->img_buffer_original_end = s->img_buffer_end;
   843  }
   844  
   845  #ifndef STBI_NO_STDIO
   846  
   847  static int stbi__stdio_read(void *user, char *data, int size)
   848  {
   849     return (int) fread(data,1,size,(FILE*) user);
   850  }
   851  
   852  static void stbi__stdio_skip(void *user, int n)
   853  {
   854     int ch;
   855     fseek((FILE*) user, n, SEEK_CUR);
   856     ch = fgetc((FILE*) user);  /* have to read a byte to reset feof()'s flag */
   857     if (ch != EOF) {
   858        ungetc(ch, (FILE *) user);  /* push byte back onto stream if valid. */
   859     }
   860  }
   861  
   862  static int stbi__stdio_eof(void *user)
   863  {
   864     return feof((FILE*) user) || ferror((FILE *) user);
   865  }
   866  
   867  static stbi_io_callbacks stbi__stdio_callbacks =
   868  {
   869     stbi__stdio_read,
   870     stbi__stdio_skip,
   871     stbi__stdio_eof,
   872  };
   873  
   874  static void stbi__start_file(stbi__context *s, FILE *f)
   875  {
   876     stbi__start_callbacks(s, &stbi__stdio_callbacks, (void *) f);
   877  }
   878  
   879  //static void stop_file(stbi__context *s) { }
   880  
   881  #endif // !STBI_NO_STDIO
   882  
   883  static void stbi__rewind(stbi__context *s)
   884  {
   885     // conceptually rewind SHOULD rewind to the beginning of the stream,
   886     // but we just rewind to the beginning of the initial buffer, because
   887     // we only use it after doing 'test', which only ever looks at at most 92 bytes
   888     s->img_buffer = s->img_buffer_original;
   889     s->img_buffer_end = s->img_buffer_original_end;
   890  }
   891  
   892  enum
   893  {
   894     STBI_ORDER_RGB,
   895     STBI_ORDER_BGR
   896  };
   897  
   898  typedef struct
   899  {
   900     int bits_per_channel;
   901     int num_channels;
   902     int channel_order;
   903  } stbi__result_info;
   904  
   905  #ifndef STBI_NO_JPEG
   906  static int      stbi__jpeg_test(stbi__context *s);
   907  static void    *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
   908  static int      stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp);
   909  #endif
   910  
   911  #ifndef STBI_NO_PNG
   912  static int      stbi__png_test(stbi__context *s);
   913  static void    *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
   914  static int      stbi__png_info(stbi__context *s, int *x, int *y, int *comp);
   915  static int      stbi__png_is16(stbi__context *s);
   916  #endif
   917  
   918  #ifndef STBI_NO_BMP
   919  static int      stbi__bmp_test(stbi__context *s);
   920  static void    *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
   921  static int      stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp);
   922  #endif
   923  
   924  #ifndef STBI_NO_TGA
   925  static int      stbi__tga_test(stbi__context *s);
   926  static void    *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
   927  static int      stbi__tga_info(stbi__context *s, int *x, int *y, int *comp);
   928  #endif
   929  
   930  #ifndef STBI_NO_PSD
   931  static int      stbi__psd_test(stbi__context *s);
   932  static void    *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri, int bpc);
   933  static int      stbi__psd_info(stbi__context *s, int *x, int *y, int *comp);
   934  static int      stbi__psd_is16(stbi__context *s);
   935  #endif
   936  
   937  #ifndef STBI_NO_HDR
   938  static int      stbi__hdr_test(stbi__context *s);
   939  static float   *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
   940  static int      stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp);
   941  #endif
   942  
   943  #ifndef STBI_NO_PIC
   944  static int      stbi__pic_test(stbi__context *s);
   945  static void    *stbi__pic_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
   946  static int      stbi__pic_info(stbi__context *s, int *x, int *y, int *comp);
   947  #endif
   948  
   949  #ifndef STBI_NO_GIF
   950  static int      stbi__gif_test(stbi__context *s);
   951  static void    *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
   952  static void    *stbi__load_gif_main(stbi__context *s, int **delays, int *x, int *y, int *z, int *comp, int req_comp);
   953  static int      stbi__gif_info(stbi__context *s, int *x, int *y, int *comp);
   954  #endif
   955  
   956  #ifndef STBI_NO_PNM
   957  static int      stbi__pnm_test(stbi__context *s);
   958  static void    *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
   959  static int      stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp);
   960  static int      stbi__pnm_is16(stbi__context *s);
   961  #endif
   962  
   963  static
   964  #ifdef STBI_THREAD_LOCAL
   965  STBI_THREAD_LOCAL
   966  #endif
   967  const char *stbi__g_failure_reason;
   968  
   969  STBIDEF const char *stbi_failure_reason(void)
   970  {
   971     return stbi__g_failure_reason;
   972  }
   973  
   974  #ifndef STBI_NO_FAILURE_STRINGS
   975  static int stbi__err(const char *str)
   976  {
   977     stbi__g_failure_reason = str;
   978     return 0;
   979  }
   980  #endif
   981  
   982  static void *stbi__malloc(size_t size)
   983  {
   984      return STBI_MALLOC(size);
   985  }
   986  
   987  // stb_image uses ints pervasively, including for offset calculations.
   988  // therefore the largest decoded image size we can support with the
   989  // current code, even on 64-bit targets, is INT_MAX. this is not a
   990  // significant limitation for the intended use case.
   991  //
   992  // we do, however, need to make sure our size calculations don't
   993  // overflow. hence a few helper functions for size calculations that
   994  // multiply integers together, making sure that they're non-negative
   995  // and no overflow occurs.
   996  
   997  // return 1 if the sum is valid, 0 on overflow.
   998  // negative terms are considered invalid.
   999  static int stbi__addsizes_valid(int a, int b)
  1000  {
  1001     if (b < 0) return 0;
  1002     // now 0 <= b <= INT_MAX, hence also
  1003     // 0 <= INT_MAX - b <= INTMAX.
  1004     // And "a + b <= INT_MAX" (which might overflow) is the
  1005     // same as a <= INT_MAX - b (no overflow)
  1006     return a <= INT_MAX - b;
  1007  }
  1008  
  1009  // returns 1 if the product is valid, 0 on overflow.
  1010  // negative factors are considered invalid.
  1011  static int stbi__mul2sizes_valid(int a, int b)
  1012  {
  1013     if (a < 0 || b < 0) return 0;
  1014     if (b == 0) return 1; // mul-by-0 is always safe
  1015     // portable way to check for no overflows in a*b
  1016     return a <= INT_MAX/b;
  1017  }
  1018  
  1019  #if !defined(STBI_NO_JPEG) || !defined(STBI_NO_PNG) || !defined(STBI_NO_TGA) || !defined(STBI_NO_HDR)
  1020  // returns 1 if "a*b + add" has no negative terms/factors and doesn't overflow
  1021  static int stbi__mad2sizes_valid(int a, int b, int add)
  1022  {
  1023     return stbi__mul2sizes_valid(a, b) && stbi__addsizes_valid(a*b, add);
  1024  }
  1025  #endif
  1026  
  1027  // returns 1 if "a*b*c + add" has no negative terms/factors and doesn't overflow
  1028  static int stbi__mad3sizes_valid(int a, int b, int c, int add)
  1029  {
  1030     return stbi__mul2sizes_valid(a, b) && stbi__mul2sizes_valid(a*b, c) &&
  1031        stbi__addsizes_valid(a*b*c, add);
  1032  }
  1033  
  1034  // returns 1 if "a*b*c*d + add" has no negative terms/factors and doesn't overflow
  1035  #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR) || !defined(STBI_NO_PNM)
  1036  static int stbi__mad4sizes_valid(int a, int b, int c, int d, int add)
  1037  {
  1038     return stbi__mul2sizes_valid(a, b) && stbi__mul2sizes_valid(a*b, c) &&
  1039        stbi__mul2sizes_valid(a*b*c, d) && stbi__addsizes_valid(a*b*c*d, add);
  1040  }
  1041  #endif
  1042  
  1043  #if !defined(STBI_NO_JPEG) || !defined(STBI_NO_PNG) || !defined(STBI_NO_TGA) || !defined(STBI_NO_HDR)
  1044  // mallocs with size overflow checking
  1045  static void *stbi__malloc_mad2(int a, int b, int add)
  1046  {
  1047     if (!stbi__mad2sizes_valid(a, b, add)) return NULL;
  1048     return stbi__malloc(a*b + add);
  1049  }
  1050  #endif
  1051  
  1052  static void *stbi__malloc_mad3(int a, int b, int c, int add)
  1053  {
  1054     if (!stbi__mad3sizes_valid(a, b, c, add)) return NULL;
  1055     return stbi__malloc(a*b*c + add);
  1056  }
  1057  
  1058  #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR) || !defined(STBI_NO_PNM)
  1059  static void *stbi__malloc_mad4(int a, int b, int c, int d, int add)
  1060  {
  1061     if (!stbi__mad4sizes_valid(a, b, c, d, add)) return NULL;
  1062     return stbi__malloc(a*b*c*d + add);
  1063  }
  1064  #endif
  1065  
  1066  // stbi__err - error
  1067  // stbi__errpf - error returning pointer to float
  1068  // stbi__errpuc - error returning pointer to unsigned char
  1069  
  1070  #ifdef STBI_NO_FAILURE_STRINGS
  1071     #define stbi__err(x,y)  0
  1072  #elif defined(STBI_FAILURE_USERMSG)
  1073     #define stbi__err(x,y)  stbi__err(y)
  1074  #else
  1075     #define stbi__err(x,y)  stbi__err(x)
  1076  #endif
  1077  
  1078  #define stbi__errpf(x,y)   ((float *)(size_t) (stbi__err(x,y)?NULL:NULL))
  1079  #define stbi__errpuc(x,y)  ((unsigned char *)(size_t) (stbi__err(x,y)?NULL:NULL))
  1080  
  1081  STBIDEF void stbi_image_free(void *retval_from_stbi_load)
  1082  {
  1083     STBI_FREE(retval_from_stbi_load);
  1084  }
  1085  
  1086  #ifndef STBI_NO_LINEAR
  1087  static float   *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp);
  1088  #endif
  1089  
  1090  #ifndef STBI_NO_HDR
  1091  static stbi_uc *stbi__hdr_to_ldr(float   *data, int x, int y, int comp);
  1092  #endif
  1093  
  1094  static int stbi__vertically_flip_on_load_global = 0;
  1095  
  1096  STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip)
  1097  {
  1098     stbi__vertically_flip_on_load_global = flag_true_if_should_flip;
  1099  }
  1100  
  1101  #ifndef STBI_THREAD_LOCAL
  1102  #define stbi__vertically_flip_on_load  stbi__vertically_flip_on_load_global
  1103  #else
  1104  static STBI_THREAD_LOCAL int stbi__vertically_flip_on_load_local, stbi__vertically_flip_on_load_set;
  1105  
  1106  STBIDEF void stbi_set_flip_vertically_on_load_thread(int flag_true_if_should_flip)
  1107  {
  1108     stbi__vertically_flip_on_load_local = flag_true_if_should_flip;
  1109     stbi__vertically_flip_on_load_set = 1;
  1110  }
  1111  
  1112  #define stbi__vertically_flip_on_load  (stbi__vertically_flip_on_load_set       \
  1113                                           ? stbi__vertically_flip_on_load_local  \
  1114                                           : stbi__vertically_flip_on_load_global)
  1115  #endif // STBI_THREAD_LOCAL
  1116  
  1117  static void *stbi__load_main(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri, int bpc)
  1118  {
  1119     memset(ri, 0, sizeof(*ri)); // make sure it's initialized if we add new fields
  1120     ri->bits_per_channel = 8; // default is 8 so most paths don't have to be changed
  1121     ri->channel_order = STBI_ORDER_RGB; // all current input & output are this, but this is here so we can add BGR order
  1122     ri->num_channels = 0;
  1123  
  1124     // test the formats with a very explicit header first (at least a FOURCC
  1125     // or distinctive magic number first)
  1126     #ifndef STBI_NO_PNG
  1127     if (stbi__png_test(s))  return stbi__png_load(s,x,y,comp,req_comp, ri);
  1128     #endif
  1129     #ifndef STBI_NO_BMP
  1130     if (stbi__bmp_test(s))  return stbi__bmp_load(s,x,y,comp,req_comp, ri);
  1131     #endif
  1132     #ifndef STBI_NO_GIF
  1133     if (stbi__gif_test(s))  return stbi__gif_load(s,x,y,comp,req_comp, ri);
  1134     #endif
  1135     #ifndef STBI_NO_PSD
  1136     if (stbi__psd_test(s))  return stbi__psd_load(s,x,y,comp,req_comp, ri, bpc);
  1137     #else
  1138     STBI_NOTUSED(bpc);
  1139     #endif
  1140     #ifndef STBI_NO_PIC
  1141     if (stbi__pic_test(s))  return stbi__pic_load(s,x,y,comp,req_comp, ri);
  1142     #endif
  1143  
  1144     // then the formats that can end up attempting to load with just 1 or 2
  1145     // bytes matching expectations; these are prone to false positives, so
  1146     // try them later
  1147     #ifndef STBI_NO_JPEG
  1148     if (stbi__jpeg_test(s)) return stbi__jpeg_load(s,x,y,comp,req_comp, ri);
  1149     #endif
  1150     #ifndef STBI_NO_PNM
  1151     if (stbi__pnm_test(s))  return stbi__pnm_load(s,x,y,comp,req_comp, ri);
  1152     #endif
  1153  
  1154     #ifndef STBI_NO_HDR
  1155     if (stbi__hdr_test(s)) {
  1156        float *hdr = stbi__hdr_load(s, x,y,comp,req_comp, ri);
  1157        return stbi__hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);
  1158     }
  1159     #endif
  1160  
  1161     #ifndef STBI_NO_TGA
  1162     // test tga last because it's a crappy test!
  1163     if (stbi__tga_test(s))
  1164        return stbi__tga_load(s,x,y,comp,req_comp, ri);
  1165     #endif
  1166  
  1167     return stbi__errpuc("unknown image type", "Image not of any known type, or corrupt");
  1168  }
  1169  
  1170  static stbi_uc *stbi__convert_16_to_8(stbi__uint16 *orig, int w, int h, int channels)
  1171  {
  1172     int i;
  1173     int img_len = w * h * channels;
  1174     stbi_uc *reduced;
  1175  
  1176     reduced = (stbi_uc *) stbi__malloc(img_len);
  1177     if (reduced == NULL) return stbi__errpuc("outofmem", "Out of memory");
  1178  
  1179     for (i = 0; i < img_len; ++i)
  1180        reduced[i] = (stbi_uc)((orig[i] >> 8) & 0xFF); // top half of each byte is sufficient approx of 16->8 bit scaling
  1181  
  1182     STBI_FREE(orig);
  1183     return reduced;
  1184  }
  1185  
  1186  static stbi__uint16 *stbi__convert_8_to_16(stbi_uc *orig, int w, int h, int channels)
  1187  {
  1188     int i;
  1189     int img_len = w * h * channels;
  1190     stbi__uint16 *enlarged;
  1191  
  1192     enlarged = (stbi__uint16 *) stbi__malloc(img_len*2);
  1193     if (enlarged == NULL) return (stbi__uint16 *) stbi__errpuc("outofmem", "Out of memory");
  1194  
  1195     for (i = 0; i < img_len; ++i)
  1196        enlarged[i] = (stbi__uint16)((orig[i] << 8) + orig[i]); // replicate to high and low byte, maps 0->0, 255->0xffff
  1197  
  1198     STBI_FREE(orig);
  1199     return enlarged;
  1200  }
  1201  
  1202  static void stbi__vertical_flip(void *image, int w, int h, int bytes_per_pixel)
  1203  {
  1204     int row;
  1205     size_t bytes_per_row = (size_t)w * bytes_per_pixel;
  1206     stbi_uc temp[2048];
  1207     stbi_uc *bytes = (stbi_uc *)image;
  1208  
  1209     for (row = 0; row < (h>>1); row++) {
  1210        stbi_uc *row0 = bytes + row*bytes_per_row;
  1211        stbi_uc *row1 = bytes + (h - row - 1)*bytes_per_row;
  1212        // swap row0 with row1
  1213        size_t bytes_left = bytes_per_row;
  1214        while (bytes_left) {
  1215           size_t bytes_copy = (bytes_left < sizeof(temp)) ? bytes_left : sizeof(temp);
  1216           memcpy(temp, row0, bytes_copy);
  1217           memcpy(row0, row1, bytes_copy);
  1218           memcpy(row1, temp, bytes_copy);
  1219           row0 += bytes_copy;
  1220           row1 += bytes_copy;
  1221           bytes_left -= bytes_copy;
  1222        }
  1223     }
  1224  }
  1225  
  1226  #ifndef STBI_NO_GIF
  1227  static void stbi__vertical_flip_slices(void *image, int w, int h, int z, int bytes_per_pixel)
  1228  {
  1229     int slice;
  1230     int slice_size = w * h * bytes_per_pixel;
  1231  
  1232     stbi_uc *bytes = (stbi_uc *)image;
  1233     for (slice = 0; slice < z; ++slice) {
  1234        stbi__vertical_flip(bytes, w, h, bytes_per_pixel);
  1235        bytes += slice_size;
  1236     }
  1237  }
  1238  #endif
  1239  
  1240  static unsigned char *stbi__load_and_postprocess_8bit(stbi__context *s, int *x, int *y, int *comp, int req_comp)
  1241  {
  1242     stbi__result_info ri;
  1243     void *result = stbi__load_main(s, x, y, comp, req_comp, &ri, 8);
  1244  
  1245     if (result == NULL)
  1246        return NULL;
  1247  
  1248     // it is the responsibility of the loaders to make sure we get either 8 or 16 bit.
  1249     STBI_ASSERT(ri.bits_per_channel == 8 || ri.bits_per_channel == 16);
  1250  
  1251     if (ri.bits_per_channel != 8) {
  1252        result = stbi__convert_16_to_8((stbi__uint16 *) result, *x, *y, req_comp == 0 ? *comp : req_comp);
  1253        ri.bits_per_channel = 8;
  1254     }
  1255  
  1256     // @TODO: move stbi__convert_format to here
  1257  
  1258     if (stbi__vertically_flip_on_load) {
  1259        int channels = req_comp ? req_comp : *comp;
  1260        stbi__vertical_flip(result, *x, *y, channels * sizeof(stbi_uc));
  1261     }
  1262  
  1263     return (unsigned char *) result;
  1264  }
  1265  
  1266  static stbi__uint16 *stbi__load_and_postprocess_16bit(stbi__context *s, int *x, int *y, int *comp, int req_comp)
  1267  {
  1268     stbi__result_info ri;
  1269     void *result = stbi__load_main(s, x, y, comp, req_comp, &ri, 16);
  1270  
  1271     if (result == NULL)
  1272        return NULL;
  1273  
  1274     // it is the responsibility of the loaders to make sure we get either 8 or 16 bit.
  1275     STBI_ASSERT(ri.bits_per_channel == 8 || ri.bits_per_channel == 16);
  1276  
  1277     if (ri.bits_per_channel != 16) {
  1278        result = stbi__convert_8_to_16((stbi_uc *) result, *x, *y, req_comp == 0 ? *comp : req_comp);
  1279        ri.bits_per_channel = 16;
  1280     }
  1281  
  1282     // @TODO: move stbi__convert_format16 to here
  1283     // @TODO: special case RGB-to-Y (and RGBA-to-YA) for 8-bit-to-16-bit case to keep more precision
  1284  
  1285     if (stbi__vertically_flip_on_load) {
  1286        int channels = req_comp ? req_comp : *comp;
  1287        stbi__vertical_flip(result, *x, *y, channels * sizeof(stbi__uint16));
  1288     }
  1289  
  1290     return (stbi__uint16 *) result;
  1291  }
  1292  
  1293  #if !defined(STBI_NO_HDR) && !defined(STBI_NO_LINEAR)
  1294  static void stbi__float_postprocess(float *result, int *x, int *y, int *comp, int req_comp)
  1295  {
  1296     if (stbi__vertically_flip_on_load && result != NULL) {
  1297        int channels = req_comp ? req_comp : *comp;
  1298        stbi__vertical_flip(result, *x, *y, channels * sizeof(float));
  1299     }
  1300  }
  1301  #endif
  1302  
  1303  #ifndef STBI_NO_STDIO
  1304  
  1305  #if defined(_WIN32) && defined(STBI_WINDOWS_UTF8)
  1306  STBI_EXTERN __declspec(dllimport) int __stdcall MultiByteToWideChar(unsigned int cp, unsigned long flags, const char *str, int cbmb, wchar_t *widestr, int cchwide);
  1307  STBI_EXTERN __declspec(dllimport) int __stdcall WideCharToMultiByte(unsigned int cp, unsigned long flags, const wchar_t *widestr, int cchwide, char *str, int cbmb, const char *defchar, int *used_default);
  1308  #endif
  1309  
  1310  #if defined(_WIN32) && defined(STBI_WINDOWS_UTF8)
  1311  STBIDEF int stbi_convert_wchar_to_utf8(char *buffer, size_t bufferlen, const wchar_t* input)
  1312  {
  1313  	return WideCharToMultiByte(65001 /* UTF8 */, 0, input, -1, buffer, (int) bufferlen, NULL, NULL);
  1314  }
  1315  #endif
  1316  
  1317  static FILE *stbi__fopen(char const *filename, char const *mode)
  1318  {
  1319     FILE *f;
  1320  #if defined(_WIN32) && defined(STBI_WINDOWS_UTF8)
  1321     wchar_t wMode[64];
  1322     wchar_t wFilename[1024];
  1323  	if (0 == MultiByteToWideChar(65001 /* UTF8 */, 0, filename, -1, wFilename, sizeof(wFilename)/sizeof(*wFilename)))
  1324        return 0;
  1325  
  1326  	if (0 == MultiByteToWideChar(65001 /* UTF8 */, 0, mode, -1, wMode, sizeof(wMode)/sizeof(*wMode)))
  1327        return 0;
  1328  
  1329  #if defined(_MSC_VER) && _MSC_VER >= 1400
  1330  	if (0 != _wfopen_s(&f, wFilename, wMode))
  1331  		f = 0;
  1332  #else
  1333     f = _wfopen(wFilename, wMode);
  1334  #endif
  1335  
  1336  #elif defined(_MSC_VER) && _MSC_VER >= 1400
  1337     if (0 != fopen_s(&f, filename, mode))
  1338        f=0;
  1339  #else
  1340     f = fopen(filename, mode);
  1341  #endif
  1342     return f;
  1343  }
  1344  
  1345  
  1346  STBIDEF stbi_uc *stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp)
  1347  {
  1348     FILE *f = stbi__fopen(filename, "rb");
  1349     unsigned char *result;
  1350     if (!f) return stbi__errpuc("can't fopen", "Unable to open file");
  1351     result = stbi_load_from_file(f,x,y,comp,req_comp);
  1352     fclose(f);
  1353     return result;
  1354  }
  1355  
  1356  STBIDEF stbi_uc *stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
  1357  {
  1358     unsigned char *result;
  1359     stbi__context s;
  1360     stbi__start_file(&s,f);
  1361     result = stbi__load_and_postprocess_8bit(&s,x,y,comp,req_comp);
  1362     if (result) {
  1363        // need to 'unget' all the characters in the IO buffer
  1364        fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
  1365     }
  1366     return result;
  1367  }
  1368  
  1369  STBIDEF stbi__uint16 *stbi_load_from_file_16(FILE *f, int *x, int *y, int *comp, int req_comp)
  1370  {
  1371     stbi__uint16 *result;
  1372     stbi__context s;
  1373     stbi__start_file(&s,f);
  1374     result = stbi__load_and_postprocess_16bit(&s,x,y,comp,req_comp);
  1375     if (result) {
  1376        // need to 'unget' all the characters in the IO buffer
  1377        fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
  1378     }
  1379     return result;
  1380  }
  1381  
  1382  STBIDEF stbi_us *stbi_load_16(char const *filename, int *x, int *y, int *comp, int req_comp)
  1383  {
  1384     FILE *f = stbi__fopen(filename, "rb");
  1385     stbi__uint16 *result;
  1386     if (!f) return (stbi_us *) stbi__errpuc("can't fopen", "Unable to open file");
  1387     result = stbi_load_from_file_16(f,x,y,comp,req_comp);
  1388     fclose(f);
  1389     return result;
  1390  }
  1391  
  1392  
  1393  #endif //!STBI_NO_STDIO
  1394  
  1395  STBIDEF stbi_us *stbi_load_16_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *channels_in_file, int desired_channels)
  1396  {
  1397     stbi__context s;
  1398     stbi__start_mem(&s,buffer,len);
  1399     return stbi__load_and_postprocess_16bit(&s,x,y,channels_in_file,desired_channels);
  1400  }
  1401  
  1402  STBIDEF stbi_us *stbi_load_16_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *channels_in_file, int desired_channels)
  1403  {
  1404     stbi__context s;
  1405     stbi__start_callbacks(&s, (stbi_io_callbacks *)clbk, user);
  1406     return stbi__load_and_postprocess_16bit(&s,x,y,channels_in_file,desired_channels);
  1407  }
  1408  
  1409  STBIDEF stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
  1410  {
  1411     stbi__context s;
  1412     stbi__start_mem(&s,buffer,len);
  1413     return stbi__load_and_postprocess_8bit(&s,x,y,comp,req_comp);
  1414  }
  1415  
  1416  STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
  1417  {
  1418     stbi__context s;
  1419     stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
  1420     return stbi__load_and_postprocess_8bit(&s,x,y,comp,req_comp);
  1421  }
  1422  
  1423  #ifndef STBI_NO_GIF
  1424  STBIDEF stbi_uc *stbi_load_gif_from_memory(stbi_uc const *buffer, int len, int **delays, int *x, int *y, int *z, int *comp, int req_comp)
  1425  {
  1426     unsigned char *result;
  1427     stbi__context s;
  1428     stbi__start_mem(&s,buffer,len);
  1429  
  1430     result = (unsigned char*) stbi__load_gif_main(&s, delays, x, y, z, comp, req_comp);
  1431     if (stbi__vertically_flip_on_load) {
  1432        stbi__vertical_flip_slices( result, *x, *y, *z, *comp );
  1433     }
  1434  
  1435     return result;
  1436  }
  1437  #endif
  1438  
  1439  #ifndef STBI_NO_LINEAR
  1440  static float *stbi__loadf_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
  1441  {
  1442     unsigned char *data;
  1443     #ifndef STBI_NO_HDR
  1444     if (stbi__hdr_test(s)) {
  1445        stbi__result_info ri;
  1446        float *hdr_data = stbi__hdr_load(s,x,y,comp,req_comp, &ri);
  1447        if (hdr_data)
  1448           stbi__float_postprocess(hdr_data,x,y,comp,req_comp);
  1449        return hdr_data;
  1450     }
  1451     #endif
  1452     data = stbi__load_and_postprocess_8bit(s, x, y, comp, req_comp);
  1453     if (data)
  1454        return stbi__ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
  1455     return stbi__errpf("unknown image type", "Image not of any known type, or corrupt");
  1456  }
  1457  
  1458  STBIDEF float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
  1459  {
  1460     stbi__context s;
  1461     stbi__start_mem(&s,buffer,len);
  1462     return stbi__loadf_main(&s,x,y,comp,req_comp);
  1463  }
  1464  
  1465  STBIDEF float *stbi_loadf_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
  1466  {
  1467     stbi__context s;
  1468     stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
  1469     return stbi__loadf_main(&s,x,y,comp,req_comp);
  1470  }
  1471  
  1472  #ifndef STBI_NO_STDIO
  1473  STBIDEF float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp)
  1474  {
  1475     float *result;
  1476     FILE *f = stbi__fopen(filename, "rb");
  1477     if (!f) return stbi__errpf("can't fopen", "Unable to open file");
  1478     result = stbi_loadf_from_file(f,x,y,comp,req_comp);
  1479     fclose(f);
  1480     return result;
  1481  }
  1482  
  1483  STBIDEF float *stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
  1484  {
  1485     stbi__context s;
  1486     stbi__start_file(&s,f);
  1487     return stbi__loadf_main(&s,x,y,comp,req_comp);
  1488  }
  1489  #endif // !STBI_NO_STDIO
  1490  
  1491  #endif // !STBI_NO_LINEAR
  1492  
  1493  // these is-hdr-or-not is defined independent of whether STBI_NO_LINEAR is
  1494  // defined, for API simplicity; if STBI_NO_LINEAR is defined, it always
  1495  // reports false!
  1496  
  1497  STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len)
  1498  {
  1499     #ifndef STBI_NO_HDR
  1500     stbi__context s;
  1501     stbi__start_mem(&s,buffer,len);
  1502     return stbi__hdr_test(&s);
  1503     #else
  1504     STBI_NOTUSED(buffer);
  1505     STBI_NOTUSED(len);
  1506     return 0;
  1507     #endif
  1508  }
  1509  
  1510  #ifndef STBI_NO_STDIO
  1511  STBIDEF int      stbi_is_hdr          (char const *filename)
  1512  {
  1513     FILE *f = stbi__fopen(filename, "rb");
  1514     int result=0;
  1515     if (f) {
  1516        result = stbi_is_hdr_from_file(f);
  1517        fclose(f);
  1518     }
  1519     return result;
  1520  }
  1521  
  1522  STBIDEF int stbi_is_hdr_from_file(FILE *f)
  1523  {
  1524     #ifndef STBI_NO_HDR
  1525     long pos = ftell(f);
  1526     int res;
  1527     stbi__context s;
  1528     stbi__start_file(&s,f);
  1529     res = stbi__hdr_test(&s);
  1530     fseek(f, pos, SEEK_SET);
  1531     return res;
  1532     #else
  1533     STBI_NOTUSED(f);
  1534     return 0;
  1535     #endif
  1536  }
  1537  #endif // !STBI_NO_STDIO
  1538  
  1539  STBIDEF int      stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user)
  1540  {
  1541     #ifndef STBI_NO_HDR
  1542     stbi__context s;
  1543     stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
  1544     return stbi__hdr_test(&s);
  1545     #else
  1546     STBI_NOTUSED(clbk);
  1547     STBI_NOTUSED(user);
  1548     return 0;
  1549     #endif
  1550  }
  1551  
  1552  #ifndef STBI_NO_LINEAR
  1553  static float stbi__l2h_gamma=2.2f, stbi__l2h_scale=1.0f;
  1554  
  1555  STBIDEF void   stbi_ldr_to_hdr_gamma(float gamma) { stbi__l2h_gamma = gamma; }
  1556  STBIDEF void   stbi_ldr_to_hdr_scale(float scale) { stbi__l2h_scale = scale; }
  1557  #endif
  1558  
  1559  static float stbi__h2l_gamma_i=1.0f/2.2f, stbi__h2l_scale_i=1.0f;
  1560  
  1561  STBIDEF void   stbi_hdr_to_ldr_gamma(float gamma) { stbi__h2l_gamma_i = 1/gamma; }
  1562  STBIDEF void   stbi_hdr_to_ldr_scale(float scale) { stbi__h2l_scale_i = 1/scale; }
  1563  
  1564  
  1565  //////////////////////////////////////////////////////////////////////////////
  1566  //
  1567  // Common code used by all image loaders
  1568  //
  1569  
  1570  enum
  1571  {
  1572     STBI__SCAN_load=0,
  1573     STBI__SCAN_type,
  1574     STBI__SCAN_header
  1575  };
  1576  
  1577  static void stbi__refill_buffer(stbi__context *s)
  1578  {
  1579     int n = (s->io.read)(s->io_user_data,(char*)s->buffer_start,s->buflen);
  1580     s->callback_already_read += (int) (s->img_buffer - s->img_buffer_original);
  1581     if (n == 0) {
  1582        // at end of file, treat same as if from memory, but need to handle case
  1583        // where s->img_buffer isn't pointing to safe memory, e.g. 0-byte file
  1584        s->read_from_callbacks = 0;
  1585        s->img_buffer = s->buffer_start;
  1586        s->img_buffer_end = s->buffer_start+1;
  1587        *s->img_buffer = 0;
  1588     } else {
  1589        s->img_buffer = s->buffer_start;
  1590        s->img_buffer_end = s->buffer_start + n;
  1591     }
  1592  }
  1593  
  1594  stbi_inline static stbi_uc stbi__get8(stbi__context *s)
  1595  {
  1596     if (s->img_buffer < s->img_buffer_end)
  1597        return *s->img_buffer++;
  1598     if (s->read_from_callbacks) {
  1599        stbi__refill_buffer(s);
  1600        return *s->img_buffer++;
  1601     }
  1602     return 0;
  1603  }
  1604  
  1605  #if defined(STBI_NO_JPEG) && defined(STBI_NO_HDR) && defined(STBI_NO_PIC) && defined(STBI_NO_PNM)
  1606  // nothing
  1607  #else
  1608  stbi_inline static int stbi__at_eof(stbi__context *s)
  1609  {
  1610     if (s->io.read) {
  1611        if (!(s->io.eof)(s->io_user_data)) return 0;
  1612        // if feof() is true, check if buffer = end
  1613        // special case: we've only got the special 0 character at the end
  1614        if (s->read_from_callbacks == 0) return 1;
  1615     }
  1616  
  1617     return s->img_buffer >= s->img_buffer_end;
  1618  }
  1619  #endif
  1620  
  1621  #if defined(STBI_NO_JPEG) && defined(STBI_NO_PNG) && defined(STBI_NO_BMP) && defined(STBI_NO_PSD) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF) && defined(STBI_NO_PIC)
  1622  // nothing
  1623  #else
  1624  static void stbi__skip(stbi__context *s, int n)
  1625  {
  1626     if (n == 0) return;  // already there!
  1627     if (n < 0) {
  1628        s->img_buffer = s->img_buffer_end;
  1629        return;
  1630     }
  1631     if (s->io.read) {
  1632        int blen = (int) (s->img_buffer_end - s->img_buffer);
  1633        if (blen < n) {
  1634           s->img_buffer = s->img_buffer_end;
  1635           (s->io.skip)(s->io_user_data, n - blen);
  1636           return;
  1637        }
  1638     }
  1639     s->img_buffer += n;
  1640  }
  1641  #endif
  1642  
  1643  #if defined(STBI_NO_PNG) && defined(STBI_NO_TGA) && defined(STBI_NO_HDR) && defined(STBI_NO_PNM)
  1644  // nothing
  1645  #else
  1646  static int stbi__getn(stbi__context *s, stbi_uc *buffer, int n)
  1647  {
  1648     if (s->io.read) {
  1649        int blen = (int) (s->img_buffer_end - s->img_buffer);
  1650        if (blen < n) {
  1651           int res, count;
  1652  
  1653           memcpy(buffer, s->img_buffer, blen);
  1654  
  1655           count = (s->io.read)(s->io_user_data, (char*) buffer + blen, n - blen);
  1656           res = (count == (n-blen));
  1657           s->img_buffer = s->img_buffer_end;
  1658           return res;
  1659        }
  1660     }
  1661  
  1662     if (s->img_buffer+n <= s->img_buffer_end) {
  1663        memcpy(buffer, s->img_buffer, n);
  1664        s->img_buffer += n;
  1665        return 1;
  1666     } else
  1667        return 0;
  1668  }
  1669  #endif
  1670  
  1671  #if defined(STBI_NO_JPEG) && defined(STBI_NO_PNG) && defined(STBI_NO_PSD) && defined(STBI_NO_PIC)
  1672  // nothing
  1673  #else
  1674  static int stbi__get16be(stbi__context *s)
  1675  {
  1676     int z = stbi__get8(s);
  1677     return (z << 8) + stbi__get8(s);
  1678  }
  1679  #endif
  1680  
  1681  #if defined(STBI_NO_PNG) && defined(STBI_NO_PSD) && defined(STBI_NO_PIC)
  1682  // nothing
  1683  #else
  1684  static stbi__uint32 stbi__get32be(stbi__context *s)
  1685  {
  1686     stbi__uint32 z = stbi__get16be(s);
  1687     return (z << 16) + stbi__get16be(s);
  1688  }
  1689  #endif
  1690  
  1691  #if defined(STBI_NO_BMP) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF)
  1692  // nothing
  1693  #else
  1694  static int stbi__get16le(stbi__context *s)
  1695  {
  1696     int z = stbi__get8(s);
  1697     return z + (stbi__get8(s) << 8);
  1698  }
  1699  #endif
  1700  
  1701  #ifndef STBI_NO_BMP
  1702  static stbi__uint32 stbi__get32le(stbi__context *s)
  1703  {
  1704     stbi__uint32 z = stbi__get16le(s);
  1705     z += (stbi__uint32)stbi__get16le(s) << 16;
  1706     return z;
  1707  }
  1708  #endif
  1709  
  1710  #define STBI__BYTECAST(x)  ((stbi_uc) ((x) & 255))  // truncate int to byte without warnings
  1711  
  1712  #if defined(STBI_NO_JPEG) && defined(STBI_NO_PNG) && defined(STBI_NO_BMP) && defined(STBI_NO_PSD) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF) && defined(STBI_NO_PIC) && defined(STBI_NO_PNM)
  1713  // nothing
  1714  #else
  1715  //////////////////////////////////////////////////////////////////////////////
  1716  //
  1717  //  generic converter from built-in img_n to req_comp
  1718  //    individual types do this automatically as much as possible (e.g. jpeg
  1719  //    does all cases internally since it needs to colorspace convert anyway,
  1720  //    and it never has alpha, so very few cases ). png can automatically
  1721  //    interleave an alpha=255 channel, but falls back to this for other cases
  1722  //
  1723  //  assume data buffer is malloced, so malloc a new one and free that one
  1724  //  only failure mode is malloc failing
  1725  
  1726  static stbi_uc stbi__compute_y(int r, int g, int b)
  1727  {
  1728     return (stbi_uc) (((r*77) + (g*150) +  (29*b)) >> 8);
  1729  }
  1730  #endif
  1731  
  1732  #if defined(STBI_NO_PNG) && defined(STBI_NO_BMP) && defined(STBI_NO_PSD) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF) && defined(STBI_NO_PIC) && defined(STBI_NO_PNM)
  1733  // nothing
  1734  #else
  1735  static unsigned char *stbi__convert_format(unsigned char *data, int img_n, int req_comp, unsigned int x, unsigned int y)
  1736  {
  1737     int i,j;
  1738     unsigned char *good;
  1739  
  1740     if (req_comp == img_n) return data;
  1741     STBI_ASSERT(req_comp >= 1 && req_comp <= 4);
  1742  
  1743     good = (unsigned char *) stbi__malloc_mad3(req_comp, x, y, 0);
  1744     if (good == NULL) {
  1745        STBI_FREE(data);
  1746        return stbi__errpuc("outofmem", "Out of memory");
  1747     }
  1748  
  1749     for (j=0; j < (int) y; ++j) {
  1750        unsigned char *src  = data + j * x * img_n   ;
  1751        unsigned char *dest = good + j * x * req_comp;
  1752  
  1753        #define STBI__COMBO(a,b)  ((a)*8+(b))
  1754        #define STBI__CASE(a,b)   case STBI__COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
  1755        // convert source image with img_n components to one with req_comp components;
  1756        // avoid switch per pixel, so use switch per scanline and massive macros
  1757        switch (STBI__COMBO(img_n, req_comp)) {
  1758           STBI__CASE(1,2) { dest[0]=src[0]; dest[1]=255;                                     } break;
  1759           STBI__CASE(1,3) { dest[0]=dest[1]=dest[2]=src[0];                                  } break;
  1760           STBI__CASE(1,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=255;                     } break;
  1761           STBI__CASE(2,1) { dest[0]=src[0];                                                  } break;
  1762           STBI__CASE(2,3) { dest[0]=dest[1]=dest[2]=src[0];                                  } break;
  1763           STBI__CASE(2,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=src[1];                  } break;
  1764           STBI__CASE(3,4) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2];dest[3]=255;        } break;
  1765           STBI__CASE(3,1) { dest[0]=stbi__compute_y(src[0],src[1],src[2]);                   } break;
  1766           STBI__CASE(3,2) { dest[0]=stbi__compute_y(src[0],src[1],src[2]); dest[1] = 255;    } break;
  1767           STBI__CASE(4,1) { dest[0]=stbi__compute_y(src[0],src[1],src[2]);                   } break;
  1768           STBI__CASE(4,2) { dest[0]=stbi__compute_y(src[0],src[1],src[2]); dest[1] = src[3]; } break;
  1769           STBI__CASE(4,3) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2];                    } break;
  1770           default: STBI_ASSERT(0); STBI_FREE(data); STBI_FREE(good); return stbi__errpuc("unsupported", "Unsupported format conversion");
  1771        }
  1772        #undef STBI__CASE
  1773     }
  1774  
  1775     STBI_FREE(data);
  1776     return good;
  1777  }
  1778  #endif
  1779  
  1780  #if defined(STBI_NO_PNG) && defined(STBI_NO_PSD)
  1781  // nothing
  1782  #else
  1783  static stbi__uint16 stbi__compute_y_16(int r, int g, int b)
  1784  {
  1785     return (stbi__uint16) (((r*77) + (g*150) +  (29*b)) >> 8);
  1786  }
  1787  #endif
  1788  
  1789  #if defined(STBI_NO_PNG) && defined(STBI_NO_PSD)
  1790  // nothing
  1791  #else
  1792  static stbi__uint16 *stbi__convert_format16(stbi__uint16 *data, int img_n, int req_comp, unsigned int x, unsigned int y)
  1793  {
  1794     int i,j;
  1795     stbi__uint16 *good;
  1796  
  1797     if (req_comp == img_n) return data;
  1798     STBI_ASSERT(req_comp >= 1 && req_comp <= 4);
  1799  
  1800     good = (stbi__uint16 *) stbi__malloc(req_comp * x * y * 2);
  1801     if (good == NULL) {
  1802        STBI_FREE(data);
  1803        return (stbi__uint16 *) stbi__errpuc("outofmem", "Out of memory");
  1804     }
  1805  
  1806     for (j=0; j < (int) y; ++j) {
  1807        stbi__uint16 *src  = data + j * x * img_n   ;
  1808        stbi__uint16 *dest = good + j * x * req_comp;
  1809  
  1810        #define STBI__COMBO(a,b)  ((a)*8+(b))
  1811        #define STBI__CASE(a,b)   case STBI__COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
  1812        // convert source image with img_n components to one with req_comp components;
  1813        // avoid switch per pixel, so use switch per scanline and massive macros
  1814        switch (STBI__COMBO(img_n, req_comp)) {
  1815           STBI__CASE(1,2) { dest[0]=src[0]; dest[1]=0xffff;                                     } break;
  1816           STBI__CASE(1,3) { dest[0]=dest[1]=dest[2]=src[0];                                     } break;
  1817           STBI__CASE(1,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=0xffff;                     } break;
  1818           STBI__CASE(2,1) { dest[0]=src[0];                                                     } break;
  1819           STBI__CASE(2,3) { dest[0]=dest[1]=dest[2]=src[0];                                     } break;
  1820           STBI__CASE(2,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=src[1];                     } break;
  1821           STBI__CASE(3,4) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2];dest[3]=0xffff;        } break;
  1822           STBI__CASE(3,1) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]);                   } break;
  1823           STBI__CASE(3,2) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]); dest[1] = 0xffff; } break;
  1824           STBI__CASE(4,1) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]);                   } break;
  1825           STBI__CASE(4,2) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]); dest[1] = src[3]; } break;
  1826           STBI__CASE(4,3) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2];                       } break;
  1827           default: STBI_ASSERT(0); STBI_FREE(data); STBI_FREE(good); return (stbi__uint16*) stbi__errpuc("unsupported", "Unsupported format conversion");
  1828        }
  1829        #undef STBI__CASE
  1830     }
  1831  
  1832     STBI_FREE(data);
  1833     return good;
  1834  }
  1835  #endif
  1836  
  1837  #ifndef STBI_NO_LINEAR
  1838  static float   *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp)
  1839  {
  1840     int i,k,n;
  1841     float *output;
  1842     if (!data) return NULL;
  1843     output = (float *) stbi__malloc_mad4(x, y, comp, sizeof(float), 0);
  1844     if (output == NULL) { STBI_FREE(data); return stbi__errpf("outofmem", "Out of memory"); }
  1845     // compute number of non-alpha components
  1846     if (comp & 1) n = comp; else n = comp-1;
  1847     for (i=0; i < x*y; ++i) {
  1848        for (k=0; k < n; ++k) {
  1849           output[i*comp + k] = (float) (pow(data[i*comp+k]/255.0f, stbi__l2h_gamma) * stbi__l2h_scale);
  1850        }
  1851     }
  1852     if (n < comp) {
  1853        for (i=0; i < x*y; ++i) {
  1854           output[i*comp + n] = data[i*comp + n]/255.0f;
  1855        }
  1856     }
  1857     STBI_FREE(data);
  1858     return output;
  1859  }
  1860  #endif
  1861  
  1862  #ifndef STBI_NO_HDR
  1863  #define stbi__float2int(x)   ((int) (x))
  1864  static stbi_uc *stbi__hdr_to_ldr(float   *data, int x, int y, int comp)
  1865  {
  1866     int i,k,n;
  1867     stbi_uc *output;
  1868     if (!data) return NULL;
  1869     output = (stbi_uc *) stbi__malloc_mad3(x, y, comp, 0);
  1870     if (output == NULL) { STBI_FREE(data); return stbi__errpuc("outofmem", "Out of memory"); }
  1871     // compute number of non-alpha components
  1872     if (comp & 1) n = comp; else n = comp-1;
  1873     for (i=0; i < x*y; ++i) {
  1874        for (k=0; k < n; ++k) {
  1875           float z = (float) pow(data[i*comp+k]*stbi__h2l_scale_i, stbi__h2l_gamma_i) * 255 + 0.5f;
  1876           if (z < 0) z = 0;
  1877           if (z > 255) z = 255;
  1878           output[i*comp + k] = (stbi_uc) stbi__float2int(z);
  1879        }
  1880        if (k < comp) {
  1881           float z = data[i*comp+k] * 255 + 0.5f;
  1882           if (z < 0) z = 0;
  1883           if (z > 255) z = 255;
  1884           output[i*comp + k] = (stbi_uc) stbi__float2int(z);
  1885        }
  1886     }
  1887     STBI_FREE(data);
  1888     return output;
  1889  }
  1890  #endif
  1891  
  1892  //////////////////////////////////////////////////////////////////////////////
  1893  //
  1894  //  "baseline" JPEG/JFIF decoder
  1895  //
  1896  //    simple implementation
  1897  //      - doesn't support delayed output of y-dimension
  1898  //      - simple interface (only one output format: 8-bit interleaved RGB)
  1899  //      - doesn't try to recover corrupt jpegs
  1900  //      - doesn't allow partial loading, loading multiple at once
  1901  //      - still fast on x86 (copying globals into locals doesn't help x86)
  1902  //      - allocates lots of intermediate memory (full size of all components)
  1903  //        - non-interleaved case requires this anyway
  1904  //        - allows good upsampling (see next)
  1905  //    high-quality
  1906  //      - upsampled channels are bilinearly interpolated, even across blocks
  1907  //      - quality integer IDCT derived from IJG's 'slow'
  1908  //    performance
  1909  //      - fast huffman; reasonable integer IDCT
  1910  //      - some SIMD kernels for common paths on targets with SSE2/NEON
  1911  //      - uses a lot of intermediate memory, could cache poorly
  1912  
  1913  #ifndef STBI_NO_JPEG
  1914  
  1915  // huffman decoding acceleration
  1916  #define FAST_BITS   9  // larger handles more cases; smaller stomps less cache
  1917  
  1918  typedef struct
  1919  {
  1920     stbi_uc  fast[1 << FAST_BITS];
  1921     // weirdly, repacking this into AoS is a 10% speed loss, instead of a win
  1922     stbi__uint16 code[256];
  1923     stbi_uc  values[256];
  1924     stbi_uc  size[257];
  1925     unsigned int maxcode[18];
  1926     int    delta[17];   // old 'firstsymbol' - old 'firstcode'
  1927  } stbi__huffman;
  1928  
  1929  typedef struct
  1930  {
  1931     stbi__context *s;
  1932     stbi__huffman huff_dc[4];
  1933     stbi__huffman huff_ac[4];
  1934     stbi__uint16 dequant[4][64];
  1935     stbi__int16 fast_ac[4][1 << FAST_BITS];
  1936  
  1937  // sizes for components, interleaved MCUs
  1938     int img_h_max, img_v_max;
  1939     int img_mcu_x, img_mcu_y;
  1940     int img_mcu_w, img_mcu_h;
  1941  
  1942  // definition of jpeg image component
  1943     struct
  1944     {
  1945        int id;
  1946        int h,v;
  1947        int tq;
  1948        int hd,ha;
  1949        int dc_pred;
  1950  
  1951        int x,y,w2,h2;
  1952        stbi_uc *data;
  1953        void *raw_data, *raw_coeff;
  1954        stbi_uc *linebuf;
  1955        short   *coeff;   // progressive only
  1956        int      coeff_w, coeff_h; // number of 8x8 coefficient blocks
  1957     } img_comp[4];
  1958  
  1959     stbi__uint32   code_buffer; // jpeg entropy-coded buffer
  1960     int            code_bits;   // number of valid bits
  1961     unsigned char  marker;      // marker seen while filling entropy buffer
  1962     int            nomore;      // flag if we saw a marker so must stop
  1963  
  1964     int            progressive;
  1965     int            spec_start;
  1966     int            spec_end;
  1967     int            succ_high;
  1968     int            succ_low;
  1969     int            eob_run;
  1970     int            jfif;
  1971     int            app14_color_transform; // Adobe APP14 tag
  1972     int            rgb;
  1973  
  1974     int scan_n, order[4];
  1975     int restart_interval, todo;
  1976  
  1977  // kernels
  1978     void (*idct_block_kernel)(stbi_uc *out, int out_stride, short data[64]);
  1979     void (*YCbCr_to_RGB_kernel)(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step);
  1980     stbi_uc *(*resample_row_hv_2_kernel)(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs);
  1981  } stbi__jpeg;
  1982  
  1983  static int stbi__build_huffman(stbi__huffman *h, int *count)
  1984  {
  1985     int i,j,k=0;
  1986     unsigned int code;
  1987     // build size list for each symbol (from JPEG spec)
  1988     for (i=0; i < 16; ++i)
  1989        for (j=0; j < count[i]; ++j)
  1990           h->size[k++] = (stbi_uc) (i+1);
  1991     h->size[k] = 0;
  1992  
  1993     // compute actual symbols (from jpeg spec)
  1994     code = 0;
  1995     k = 0;
  1996     for(j=1; j <= 16; ++j) {
  1997        // compute delta to add to code to compute symbol id
  1998        h->delta[j] = k - code;
  1999        if (h->size[k] == j) {
  2000           while (h->size[k] == j)
  2001              h->code[k++] = (stbi__uint16) (code++);
  2002           if (code-1 >= (1u << j)) return stbi__err("bad code lengths","Corrupt JPEG");
  2003        }
  2004        // compute largest code + 1 for this size, preshifted as needed later
  2005        h->maxcode[j] = code << (16-j);
  2006        code <<= 1;
  2007     }
  2008     h->maxcode[j] = 0xffffffff;
  2009  
  2010     // build non-spec acceleration table; 255 is flag for not-accelerated
  2011     memset(h->fast, 255, 1 << FAST_BITS);
  2012     for (i=0; i < k; ++i) {
  2013        int s = h->size[i];
  2014        if (s <= FAST_BITS) {
  2015           int c = h->code[i] << (FAST_BITS-s);
  2016           int m = 1 << (FAST_BITS-s);
  2017           for (j=0; j < m; ++j) {
  2018              h->fast[c+j] = (stbi_uc) i;
  2019           }
  2020        }
  2021     }
  2022     return 1;
  2023  }
  2024  
  2025  // build a table that decodes both magnitude and value of small ACs in
  2026  // one go.
  2027  static void stbi__build_fast_ac(stbi__int16 *fast_ac, stbi__huffman *h)
  2028  {
  2029     int i;
  2030     for (i=0; i < (1 << FAST_BITS); ++i) {
  2031        stbi_uc fast = h->fast[i];
  2032        fast_ac[i] = 0;
  2033        if (fast < 255) {
  2034           int rs = h->values[fast];
  2035           int run = (rs >> 4) & 15;
  2036           int magbits = rs & 15;
  2037           int len = h->size[fast];
  2038  
  2039           if (magbits && len + magbits <= FAST_BITS) {
  2040              // magnitude code followed by receive_extend code
  2041              int k = ((i << len) & ((1 << FAST_BITS) - 1)) >> (FAST_BITS - magbits);
  2042              int m = 1 << (magbits - 1);
  2043              if (k < m) k += (~0U << magbits) + 1;
  2044              // if the result is small enough, we can fit it in fast_ac table
  2045              if (k >= -128 && k <= 127)
  2046                 fast_ac[i] = (stbi__int16) ((k * 256) + (run * 16) + (len + magbits));
  2047           }
  2048        }
  2049     }
  2050  }
  2051  
  2052  static void stbi__grow_buffer_unsafe(stbi__jpeg *j)
  2053  {
  2054     do {
  2055        unsigned int b = j->nomore ? 0 : stbi__get8(j->s);
  2056        if (b == 0xff) {
  2057           int c = stbi__get8(j->s);
  2058           while (c == 0xff) c = stbi__get8(j->s); // consume fill bytes
  2059           if (c != 0) {
  2060              j->marker = (unsigned char) c;
  2061              j->nomore = 1;
  2062              return;
  2063           }
  2064        }
  2065        j->code_buffer |= b << (24 - j->code_bits);
  2066        j->code_bits += 8;
  2067     } while (j->code_bits <= 24);
  2068  }
  2069  
  2070  // (1 << n) - 1
  2071  static const stbi__uint32 stbi__bmask[17]={0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535};
  2072  
  2073  // decode a jpeg huffman value from the bitstream
  2074  stbi_inline static int stbi__jpeg_huff_decode(stbi__jpeg *j, stbi__huffman *h)
  2075  {
  2076     unsigned int temp;
  2077     int c,k;
  2078  
  2079     if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
  2080  
  2081     // look at the top FAST_BITS and determine what symbol ID it is,
  2082     // if the code is <= FAST_BITS
  2083     c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
  2084     k = h->fast[c];
  2085     if (k < 255) {
  2086        int s = h->size[k];
  2087        if (s > j->code_bits)
  2088           return -1;
  2089        j->code_buffer <<= s;
  2090        j->code_bits -= s;
  2091        return h->values[k];
  2092     }
  2093  
  2094     // naive test is to shift the code_buffer down so k bits are
  2095     // valid, then test against maxcode. To speed this up, we've
  2096     // preshifted maxcode left so that it has (16-k) 0s at the
  2097     // end; in other words, regardless of the number of bits, it
  2098     // wants to be compared against something shifted to have 16;
  2099     // that way we don't need to shift inside the loop.
  2100     temp = j->code_buffer >> 16;
  2101     for (k=FAST_BITS+1 ; ; ++k)
  2102        if (temp < h->maxcode[k])
  2103           break;
  2104     if (k == 17) {
  2105        // error! code not found
  2106        j->code_bits -= 16;
  2107        return -1;
  2108     }
  2109  
  2110     if (k > j->code_bits)
  2111        return -1;
  2112  
  2113     // convert the huffman code to the symbol id
  2114     c = ((j->code_buffer >> (32 - k)) & stbi__bmask[k]) + h->delta[k];
  2115     STBI_ASSERT((((j->code_buffer) >> (32 - h->size[c])) & stbi__bmask[h->size[c]]) == h->code[c]);
  2116  
  2117     // convert the id to a symbol
  2118     j->code_bits -= k;
  2119     j->code_buffer <<= k;
  2120     return h->values[c];
  2121  }
  2122  
  2123  // bias[n] = (-1<<n) + 1
  2124  static const int stbi__jbias[16] = {0,-1,-3,-7,-15,-31,-63,-127,-255,-511,-1023,-2047,-4095,-8191,-16383,-32767};
  2125  
  2126  // combined JPEG 'receive' and JPEG 'extend', since baseline
  2127  // always extends everything it receives.
  2128  stbi_inline static int stbi__extend_receive(stbi__jpeg *j, int n)
  2129  {
  2130     unsigned int k;
  2131     int sgn;
  2132     if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
  2133  
  2134     sgn = j->code_buffer >> 31; // sign bit always in MSB; 0 if MSB clear (positive), 1 if MSB set (negative)
  2135     k = stbi_lrot(j->code_buffer, n);
  2136     j->code_buffer = k & ~stbi__bmask[n];
  2137     k &= stbi__bmask[n];
  2138     j->code_bits -= n;
  2139     return k + (stbi__jbias[n] & (sgn - 1));
  2140  }
  2141  
  2142  // get some unsigned bits
  2143  stbi_inline static int stbi__jpeg_get_bits(stbi__jpeg *j, int n)
  2144  {
  2145     unsigned int k;
  2146     if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
  2147     k = stbi_lrot(j->code_buffer, n);
  2148     j->code_buffer = k & ~stbi__bmask[n];
  2149     k &= stbi__bmask[n];
  2150     j->code_bits -= n;
  2151     return k;
  2152  }
  2153  
  2154  stbi_inline static int stbi__jpeg_get_bit(stbi__jpeg *j)
  2155  {
  2156     unsigned int k;
  2157     if (j->code_bits < 1) stbi__grow_buffer_unsafe(j);
  2158     k = j->code_buffer;
  2159     j->code_buffer <<= 1;
  2160     --j->code_bits;
  2161     return k & 0x80000000;
  2162  }
  2163  
  2164  // given a value that's at position X in the zigzag stream,
  2165  // where does it appear in the 8x8 matrix coded as row-major?
  2166  static const stbi_uc stbi__jpeg_dezigzag[64+15] =
  2167  {
  2168      0,  1,  8, 16,  9,  2,  3, 10,
  2169     17, 24, 32, 25, 18, 11,  4,  5,
  2170     12, 19, 26, 33, 40, 48, 41, 34,
  2171     27, 20, 13,  6,  7, 14, 21, 28,
  2172     35, 42, 49, 56, 57, 50, 43, 36,
  2173     29, 22, 15, 23, 30, 37, 44, 51,
  2174     58, 59, 52, 45, 38, 31, 39, 46,
  2175     53, 60, 61, 54, 47, 55, 62, 63,
  2176     // let corrupt input sample past end
  2177     63, 63, 63, 63, 63, 63, 63, 63,
  2178     63, 63, 63, 63, 63, 63, 63
  2179  };
  2180  
  2181  // decode one 64-entry block--
  2182  static int stbi__jpeg_decode_block(stbi__jpeg *j, short data[64], stbi__huffman *hdc, stbi__huffman *hac, stbi__int16 *fac, int b, stbi__uint16 *dequant)
  2183  {
  2184     int diff,dc,k;
  2185     int t;
  2186  
  2187     if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
  2188     t = stbi__jpeg_huff_decode(j, hdc);
  2189     if (t < 0 || t > 15) return stbi__err("bad huffman code","Corrupt JPEG");
  2190  
  2191     // 0 all the ac values now so we can do it 32-bits at a time
  2192     memset(data,0,64*sizeof(data[0]));
  2193  
  2194     diff = t ? stbi__extend_receive(j, t) : 0;
  2195     dc = j->img_comp[b].dc_pred + diff;
  2196     j->img_comp[b].dc_pred = dc;
  2197     data[0] = (short) (dc * dequant[0]);
  2198  
  2199     // decode AC components, see JPEG spec
  2200     k = 1;
  2201     do {
  2202        unsigned int zig;
  2203        int c,r,s;
  2204        if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
  2205        c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
  2206        r = fac[c];
  2207        if (r) { // fast-AC path
  2208           k += (r >> 4) & 15; // run
  2209           s = r & 15; // combined length
  2210           j->code_buffer <<= s;
  2211           j->code_bits -= s;
  2212           // decode into unzigzag'd location
  2213           zig = stbi__jpeg_dezigzag[k++];
  2214           data[zig] = (short) ((r >> 8) * dequant[zig]);
  2215        } else {
  2216           int rs = stbi__jpeg_huff_decode(j, hac);
  2217           if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
  2218           s = rs & 15;
  2219           r = rs >> 4;
  2220           if (s == 0) {
  2221              if (rs != 0xf0) break; // end block
  2222              k += 16;
  2223           } else {
  2224              k += r;
  2225              // decode into unzigzag'd location
  2226              zig = stbi__jpeg_dezigzag[k++];
  2227              data[zig] = (short) (stbi__extend_receive(j,s) * dequant[zig]);
  2228           }
  2229        }
  2230     } while (k < 64);
  2231     return 1;
  2232  }
  2233  
  2234  static int stbi__jpeg_decode_block_prog_dc(stbi__jpeg *j, short data[64], stbi__huffman *hdc, int b)
  2235  {
  2236     int diff,dc;
  2237     int t;
  2238     if (j->spec_end != 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
  2239  
  2240     if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
  2241  
  2242     if (j->succ_high == 0) {
  2243        // first scan for DC coefficient, must be first
  2244        memset(data,0,64*sizeof(data[0])); // 0 all the ac values now
  2245        t = stbi__jpeg_huff_decode(j, hdc);
  2246        if (t < 0 || t > 15) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
  2247        diff = t ? stbi__extend_receive(j, t) : 0;
  2248  
  2249        dc = j->img_comp[b].dc_pred + diff;
  2250        j->img_comp[b].dc_pred = dc;
  2251        data[0] = (short) (dc * (1 << j->succ_low));
  2252     } else {
  2253        // refinement scan for DC coefficient
  2254        if (stbi__jpeg_get_bit(j))
  2255           data[0] += (short) (1 << j->succ_low);
  2256     }
  2257     return 1;
  2258  }
  2259  
  2260  // @OPTIMIZE: store non-zigzagged during the decode passes,
  2261  // and only de-zigzag when dequantizing
  2262  static int stbi__jpeg_decode_block_prog_ac(stbi__jpeg *j, short data[64], stbi__huffman *hac, stbi__int16 *fac)
  2263  {
  2264     int k;
  2265     if (j->spec_start == 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
  2266  
  2267     if (j->succ_high == 0) {
  2268        int shift = j->succ_low;
  2269  
  2270        if (j->eob_run) {
  2271           --j->eob_run;
  2272           return 1;
  2273        }
  2274  
  2275        k = j->spec_start;
  2276        do {
  2277           unsigned int zig;
  2278           int c,r,s;
  2279           if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
  2280           c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
  2281           r = fac[c];
  2282           if (r) { // fast-AC path
  2283              k += (r >> 4) & 15; // run
  2284              s = r & 15; // combined length
  2285              j->code_buffer <<= s;
  2286              j->code_bits -= s;
  2287              zig = stbi__jpeg_dezigzag[k++];
  2288              data[zig] = (short) ((r >> 8) * (1 << shift));
  2289           } else {
  2290              int rs = stbi__jpeg_huff_decode(j, hac);
  2291              if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
  2292              s = rs & 15;
  2293              r = rs >> 4;
  2294              if (s == 0) {
  2295                 if (r < 15) {
  2296                    j->eob_run = (1 << r);
  2297                    if (r)
  2298                       j->eob_run += stbi__jpeg_get_bits(j, r);
  2299                    --j->eob_run;
  2300                    break;
  2301                 }
  2302                 k += 16;
  2303              } else {
  2304                 k += r;
  2305                 zig = stbi__jpeg_dezigzag[k++];
  2306                 data[zig] = (short) (stbi__extend_receive(j,s) * (1 << shift));
  2307              }
  2308           }
  2309        } while (k <= j->spec_end);
  2310     } else {
  2311        // refinement scan for these AC coefficients
  2312  
  2313        short bit = (short) (1 << j->succ_low);
  2314  
  2315        if (j->eob_run) {
  2316           --j->eob_run;
  2317           for (k = j->spec_start; k <= j->spec_end; ++k) {
  2318              short *p = &data[stbi__jpeg_dezigzag[k]];
  2319              if (*p != 0)
  2320                 if (stbi__jpeg_get_bit(j))
  2321                    if ((*p & bit)==0) {
  2322                       if (*p > 0)
  2323                          *p += bit;
  2324                       else
  2325                          *p -= bit;
  2326                    }
  2327           }
  2328        } else {
  2329           k = j->spec_start;
  2330           do {
  2331              int r,s;
  2332              int rs = stbi__jpeg_huff_decode(j, hac); // @OPTIMIZE see if we can use the fast path here, advance-by-r is so slow, eh
  2333              if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
  2334              s = rs & 15;
  2335              r = rs >> 4;
  2336              if (s == 0) {
  2337                 if (r < 15) {
  2338                    j->eob_run = (1 << r) - 1;
  2339                    if (r)
  2340                       j->eob_run += stbi__jpeg_get_bits(j, r);
  2341                    r = 64; // force end of block
  2342                 } else {
  2343                    // r=15 s=0 should write 16 0s, so we just do
  2344                    // a run of 15 0s and then write s (which is 0),
  2345                    // so we don't have to do anything special here
  2346                 }
  2347              } else {
  2348                 if (s != 1) return stbi__err("bad huffman code", "Corrupt JPEG");
  2349                 // sign bit
  2350                 if (stbi__jpeg_get_bit(j))
  2351                    s = bit;
  2352                 else
  2353                    s = -bit;
  2354              }
  2355  
  2356              // advance by r
  2357              while (k <= j->spec_end) {
  2358                 short *p = &data[stbi__jpeg_dezigzag[k++]];
  2359                 if (*p != 0) {
  2360                    if (stbi__jpeg_get_bit(j))
  2361                       if ((*p & bit)==0) {
  2362                          if (*p > 0)
  2363                             *p += bit;
  2364                          else
  2365                             *p -= bit;
  2366                       }
  2367                 } else {
  2368                    if (r == 0) {
  2369                       *p = (short) s;
  2370                       break;
  2371                    }
  2372                    --r;
  2373                 }
  2374              }
  2375           } while (k <= j->spec_end);
  2376        }
  2377     }
  2378     return 1;
  2379  }
  2380  
  2381  // take a -128..127 value and stbi__clamp it and convert to 0..255
  2382  stbi_inline static stbi_uc stbi__clamp(int x)
  2383  {
  2384     // trick to use a single test to catch both cases
  2385     if ((unsigned int) x > 255) {
  2386        if (x < 0) return 0;
  2387        if (x > 255) return 255;
  2388     }
  2389     return (stbi_uc) x;
  2390  }
  2391  
  2392  #define stbi__f2f(x)  ((int) (((x) * 4096 + 0.5)))
  2393  #define stbi__fsh(x)  ((x) * 4096)
  2394  
  2395  // derived from jidctint -- DCT_ISLOW
  2396  #define STBI__IDCT_1D(s0,s1,s2,s3,s4,s5,s6,s7) \
  2397     int t0,t1,t2,t3,p1,p2,p3,p4,p5,x0,x1,x2,x3; \
  2398     p2 = s2;                                    \
  2399     p3 = s6;                                    \
  2400     p1 = (p2+p3) * stbi__f2f(0.5411961f);       \
  2401     t2 = p1 + p3*stbi__f2f(-1.847759065f);      \
  2402     t3 = p1 + p2*stbi__f2f( 0.765366865f);      \
  2403     p2 = s0;                                    \
  2404     p3 = s4;                                    \
  2405     t0 = stbi__fsh(p2+p3);                      \
  2406     t1 = stbi__fsh(p2-p3);                      \
  2407     x0 = t0+t3;                                 \
  2408     x3 = t0-t3;                                 \
  2409     x1 = t1+t2;                                 \
  2410     x2 = t1-t2;                                 \
  2411     t0 = s7;                                    \
  2412     t1 = s5;                                    \
  2413     t2 = s3;                                    \
  2414     t3 = s1;                                    \
  2415     p3 = t0+t2;                                 \
  2416     p4 = t1+t3;                                 \
  2417     p1 = t0+t3;                                 \
  2418     p2 = t1+t2;                                 \
  2419     p5 = (p3+p4)*stbi__f2f( 1.175875602f);      \
  2420     t0 = t0*stbi__f2f( 0.298631336f);           \
  2421     t1 = t1*stbi__f2f( 2.053119869f);           \
  2422     t2 = t2*stbi__f2f( 3.072711026f);           \
  2423     t3 = t3*stbi__f2f( 1.501321110f);           \
  2424     p1 = p5 + p1*stbi__f2f(-0.899976223f);      \
  2425     p2 = p5 + p2*stbi__f2f(-2.562915447f);      \
  2426     p3 = p3*stbi__f2f(-1.961570560f);           \
  2427     p4 = p4*stbi__f2f(-0.390180644f);           \
  2428     t3 += p1+p4;                                \
  2429     t2 += p2+p3;                                \
  2430     t1 += p2+p4;                                \
  2431     t0 += p1+p3;
  2432  
  2433  static void stbi__idct_block(stbi_uc *out, int out_stride, short data[64])
  2434  {
  2435     int i,val[64],*v=val;
  2436     stbi_uc *o;
  2437     short *d = data;
  2438  
  2439     // columns
  2440     for (i=0; i < 8; ++i,++d, ++v) {
  2441        // if all zeroes, shortcut -- this avoids dequantizing 0s and IDCTing
  2442        if (d[ 8]==0 && d[16]==0 && d[24]==0 && d[32]==0
  2443             && d[40]==0 && d[48]==0 && d[56]==0) {
  2444           //    no shortcut                 0     seconds
  2445           //    (1|2|3|4|5|6|7)==0          0     seconds
  2446           //    all separate               -0.047 seconds
  2447           //    1 && 2|3 && 4|5 && 6|7:    -0.047 seconds
  2448           int dcterm = d[0]*4;
  2449           v[0] = v[8] = v[16] = v[24] = v[32] = v[40] = v[48] = v[56] = dcterm;
  2450        } else {
  2451           STBI__IDCT_1D(d[ 0],d[ 8],d[16],d[24],d[32],d[40],d[48],d[56])
  2452           // constants scaled things up by 1<<12; let's bring them back
  2453           // down, but keep 2 extra bits of precision
  2454           x0 += 512; x1 += 512; x2 += 512; x3 += 512;
  2455           v[ 0] = (x0+t3) >> 10;
  2456           v[56] = (x0-t3) >> 10;
  2457           v[ 8] = (x1+t2) >> 10;
  2458           v[48] = (x1-t2) >> 10;
  2459           v[16] = (x2+t1) >> 10;
  2460           v[40] = (x2-t1) >> 10;
  2461           v[24] = (x3+t0) >> 10;
  2462           v[32] = (x3-t0) >> 10;
  2463        }
  2464     }
  2465  
  2466     for (i=0, v=val, o=out; i < 8; ++i,v+=8,o+=out_stride) {
  2467        // no fast case since the first 1D IDCT spread components out
  2468        STBI__IDCT_1D(v[0],v[1],v[2],v[3],v[4],v[5],v[6],v[7])
  2469        // constants scaled things up by 1<<12, plus we had 1<<2 from first
  2470        // loop, plus horizontal and vertical each scale by sqrt(8) so together
  2471        // we've got an extra 1<<3, so 1<<17 total we need to remove.
  2472        // so we want to round that, which means adding 0.5 * 1<<17,
  2473        // aka 65536. Also, we'll end up with -128 to 127 that we want
  2474        // to encode as 0..255 by adding 128, so we'll add that before the shift
  2475        x0 += 65536 + (128<<17);
  2476        x1 += 65536 + (128<<17);
  2477        x2 += 65536 + (128<<17);
  2478        x3 += 65536 + (128<<17);
  2479        // tried computing the shifts into temps, or'ing the temps to see
  2480        // if any were out of range, but that was slower
  2481        o[0] = stbi__clamp((x0+t3) >> 17);
  2482        o[7] = stbi__clamp((x0-t3) >> 17);
  2483        o[1] = stbi__clamp((x1+t2) >> 17);
  2484        o[6] = stbi__clamp((x1-t2) >> 17);
  2485        o[2] = stbi__clamp((x2+t1) >> 17);
  2486        o[5] = stbi__clamp((x2-t1) >> 17);
  2487        o[3] = stbi__clamp((x3+t0) >> 17);
  2488        o[4] = stbi__clamp((x3-t0) >> 17);
  2489     }
  2490  }
  2491  
  2492  #ifdef STBI_SSE2
  2493  // sse2 integer IDCT. not the fastest possible implementation but it
  2494  // produces bit-identical results to the generic C version so it's
  2495  // fully "transparent".
  2496  static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
  2497  {
  2498     // This is constructed to match our regular (generic) integer IDCT exactly.
  2499     __m128i row0, row1, row2, row3, row4, row5, row6, row7;
  2500     __m128i tmp;
  2501  
  2502     // dot product constant: even elems=x, odd elems=y
  2503     #define dct_const(x,y)  _mm_setr_epi16((x),(y),(x),(y),(x),(y),(x),(y))
  2504  
  2505     // out(0) = c0[even]*x + c0[odd]*y   (c0, x, y 16-bit, out 32-bit)
  2506     // out(1) = c1[even]*x + c1[odd]*y
  2507     #define dct_rot(out0,out1, x,y,c0,c1) \
  2508        __m128i c0##lo = _mm_unpacklo_epi16((x),(y)); \
  2509        __m128i c0##hi = _mm_unpackhi_epi16((x),(y)); \
  2510        __m128i out0##_l = _mm_madd_epi16(c0##lo, c0); \
  2511        __m128i out0##_h = _mm_madd_epi16(c0##hi, c0); \
  2512        __m128i out1##_l = _mm_madd_epi16(c0##lo, c1); \
  2513        __m128i out1##_h = _mm_madd_epi16(c0##hi, c1)
  2514  
  2515     // out = in << 12  (in 16-bit, out 32-bit)
  2516     #define dct_widen(out, in) \
  2517        __m128i out##_l = _mm_srai_epi32(_mm_unpacklo_epi16(_mm_setzero_si128(), (in)), 4); \
  2518        __m128i out##_h = _mm_srai_epi32(_mm_unpackhi_epi16(_mm_setzero_si128(), (in)), 4)
  2519  
  2520     // wide add
  2521     #define dct_wadd(out, a, b) \
  2522        __m128i out##_l = _mm_add_epi32(a##_l, b##_l); \
  2523        __m128i out##_h = _mm_add_epi32(a##_h, b##_h)
  2524  
  2525     // wide sub
  2526     #define dct_wsub(out, a, b) \
  2527        __m128i out##_l = _mm_sub_epi32(a##_l, b##_l); \
  2528        __m128i out##_h = _mm_sub_epi32(a##_h, b##_h)
  2529  
  2530     // butterfly a/b, add bias, then shift by "s" and pack
  2531     #define dct_bfly32o(out0, out1, a,b,bias,s) \
  2532        { \
  2533           __m128i abiased_l = _mm_add_epi32(a##_l, bias); \
  2534           __m128i abiased_h = _mm_add_epi32(a##_h, bias); \
  2535           dct_wadd(sum, abiased, b); \
  2536           dct_wsub(dif, abiased, b); \
  2537           out0 = _mm_packs_epi32(_mm_srai_epi32(sum_l, s), _mm_srai_epi32(sum_h, s)); \
  2538           out1 = _mm_packs_epi32(_mm_srai_epi32(dif_l, s), _mm_srai_epi32(dif_h, s)); \
  2539        }
  2540  
  2541     // 8-bit interleave step (for transposes)
  2542     #define dct_interleave8(a, b) \
  2543        tmp = a; \
  2544        a = _mm_unpacklo_epi8(a, b); \
  2545        b = _mm_unpackhi_epi8(tmp, b)
  2546  
  2547     // 16-bit interleave step (for transposes)
  2548     #define dct_interleave16(a, b) \
  2549        tmp = a; \
  2550        a = _mm_unpacklo_epi16(a, b); \
  2551        b = _mm_unpackhi_epi16(tmp, b)
  2552  
  2553     #define dct_pass(bias,shift) \
  2554        { \
  2555           /* even part */ \
  2556           dct_rot(t2e,t3e, row2,row6, rot0_0,rot0_1); \
  2557           __m128i sum04 = _mm_add_epi16(row0, row4); \
  2558           __m128i dif04 = _mm_sub_epi16(row0, row4); \
  2559           dct_widen(t0e, sum04); \
  2560           dct_widen(t1e, dif04); \
  2561           dct_wadd(x0, t0e, t3e); \
  2562           dct_wsub(x3, t0e, t3e); \
  2563           dct_wadd(x1, t1e, t2e); \
  2564           dct_wsub(x2, t1e, t2e); \
  2565           /* odd part */ \
  2566           dct_rot(y0o,y2o, row7,row3, rot2_0,rot2_1); \
  2567           dct_rot(y1o,y3o, row5,row1, rot3_0,rot3_1); \
  2568           __m128i sum17 = _mm_add_epi16(row1, row7); \
  2569           __m128i sum35 = _mm_add_epi16(row3, row5); \
  2570           dct_rot(y4o,y5o, sum17,sum35, rot1_0,rot1_1); \
  2571           dct_wadd(x4, y0o, y4o); \
  2572           dct_wadd(x5, y1o, y5o); \
  2573           dct_wadd(x6, y2o, y5o); \
  2574           dct_wadd(x7, y3o, y4o); \
  2575           dct_bfly32o(row0,row7, x0,x7,bias,shift); \
  2576           dct_bfly32o(row1,row6, x1,x6,bias,shift); \
  2577           dct_bfly32o(row2,row5, x2,x5,bias,shift); \
  2578           dct_bfly32o(row3,row4, x3,x4,bias,shift); \
  2579        }
  2580  
  2581     __m128i rot0_0 = dct_const(stbi__f2f(0.5411961f), stbi__f2f(0.5411961f) + stbi__f2f(-1.847759065f));
  2582     __m128i rot0_1 = dct_const(stbi__f2f(0.5411961f) + stbi__f2f( 0.765366865f), stbi__f2f(0.5411961f));
  2583     __m128i rot1_0 = dct_const(stbi__f2f(1.175875602f) + stbi__f2f(-0.899976223f), stbi__f2f(1.175875602f));
  2584     __m128i rot1_1 = dct_const(stbi__f2f(1.175875602f), stbi__f2f(1.175875602f) + stbi__f2f(-2.562915447f));
  2585     __m128i rot2_0 = dct_const(stbi__f2f(-1.961570560f) + stbi__f2f( 0.298631336f), stbi__f2f(-1.961570560f));
  2586     __m128i rot2_1 = dct_const(stbi__f2f(-1.961570560f), stbi__f2f(-1.961570560f) + stbi__f2f( 3.072711026f));
  2587     __m128i rot3_0 = dct_const(stbi__f2f(-0.390180644f) + stbi__f2f( 2.053119869f), stbi__f2f(-0.390180644f));
  2588     __m128i rot3_1 = dct_const(stbi__f2f(-0.390180644f), stbi__f2f(-0.390180644f) + stbi__f2f( 1.501321110f));
  2589  
  2590     // rounding biases in column/row passes, see stbi__idct_block for explanation.
  2591     __m128i bias_0 = _mm_set1_epi32(512);
  2592     __m128i bias_1 = _mm_set1_epi32(65536 + (128<<17));
  2593  
  2594     // load
  2595     row0 = _mm_load_si128((const __m128i *) (data + 0*8));
  2596     row1 = _mm_load_si128((const __m128i *) (data + 1*8));
  2597     row2 = _mm_load_si128((const __m128i *) (data + 2*8));
  2598     row3 = _mm_load_si128((const __m128i *) (data + 3*8));
  2599     row4 = _mm_load_si128((const __m128i *) (data + 4*8));
  2600     row5 = _mm_load_si128((const __m128i *) (data + 5*8));
  2601     row6 = _mm_load_si128((const __m128i *) (data + 6*8));
  2602     row7 = _mm_load_si128((const __m128i *) (data + 7*8));
  2603  
  2604     // column pass
  2605     dct_pass(bias_0, 10);
  2606  
  2607     {
  2608        // 16bit 8x8 transpose pass 1
  2609        dct_interleave16(row0, row4);
  2610        dct_interleave16(row1, row5);
  2611        dct_interleave16(row2, row6);
  2612        dct_interleave16(row3, row7);
  2613  
  2614        // transpose pass 2
  2615        dct_interleave16(row0, row2);
  2616        dct_interleave16(row1, row3);
  2617        dct_interleave16(row4, row6);
  2618        dct_interleave16(row5, row7);
  2619  
  2620        // transpose pass 3
  2621        dct_interleave16(row0, row1);
  2622        dct_interleave16(row2, row3);
  2623        dct_interleave16(row4, row5);
  2624        dct_interleave16(row6, row7);
  2625     }
  2626  
  2627     // row pass
  2628     dct_pass(bias_1, 17);
  2629  
  2630     {
  2631        // pack
  2632        __m128i p0 = _mm_packus_epi16(row0, row1); // a0a1a2a3...a7b0b1b2b3...b7
  2633        __m128i p1 = _mm_packus_epi16(row2, row3);
  2634        __m128i p2 = _mm_packus_epi16(row4, row5);
  2635        __m128i p3 = _mm_packus_epi16(row6, row7);
  2636  
  2637        // 8bit 8x8 transpose pass 1
  2638        dct_interleave8(p0, p2); // a0e0a1e1...
  2639        dct_interleave8(p1, p3); // c0g0c1g1...
  2640  
  2641        // transpose pass 2
  2642        dct_interleave8(p0, p1); // a0c0e0g0...
  2643        dct_interleave8(p2, p3); // b0d0f0h0...
  2644  
  2645        // transpose pass 3
  2646        dct_interleave8(p0, p2); // a0b0c0d0...
  2647        dct_interleave8(p1, p3); // a4b4c4d4...
  2648  
  2649        // store
  2650        _mm_storel_epi64((__m128i *) out, p0); out += out_stride;
  2651        _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p0, 0x4e)); out += out_stride;
  2652        _mm_storel_epi64((__m128i *) out, p2); out += out_stride;
  2653        _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p2, 0x4e)); out += out_stride;
  2654        _mm_storel_epi64((__m128i *) out, p1); out += out_stride;
  2655        _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p1, 0x4e)); out += out_stride;
  2656        _mm_storel_epi64((__m128i *) out, p3); out += out_stride;
  2657        _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p3, 0x4e));
  2658     }
  2659  
  2660  #undef dct_const
  2661  #undef dct_rot
  2662  #undef dct_widen
  2663  #undef dct_wadd
  2664  #undef dct_wsub
  2665  #undef dct_bfly32o
  2666  #undef dct_interleave8
  2667  #undef dct_interleave16
  2668  #undef dct_pass
  2669  }
  2670  
  2671  #endif // STBI_SSE2
  2672  
  2673  #ifdef STBI_NEON
  2674  
  2675  // NEON integer IDCT. should produce bit-identical
  2676  // results to the generic C version.
  2677  static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
  2678  {
  2679     int16x8_t row0, row1, row2, row3, row4, row5, row6, row7;
  2680  
  2681     int16x4_t rot0_0 = vdup_n_s16(stbi__f2f(0.5411961f));
  2682     int16x4_t rot0_1 = vdup_n_s16(stbi__f2f(-1.847759065f));
  2683     int16x4_t rot0_2 = vdup_n_s16(stbi__f2f( 0.765366865f));
  2684     int16x4_t rot1_0 = vdup_n_s16(stbi__f2f( 1.175875602f));
  2685     int16x4_t rot1_1 = vdup_n_s16(stbi__f2f(-0.899976223f));
  2686     int16x4_t rot1_2 = vdup_n_s16(stbi__f2f(-2.562915447f));
  2687     int16x4_t rot2_0 = vdup_n_s16(stbi__f2f(-1.961570560f));
  2688     int16x4_t rot2_1 = vdup_n_s16(stbi__f2f(-0.390180644f));
  2689     int16x4_t rot3_0 = vdup_n_s16(stbi__f2f( 0.298631336f));
  2690     int16x4_t rot3_1 = vdup_n_s16(stbi__f2f( 2.053119869f));
  2691     int16x4_t rot3_2 = vdup_n_s16(stbi__f2f( 3.072711026f));
  2692     int16x4_t rot3_3 = vdup_n_s16(stbi__f2f( 1.501321110f));
  2693  
  2694  #define dct_long_mul(out, inq, coeff) \
  2695     int32x4_t out##_l = vmull_s16(vget_low_s16(inq), coeff); \
  2696     int32x4_t out##_h = vmull_s16(vget_high_s16(inq), coeff)
  2697  
  2698  #define dct_long_mac(out, acc, inq, coeff) \
  2699     int32x4_t out##_l = vmlal_s16(acc##_l, vget_low_s16(inq), coeff); \
  2700     int32x4_t out##_h = vmlal_s16(acc##_h, vget_high_s16(inq), coeff)
  2701  
  2702  #define dct_widen(out, inq) \
  2703     int32x4_t out##_l = vshll_n_s16(vget_low_s16(inq), 12); \
  2704     int32x4_t out##_h = vshll_n_s16(vget_high_s16(inq), 12)
  2705  
  2706  // wide add
  2707  #define dct_wadd(out, a, b) \
  2708     int32x4_t out##_l = vaddq_s32(a##_l, b##_l); \
  2709     int32x4_t out##_h = vaddq_s32(a##_h, b##_h)
  2710  
  2711  // wide sub
  2712  #define dct_wsub(out, a, b) \
  2713     int32x4_t out##_l = vsubq_s32(a##_l, b##_l); \
  2714     int32x4_t out##_h = vsubq_s32(a##_h, b##_h)
  2715  
  2716  // butterfly a/b, then shift using "shiftop" by "s" and pack
  2717  #define dct_bfly32o(out0,out1, a,b,shiftop,s) \
  2718     { \
  2719        dct_wadd(sum, a, b); \
  2720        dct_wsub(dif, a, b); \
  2721        out0 = vcombine_s16(shiftop(sum_l, s), shiftop(sum_h, s)); \
  2722        out1 = vcombine_s16(shiftop(dif_l, s), shiftop(dif_h, s)); \
  2723     }
  2724  
  2725  #define dct_pass(shiftop, shift) \
  2726     { \
  2727        /* even part */ \
  2728        int16x8_t sum26 = vaddq_s16(row2, row6); \
  2729        dct_long_mul(p1e, sum26, rot0_0); \
  2730        dct_long_mac(t2e, p1e, row6, rot0_1); \
  2731        dct_long_mac(t3e, p1e, row2, rot0_2); \
  2732        int16x8_t sum04 = vaddq_s16(row0, row4); \
  2733        int16x8_t dif04 = vsubq_s16(row0, row4); \
  2734        dct_widen(t0e, sum04); \
  2735        dct_widen(t1e, dif04); \
  2736        dct_wadd(x0, t0e, t3e); \
  2737        dct_wsub(x3, t0e, t3e); \
  2738        dct_wadd(x1, t1e, t2e); \
  2739        dct_wsub(x2, t1e, t2e); \
  2740        /* odd part */ \
  2741        int16x8_t sum15 = vaddq_s16(row1, row5); \
  2742        int16x8_t sum17 = vaddq_s16(row1, row7); \
  2743        int16x8_t sum35 = vaddq_s16(row3, row5); \
  2744        int16x8_t sum37 = vaddq_s16(row3, row7); \
  2745        int16x8_t sumodd = vaddq_s16(sum17, sum35); \
  2746        dct_long_mul(p5o, sumodd, rot1_0); \
  2747        dct_long_mac(p1o, p5o, sum17, rot1_1); \
  2748        dct_long_mac(p2o, p5o, sum35, rot1_2); \
  2749        dct_long_mul(p3o, sum37, rot2_0); \
  2750        dct_long_mul(p4o, sum15, rot2_1); \
  2751        dct_wadd(sump13o, p1o, p3o); \
  2752        dct_wadd(sump24o, p2o, p4o); \
  2753        dct_wadd(sump23o, p2o, p3o); \
  2754        dct_wadd(sump14o, p1o, p4o); \
  2755        dct_long_mac(x4, sump13o, row7, rot3_0); \
  2756        dct_long_mac(x5, sump24o, row5, rot3_1); \
  2757        dct_long_mac(x6, sump23o, row3, rot3_2); \
  2758        dct_long_mac(x7, sump14o, row1, rot3_3); \
  2759        dct_bfly32o(row0,row7, x0,x7,shiftop,shift); \
  2760        dct_bfly32o(row1,row6, x1,x6,shiftop,shift); \
  2761        dct_bfly32o(row2,row5, x2,x5,shiftop,shift); \
  2762        dct_bfly32o(row3,row4, x3,x4,shiftop,shift); \
  2763     }
  2764  
  2765     // load
  2766     row0 = vld1q_s16(data + 0*8);
  2767     row1 = vld1q_s16(data + 1*8);
  2768     row2 = vld1q_s16(data + 2*8);
  2769     row3 = vld1q_s16(data + 3*8);
  2770     row4 = vld1q_s16(data + 4*8);
  2771     row5 = vld1q_s16(data + 5*8);
  2772     row6 = vld1q_s16(data + 6*8);
  2773     row7 = vld1q_s16(data + 7*8);
  2774  
  2775     // add DC bias
  2776     row0 = vaddq_s16(row0, vsetq_lane_s16(1024, vdupq_n_s16(0), 0));
  2777  
  2778     // column pass
  2779     dct_pass(vrshrn_n_s32, 10);
  2780  
  2781     // 16bit 8x8 transpose
  2782     {
  2783  // these three map to a single VTRN.16, VTRN.32, and VSWP, respectively.
  2784  // whether compilers actually get this is another story, sadly.
  2785  #define dct_trn16(x, y) { int16x8x2_t t = vtrnq_s16(x, y); x = t.val[0]; y = t.val[1]; }
  2786  #define dct_trn32(x, y) { int32x4x2_t t = vtrnq_s32(vreinterpretq_s32_s16(x), vreinterpretq_s32_s16(y)); x = vreinterpretq_s16_s32(t.val[0]); y = vreinterpretq_s16_s32(t.val[1]); }
  2787  #define dct_trn64(x, y) { int16x8_t x0 = x; int16x8_t y0 = y; x = vcombine_s16(vget_low_s16(x0), vget_low_s16(y0)); y = vcombine_s16(vget_high_s16(x0), vget_high_s16(y0)); }
  2788  
  2789        // pass 1
  2790        dct_trn16(row0, row1); // a0b0a2b2a4b4a6b6
  2791        dct_trn16(row2, row3);
  2792        dct_trn16(row4, row5);
  2793        dct_trn16(row6, row7);
  2794  
  2795        // pass 2
  2796        dct_trn32(row0, row2); // a0b0c0d0a4b4c4d4
  2797        dct_trn32(row1, row3);
  2798        dct_trn32(row4, row6);
  2799        dct_trn32(row5, row7);
  2800  
  2801        // pass 3
  2802        dct_trn64(row0, row4); // a0b0c0d0e0f0g0h0
  2803        dct_trn64(row1, row5);
  2804        dct_trn64(row2, row6);
  2805        dct_trn64(row3, row7);
  2806  
  2807  #undef dct_trn16
  2808  #undef dct_trn32
  2809  #undef dct_trn64
  2810     }
  2811  
  2812     // row pass
  2813     // vrshrn_n_s32 only supports shifts up to 16, we need
  2814     // 17. so do a non-rounding shift of 16 first then follow
  2815     // up with a rounding shift by 1.
  2816     dct_pass(vshrn_n_s32, 16);
  2817  
  2818     {
  2819        // pack and round
  2820        uint8x8_t p0 = vqrshrun_n_s16(row0, 1);
  2821        uint8x8_t p1 = vqrshrun_n_s16(row1, 1);
  2822        uint8x8_t p2 = vqrshrun_n_s16(row2, 1);
  2823        uint8x8_t p3 = vqrshrun_n_s16(row3, 1);
  2824        uint8x8_t p4 = vqrshrun_n_s16(row4, 1);
  2825        uint8x8_t p5 = vqrshrun_n_s16(row5, 1);
  2826        uint8x8_t p6 = vqrshrun_n_s16(row6, 1);
  2827        uint8x8_t p7 = vqrshrun_n_s16(row7, 1);
  2828  
  2829        // again, these can translate into one instruction, but often don't.
  2830  #define dct_trn8_8(x, y) { uint8x8x2_t t = vtrn_u8(x, y); x = t.val[0]; y = t.val[1]; }
  2831  #define dct_trn8_16(x, y) { uint16x4x2_t t = vtrn_u16(vreinterpret_u16_u8(x), vreinterpret_u16_u8(y)); x = vreinterpret_u8_u16(t.val[0]); y = vreinterpret_u8_u16(t.val[1]); }
  2832  #define dct_trn8_32(x, y) { uint32x2x2_t t = vtrn_u32(vreinterpret_u32_u8(x), vreinterpret_u32_u8(y)); x = vreinterpret_u8_u32(t.val[0]); y = vreinterpret_u8_u32(t.val[1]); }
  2833  
  2834        // sadly can't use interleaved stores here since we only write
  2835        // 8 bytes to each scan line!
  2836  
  2837        // 8x8 8-bit transpose pass 1
  2838        dct_trn8_8(p0, p1);
  2839        dct_trn8_8(p2, p3);
  2840        dct_trn8_8(p4, p5);
  2841        dct_trn8_8(p6, p7);
  2842  
  2843        // pass 2
  2844        dct_trn8_16(p0, p2);
  2845        dct_trn8_16(p1, p3);
  2846        dct_trn8_16(p4, p6);
  2847        dct_trn8_16(p5, p7);
  2848  
  2849        // pass 3
  2850        dct_trn8_32(p0, p4);
  2851        dct_trn8_32(p1, p5);
  2852        dct_trn8_32(p2, p6);
  2853        dct_trn8_32(p3, p7);
  2854  
  2855        // store
  2856        vst1_u8(out, p0); out += out_stride;
  2857        vst1_u8(out, p1); out += out_stride;
  2858        vst1_u8(out, p2); out += out_stride;
  2859        vst1_u8(out, p3); out += out_stride;
  2860        vst1_u8(out, p4); out += out_stride;
  2861        vst1_u8(out, p5); out += out_stride;
  2862        vst1_u8(out, p6); out += out_stride;
  2863        vst1_u8(out, p7);
  2864  
  2865  #undef dct_trn8_8
  2866  #undef dct_trn8_16
  2867  #undef dct_trn8_32
  2868     }
  2869  
  2870  #undef dct_long_mul
  2871  #undef dct_long_mac
  2872  #undef dct_widen
  2873  #undef dct_wadd
  2874  #undef dct_wsub
  2875  #undef dct_bfly32o
  2876  #undef dct_pass
  2877  }
  2878  
  2879  #endif // STBI_NEON
  2880  
  2881  #define STBI__MARKER_none  0xff
  2882  // if there's a pending marker from the entropy stream, return that
  2883  // otherwise, fetch from the stream and get a marker. if there's no
  2884  // marker, return 0xff, which is never a valid marker value
  2885  static stbi_uc stbi__get_marker(stbi__jpeg *j)
  2886  {
  2887     stbi_uc x;
  2888     if (j->marker != STBI__MARKER_none) { x = j->marker; j->marker = STBI__MARKER_none; return x; }
  2889     x = stbi__get8(j->s);
  2890     if (x != 0xff) return STBI__MARKER_none;
  2891     while (x == 0xff)
  2892        x = stbi__get8(j->s); // consume repeated 0xff fill bytes
  2893     return x;
  2894  }
  2895  
  2896  // in each scan, we'll have scan_n components, and the order
  2897  // of the components is specified by order[]
  2898  #define STBI__RESTART(x)     ((x) >= 0xd0 && (x) <= 0xd7)
  2899  
  2900  // after a restart interval, stbi__jpeg_reset the entropy decoder and
  2901  // the dc prediction
  2902  static void stbi__jpeg_reset(stbi__jpeg *j)
  2903  {
  2904     j->code_bits = 0;
  2905     j->code_buffer = 0;
  2906     j->nomore = 0;
  2907     j->img_comp[0].dc_pred = j->img_comp[1].dc_pred = j->img_comp[2].dc_pred = j->img_comp[3].dc_pred = 0;
  2908     j->marker = STBI__MARKER_none;
  2909     j->todo = j->restart_interval ? j->restart_interval : 0x7fffffff;
  2910     j->eob_run = 0;
  2911     // no more than 1<<31 MCUs if no restart_interal? that's plenty safe,
  2912     // since we don't even allow 1<<30 pixels
  2913  }
  2914  
  2915  static int stbi__parse_entropy_coded_data(stbi__jpeg *z)
  2916  {
  2917     stbi__jpeg_reset(z);
  2918     if (!z->progressive) {
  2919        if (z->scan_n == 1) {
  2920           int i,j;
  2921           STBI_SIMD_ALIGN(short, data[64]);
  2922           int n = z->order[0];
  2923           // non-interleaved data, we just need to process one block at a time,
  2924           // in trivial scanline order
  2925           // number of blocks to do just depends on how many actual "pixels" this
  2926           // component has, independent of interleaved MCU blocking and such
  2927           int w = (z->img_comp[n].x+7) >> 3;
  2928           int h = (z->img_comp[n].y+7) >> 3;
  2929           for (j=0; j < h; ++j) {
  2930              for (i=0; i < w; ++i) {
  2931                 int ha = z->img_comp[n].ha;
  2932                 if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
  2933                 z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
  2934                 // every data block is an MCU, so countdown the restart interval
  2935                 if (--z->todo <= 0) {
  2936                    if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
  2937                    // if it's NOT a restart, then just bail, so we get corrupt data
  2938                    // rather than no data
  2939                    if (!STBI__RESTART(z->marker)) return 1;
  2940                    stbi__jpeg_reset(z);
  2941                 }
  2942              }
  2943           }
  2944           return 1;
  2945        } else { // interleaved
  2946           int i,j,k,x,y;
  2947           STBI_SIMD_ALIGN(short, data[64]);
  2948           for (j=0; j < z->img_mcu_y; ++j) {
  2949              for (i=0; i < z->img_mcu_x; ++i) {
  2950                 // scan an interleaved mcu... process scan_n components in order
  2951                 for (k=0; k < z->scan_n; ++k) {
  2952                    int n = z->order[k];
  2953                    // scan out an mcu's worth of this component; that's just determined
  2954                    // by the basic H and V specified for the component
  2955                    for (y=0; y < z->img_comp[n].v; ++y) {
  2956                       for (x=0; x < z->img_comp[n].h; ++x) {
  2957                          int x2 = (i*z->img_comp[n].h + x)*8;
  2958                          int y2 = (j*z->img_comp[n].v + y)*8;
  2959                          int ha = z->img_comp[n].ha;
  2960                          if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
  2961                          z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data);
  2962                       }
  2963                    }
  2964                 }
  2965                 // after all interleaved components, that's an interleaved MCU,
  2966                 // so now count down the restart interval
  2967                 if (--z->todo <= 0) {
  2968                    if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
  2969                    if (!STBI__RESTART(z->marker)) return 1;
  2970                    stbi__jpeg_reset(z);
  2971                 }
  2972              }
  2973           }
  2974           return 1;
  2975        }
  2976     } else {
  2977        if (z->scan_n == 1) {
  2978           int i,j;
  2979           int n = z->order[0];
  2980           // non-interleaved data, we just need to process one block at a time,
  2981           // in trivial scanline order
  2982           // number of blocks to do just depends on how many actual "pixels" this
  2983           // component has, independent of interleaved MCU blocking and such
  2984           int w = (z->img_comp[n].x+7) >> 3;
  2985           int h = (z->img_comp[n].y+7) >> 3;
  2986           for (j=0; j < h; ++j) {
  2987              for (i=0; i < w; ++i) {
  2988                 short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
  2989                 if (z->spec_start == 0) {
  2990                    if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
  2991                       return 0;
  2992                 } else {
  2993                    int ha = z->img_comp[n].ha;
  2994                    if (!stbi__jpeg_decode_block_prog_ac(z, data, &z->huff_ac[ha], z->fast_ac[ha]))
  2995                       return 0;
  2996                 }
  2997                 // every data block is an MCU, so countdown the restart interval
  2998                 if (--z->todo <= 0) {
  2999                    if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
  3000                    if (!STBI__RESTART(z->marker)) return 1;
  3001                    stbi__jpeg_reset(z);
  3002                 }
  3003              }
  3004           }
  3005           return 1;
  3006        } else { // interleaved
  3007           int i,j,k,x,y;
  3008           for (j=0; j < z->img_mcu_y; ++j) {
  3009              for (i=0; i < z->img_mcu_x; ++i) {
  3010                 // scan an interleaved mcu... process scan_n components in order
  3011                 for (k=0; k < z->scan_n; ++k) {
  3012                    int n = z->order[k];
  3013                    // scan out an mcu's worth of this component; that's just determined
  3014                    // by the basic H and V specified for the component
  3015                    for (y=0; y < z->img_comp[n].v; ++y) {
  3016                       for (x=0; x < z->img_comp[n].h; ++x) {
  3017                          int x2 = (i*z->img_comp[n].h + x);
  3018                          int y2 = (j*z->img_comp[n].v + y);
  3019                          short *data = z->img_comp[n].coeff + 64 * (x2 + y2 * z->img_comp[n].coeff_w);
  3020                          if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
  3021                             return 0;
  3022                       }
  3023                    }
  3024                 }
  3025                 // after all interleaved components, that's an interleaved MCU,
  3026                 // so now count down the restart interval
  3027                 if (--z->todo <= 0) {
  3028                    if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
  3029                    if (!STBI__RESTART(z->marker)) return 1;
  3030                    stbi__jpeg_reset(z);
  3031                 }
  3032              }
  3033           }
  3034           return 1;
  3035        }
  3036     }
  3037  }
  3038  
  3039  static void stbi__jpeg_dequantize(short *data, stbi__uint16 *dequant)
  3040  {
  3041     int i;
  3042     for (i=0; i < 64; ++i)
  3043        data[i] *= dequant[i];
  3044  }
  3045  
  3046  static void stbi__jpeg_finish(stbi__jpeg *z)
  3047  {
  3048     if (z->progressive) {
  3049        // dequantize and idct the data
  3050        int i,j,n;
  3051        for (n=0; n < z->s->img_n; ++n) {
  3052           int w = (z->img_comp[n].x+7) >> 3;
  3053           int h = (z->img_comp[n].y+7) >> 3;
  3054           for (j=0; j < h; ++j) {
  3055              for (i=0; i < w; ++i) {
  3056                 short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
  3057                 stbi__jpeg_dequantize(data, z->dequant[z->img_comp[n].tq]);
  3058                 z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
  3059              }
  3060           }
  3061        }
  3062     }
  3063  }
  3064  
  3065  static int stbi__process_marker(stbi__jpeg *z, int m)
  3066  {
  3067     int L;
  3068     switch (m) {
  3069        case STBI__MARKER_none: // no marker found
  3070           return stbi__err("expected marker","Corrupt JPEG");
  3071  
  3072        case 0xDD: // DRI - specify restart interval
  3073           if (stbi__get16be(z->s) != 4) return stbi__err("bad DRI len","Corrupt JPEG");
  3074           z->restart_interval = stbi__get16be(z->s);
  3075           return 1;
  3076  
  3077        case 0xDB: // DQT - define quantization table
  3078           L = stbi__get16be(z->s)-2;
  3079           while (L > 0) {
  3080              int q = stbi__get8(z->s);
  3081              int p = q >> 4, sixteen = (p != 0);
  3082              int t = q & 15,i;
  3083              if (p != 0 && p != 1) return stbi__err("bad DQT type","Corrupt JPEG");
  3084              if (t > 3) return stbi__err("bad DQT table","Corrupt JPEG");
  3085  
  3086              for (i=0; i < 64; ++i)
  3087                 z->dequant[t][stbi__jpeg_dezigzag[i]] = (stbi__uint16)(sixteen ? stbi__get16be(z->s) : stbi__get8(z->s));
  3088              L -= (sixteen ? 129 : 65);
  3089           }
  3090           return L==0;
  3091  
  3092        case 0xC4: // DHT - define huffman table
  3093           L = stbi__get16be(z->s)-2;
  3094           while (L > 0) {
  3095              stbi_uc *v;
  3096              int sizes[16],i,n=0;
  3097              int q = stbi__get8(z->s);
  3098              int tc = q >> 4;
  3099              int th = q & 15;
  3100              if (tc > 1 || th > 3) return stbi__err("bad DHT header","Corrupt JPEG");
  3101              for (i=0; i < 16; ++i) {
  3102                 sizes[i] = stbi__get8(z->s);
  3103                 n += sizes[i];
  3104              }
  3105              L -= 17;
  3106              if (tc == 0) {
  3107                 if (!stbi__build_huffman(z->huff_dc+th, sizes)) return 0;
  3108                 v = z->huff_dc[th].values;
  3109              } else {
  3110                 if (!stbi__build_huffman(z->huff_ac+th, sizes)) return 0;
  3111                 v = z->huff_ac[th].values;
  3112              }
  3113              for (i=0; i < n; ++i)
  3114                 v[i] = stbi__get8(z->s);
  3115              if (tc != 0)
  3116                 stbi__build_fast_ac(z->fast_ac[th], z->huff_ac + th);
  3117              L -= n;
  3118           }
  3119           return L==0;
  3120     }
  3121  
  3122     // check for comment block or APP blocks
  3123     if ((m >= 0xE0 && m <= 0xEF) || m == 0xFE) {
  3124        L = stbi__get16be(z->s);
  3125        if (L < 2) {
  3126           if (m == 0xFE)
  3127              return stbi__err("bad COM len","Corrupt JPEG");
  3128           else
  3129              return stbi__err("bad APP len","Corrupt JPEG");
  3130        }
  3131        L -= 2;
  3132  
  3133        if (m == 0xE0 && L >= 5) { // JFIF APP0 segment
  3134           static const unsigned char tag[5] = {'J','F','I','F','\0'};
  3135           int ok = 1;
  3136           int i;
  3137           for (i=0; i < 5; ++i)
  3138              if (stbi__get8(z->s) != tag[i])
  3139                 ok = 0;
  3140           L -= 5;
  3141           if (ok)
  3142              z->jfif = 1;
  3143        } else if (m == 0xEE && L >= 12) { // Adobe APP14 segment
  3144           static const unsigned char tag[6] = {'A','d','o','b','e','\0'};
  3145           int ok = 1;
  3146           int i;
  3147           for (i=0; i < 6; ++i)
  3148              if (stbi__get8(z->s) != tag[i])
  3149                 ok = 0;
  3150           L -= 6;
  3151           if (ok) {
  3152              stbi__get8(z->s); // version
  3153              stbi__get16be(z->s); // flags0
  3154              stbi__get16be(z->s); // flags1
  3155              z->app14_color_transform = stbi__get8(z->s); // color transform
  3156              L -= 6;
  3157           }
  3158        }
  3159  
  3160        stbi__skip(z->s, L);
  3161        return 1;
  3162     }
  3163  
  3164     return stbi__err("unknown marker","Corrupt JPEG");
  3165  }
  3166  
  3167  // after we see SOS
  3168  static int stbi__process_scan_header(stbi__jpeg *z)
  3169  {
  3170     int i;
  3171     int Ls = stbi__get16be(z->s);
  3172     z->scan_n = stbi__get8(z->s);
  3173     if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int) z->s->img_n) return stbi__err("bad SOS component count","Corrupt JPEG");
  3174     if (Ls != 6+2*z->scan_n) return stbi__err("bad SOS len","Corrupt JPEG");
  3175     for (i=0; i < z->scan_n; ++i) {
  3176        int id = stbi__get8(z->s), which;
  3177        int q = stbi__get8(z->s);
  3178        for (which = 0; which < z->s->img_n; ++which)
  3179           if (z->img_comp[which].id == id)
  3180              break;
  3181        if (which == z->s->img_n) return 0; // no match
  3182        z->img_comp[which].hd = q >> 4;   if (z->img_comp[which].hd > 3) return stbi__err("bad DC huff","Corrupt JPEG");
  3183        z->img_comp[which].ha = q & 15;   if (z->img_comp[which].ha > 3) return stbi__err("bad AC huff","Corrupt JPEG");
  3184        z->order[i] = which;
  3185     }
  3186  
  3187     {
  3188        int aa;
  3189        z->spec_start = stbi__get8(z->s);
  3190        z->spec_end   = stbi__get8(z->s); // should be 63, but might be 0
  3191        aa = stbi__get8(z->s);
  3192        z->succ_high = (aa >> 4);
  3193        z->succ_low  = (aa & 15);
  3194        if (z->progressive) {
  3195           if (z->spec_start > 63 || z->spec_end > 63  || z->spec_start > z->spec_end || z->succ_high > 13 || z->succ_low > 13)
  3196              return stbi__err("bad SOS", "Corrupt JPEG");
  3197        } else {
  3198           if (z->spec_start != 0) return stbi__err("bad SOS","Corrupt JPEG");
  3199           if (z->succ_high != 0 || z->succ_low != 0) return stbi__err("bad SOS","Corrupt JPEG");
  3200           z->spec_end = 63;
  3201        }
  3202     }
  3203  
  3204     return 1;
  3205  }
  3206  
  3207  static int stbi__free_jpeg_components(stbi__jpeg *z, int ncomp, int why)
  3208  {
  3209     int i;
  3210     for (i=0; i < ncomp; ++i) {
  3211        if (z->img_comp[i].raw_data) {
  3212           STBI_FREE(z->img_comp[i].raw_data);
  3213           z->img_comp[i].raw_data = NULL;
  3214           z->img_comp[i].data = NULL;
  3215        }
  3216        if (z->img_comp[i].raw_coeff) {
  3217           STBI_FREE(z->img_comp[i].raw_coeff);
  3218           z->img_comp[i].raw_coeff = 0;
  3219           z->img_comp[i].coeff = 0;
  3220        }
  3221        if (z->img_comp[i].linebuf) {
  3222           STBI_FREE(z->img_comp[i].linebuf);
  3223           z->img_comp[i].linebuf = NULL;
  3224        }
  3225     }
  3226     return why;
  3227  }
  3228  
  3229  static int stbi__process_frame_header(stbi__jpeg *z, int scan)
  3230  {
  3231     stbi__context *s = z->s;
  3232     int Lf,p,i,q, h_max=1,v_max=1,c;
  3233     Lf = stbi__get16be(s);         if (Lf < 11) return stbi__err("bad SOF len","Corrupt JPEG"); // JPEG
  3234     p  = stbi__get8(s);            if (p != 8) return stbi__err("only 8-bit","JPEG format not supported: 8-bit only"); // JPEG baseline
  3235     s->img_y = stbi__get16be(s);   if (s->img_y == 0) return stbi__err("no header height", "JPEG format not supported: delayed height"); // Legal, but we don't handle it--but neither does IJG
  3236     s->img_x = stbi__get16be(s);   if (s->img_x == 0) return stbi__err("0 width","Corrupt JPEG"); // JPEG requires
  3237     if (s->img_y > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
  3238     if (s->img_x > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
  3239     c = stbi__get8(s);
  3240     if (c != 3 && c != 1 && c != 4) return stbi__err("bad component count","Corrupt JPEG");
  3241     s->img_n = c;
  3242     for (i=0; i < c; ++i) {
  3243        z->img_comp[i].data = NULL;
  3244        z->img_comp[i].linebuf = NULL;
  3245     }
  3246  
  3247     if (Lf != 8+3*s->img_n) return stbi__err("bad SOF len","Corrupt JPEG");
  3248  
  3249     z->rgb = 0;
  3250     for (i=0; i < s->img_n; ++i) {
  3251        static const unsigned char rgb[3] = { 'R', 'G', 'B' };
  3252        z->img_comp[i].id = stbi__get8(s);
  3253        if (s->img_n == 3 && z->img_comp[i].id == rgb[i])
  3254           ++z->rgb;
  3255        q = stbi__get8(s);
  3256        z->img_comp[i].h = (q >> 4);  if (!z->img_comp[i].h || z->img_comp[i].h > 4) return stbi__err("bad H","Corrupt JPEG");
  3257        z->img_comp[i].v = q & 15;    if (!z->img_comp[i].v || z->img_comp[i].v > 4) return stbi__err("bad V","Corrupt JPEG");
  3258        z->img_comp[i].tq = stbi__get8(s);  if (z->img_comp[i].tq > 3) return stbi__err("bad TQ","Corrupt JPEG");
  3259     }
  3260  
  3261     if (scan != STBI__SCAN_load) return 1;
  3262  
  3263     if (!stbi__mad3sizes_valid(s->img_x, s->img_y, s->img_n, 0)) return stbi__err("too large", "Image too large to decode");
  3264  
  3265     for (i=0; i < s->img_n; ++i) {
  3266        if (z->img_comp[i].h > h_max) h_max = z->img_comp[i].h;
  3267        if (z->img_comp[i].v > v_max) v_max = z->img_comp[i].v;
  3268     }
  3269  
  3270     // check that plane subsampling factors are integer ratios; our resamplers can't deal with fractional ratios
  3271     // and I've never seen a non-corrupted JPEG file actually use them
  3272     for (i=0; i < s->img_n; ++i) {
  3273        if (h_max % z->img_comp[i].h != 0) return stbi__err("bad H","Corrupt JPEG");
  3274        if (v_max % z->img_comp[i].v != 0) return stbi__err("bad V","Corrupt JPEG");
  3275     }
  3276  
  3277     // compute interleaved mcu info
  3278     z->img_h_max = h_max;
  3279     z->img_v_max = v_max;
  3280     z->img_mcu_w = h_max * 8;
  3281     z->img_mcu_h = v_max * 8;
  3282     // these sizes can't be more than 17 bits
  3283     z->img_mcu_x = (s->img_x + z->img_mcu_w-1) / z->img_mcu_w;
  3284     z->img_mcu_y = (s->img_y + z->img_mcu_h-1) / z->img_mcu_h;
  3285  
  3286     for (i=0; i < s->img_n; ++i) {
  3287        // number of effective pixels (e.g. for non-interleaved MCU)
  3288        z->img_comp[i].x = (s->img_x * z->img_comp[i].h + h_max-1) / h_max;
  3289        z->img_comp[i].y = (s->img_y * z->img_comp[i].v + v_max-1) / v_max;
  3290        // to simplify generation, we'll allocate enough memory to decode
  3291        // the bogus oversized data from using interleaved MCUs and their
  3292        // big blocks (e.g. a 16x16 iMCU on an image of width 33); we won't
  3293        // discard the extra data until colorspace conversion
  3294        //
  3295        // img_mcu_x, img_mcu_y: <=17 bits; comp[i].h and .v are <=4 (checked earlier)
  3296        // so these muls can't overflow with 32-bit ints (which we require)
  3297        z->img_comp[i].w2 = z->img_mcu_x * z->img_comp[i].h * 8;
  3298        z->img_comp[i].h2 = z->img_mcu_y * z->img_comp[i].v * 8;
  3299        z->img_comp[i].coeff = 0;
  3300        z->img_comp[i].raw_coeff = 0;
  3301        z->img_comp[i].linebuf = NULL;
  3302        z->img_comp[i].raw_data = stbi__malloc_mad2(z->img_comp[i].w2, z->img_comp[i].h2, 15);
  3303        if (z->img_comp[i].raw_data == NULL)
  3304           return stbi__free_jpeg_components(z, i+1, stbi__err("outofmem", "Out of memory"));
  3305        // align blocks for idct using mmx/sse
  3306        z->img_comp[i].data = (stbi_uc*) (((size_t) z->img_comp[i].raw_data + 15) & ~15);
  3307        if (z->progressive) {
  3308           // w2, h2 are multiples of 8 (see above)
  3309           z->img_comp[i].coeff_w = z->img_comp[i].w2 / 8;
  3310           z->img_comp[i].coeff_h = z->img_comp[i].h2 / 8;
  3311           z->img_comp[i].raw_coeff = stbi__malloc_mad3(z->img_comp[i].w2, z->img_comp[i].h2, sizeof(short), 15);
  3312           if (z->img_comp[i].raw_coeff == NULL)
  3313              return stbi__free_jpeg_components(z, i+1, stbi__err("outofmem", "Out of memory"));
  3314           z->img_comp[i].coeff = (short*) (((size_t) z->img_comp[i].raw_coeff + 15) & ~15);
  3315        }
  3316     }
  3317  
  3318     return 1;
  3319  }
  3320  
  3321  // use comparisons since in some cases we handle more than one case (e.g. SOF)
  3322  #define stbi__DNL(x)         ((x) == 0xdc)
  3323  #define stbi__SOI(x)         ((x) == 0xd8)
  3324  #define stbi__EOI(x)         ((x) == 0xd9)
  3325  #define stbi__SOF(x)         ((x) == 0xc0 || (x) == 0xc1 || (x) == 0xc2)
  3326  #define stbi__SOS(x)         ((x) == 0xda)
  3327  
  3328  #define stbi__SOF_progressive(x)   ((x) == 0xc2)
  3329  
  3330  static int stbi__decode_jpeg_header(stbi__jpeg *z, int scan)
  3331  {
  3332     int m;
  3333     z->jfif = 0;
  3334     z->app14_color_transform = -1; // valid values are 0,1,2
  3335     z->marker = STBI__MARKER_none; // initialize cached marker to empty
  3336     m = stbi__get_marker(z);
  3337     if (!stbi__SOI(m)) return stbi__err("no SOI","Corrupt JPEG");
  3338     if (scan == STBI__SCAN_type) return 1;
  3339     m = stbi__get_marker(z);
  3340     while (!stbi__SOF(m)) {
  3341        if (!stbi__process_marker(z,m)) return 0;
  3342        m = stbi__get_marker(z);
  3343        while (m == STBI__MARKER_none) {
  3344           // some files have extra padding after their blocks, so ok, we'll scan
  3345           if (stbi__at_eof(z->s)) return stbi__err("no SOF", "Corrupt JPEG");
  3346           m = stbi__get_marker(z);
  3347        }
  3348     }
  3349     z->progressive = stbi__SOF_progressive(m);
  3350     if (!stbi__process_frame_header(z, scan)) return 0;
  3351     return 1;
  3352  }
  3353  
  3354  // decode image to YCbCr format
  3355  static int stbi__decode_jpeg_image(stbi__jpeg *j)
  3356  {
  3357     int m;
  3358     for (m = 0; m < 4; m++) {
  3359        j->img_comp[m].raw_data = NULL;
  3360        j->img_comp[m].raw_coeff = NULL;
  3361     }
  3362     j->restart_interval = 0;
  3363     if (!stbi__decode_jpeg_header(j, STBI__SCAN_load)) return 0;
  3364     m = stbi__get_marker(j);
  3365     while (!stbi__EOI(m)) {
  3366        if (stbi__SOS(m)) {
  3367           if (!stbi__process_scan_header(j)) return 0;
  3368           if (!stbi__parse_entropy_coded_data(j)) return 0;
  3369           if (j->marker == STBI__MARKER_none ) {
  3370              // handle 0s at the end of image data from IP Kamera 9060
  3371              while (!stbi__at_eof(j->s)) {
  3372                 int x = stbi__get8(j->s);
  3373                 if (x == 255) {
  3374                    j->marker = stbi__get8(j->s);
  3375                    break;
  3376                 }
  3377              }
  3378              // if we reach eof without hitting a marker, stbi__get_marker() below will fail and we'll eventually return 0
  3379           }
  3380        } else if (stbi__DNL(m)) {
  3381           int Ld = stbi__get16be(j->s);
  3382           stbi__uint32 NL = stbi__get16be(j->s);
  3383           if (Ld != 4) return stbi__err("bad DNL len", "Corrupt JPEG");
  3384           if (NL != j->s->img_y) return stbi__err("bad DNL height", "Corrupt JPEG");
  3385        } else {
  3386           if (!stbi__process_marker(j, m)) return 0;
  3387        }
  3388        m = stbi__get_marker(j);
  3389     }
  3390     if (j->progressive)
  3391        stbi__jpeg_finish(j);
  3392     return 1;
  3393  }
  3394  
  3395  // static jfif-centered resampling (across block boundaries)
  3396  
  3397  typedef stbi_uc *(*resample_row_func)(stbi_uc *out, stbi_uc *in0, stbi_uc *in1,
  3398                                      int w, int hs);
  3399  
  3400  #define stbi__div4(x) ((stbi_uc) ((x) >> 2))
  3401  
  3402  static stbi_uc *resample_row_1(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
  3403  {
  3404     STBI_NOTUSED(out);
  3405     STBI_NOTUSED(in_far);
  3406     STBI_NOTUSED(w);
  3407     STBI_NOTUSED(hs);
  3408     return in_near;
  3409  }
  3410  
  3411  static stbi_uc* stbi__resample_row_v_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
  3412  {
  3413     // need to generate two samples vertically for every one in input
  3414     int i;
  3415     STBI_NOTUSED(hs);
  3416     for (i=0; i < w; ++i)
  3417        out[i] = stbi__div4(3*in_near[i] + in_far[i] + 2);
  3418     return out;
  3419  }
  3420  
  3421  static stbi_uc*  stbi__resample_row_h_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
  3422  {
  3423     // need to generate two samples horizontally for every one in input
  3424     int i;
  3425     stbi_uc *input = in_near;
  3426  
  3427     if (w == 1) {
  3428        // if only one sample, can't do any interpolation
  3429        out[0] = out[1] = input[0];
  3430        return out;
  3431     }
  3432  
  3433     out[0] = input[0];
  3434     out[1] = stbi__div4(input[0]*3 + input[1] + 2);
  3435     for (i=1; i < w-1; ++i) {
  3436        int n = 3*input[i]+2;
  3437        out[i*2+0] = stbi__div4(n+input[i-1]);
  3438        out[i*2+1] = stbi__div4(n+input[i+1]);
  3439     }
  3440     out[i*2+0] = stbi__div4(input[w-2]*3 + input[w-1] + 2);
  3441     out[i*2+1] = input[w-1];
  3442  
  3443     STBI_NOTUSED(in_far);
  3444     STBI_NOTUSED(hs);
  3445  
  3446     return out;
  3447  }
  3448  
  3449  #define stbi__div16(x) ((stbi_uc) ((x) >> 4))
  3450  
  3451  static stbi_uc *stbi__resample_row_hv_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
  3452  {
  3453     // need to generate 2x2 samples for every one in input
  3454     int i,t0,t1;
  3455     if (w == 1) {
  3456        out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
  3457        return out;
  3458     }
  3459  
  3460     t1 = 3*in_near[0] + in_far[0];
  3461     out[0] = stbi__div4(t1+2);
  3462     for (i=1; i < w; ++i) {
  3463        t0 = t1;
  3464        t1 = 3*in_near[i]+in_far[i];
  3465        out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
  3466        out[i*2  ] = stbi__div16(3*t1 + t0 + 8);
  3467     }
  3468     out[w*2-1] = stbi__div4(t1+2);
  3469  
  3470     STBI_NOTUSED(hs);
  3471  
  3472     return out;
  3473  }
  3474  
  3475  #if defined(STBI_SSE2) || defined(STBI_NEON)
  3476  static stbi_uc *stbi__resample_row_hv_2_simd(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
  3477  {
  3478     // need to generate 2x2 samples for every one in input
  3479     int i=0,t0,t1;
  3480  
  3481     if (w == 1) {
  3482        out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
  3483        return out;
  3484     }
  3485  
  3486     t1 = 3*in_near[0] + in_far[0];
  3487     // process groups of 8 pixels for as long as we can.
  3488     // note we can't handle the last pixel in a row in this loop
  3489     // because we need to handle the filter boundary conditions.
  3490     for (; i < ((w-1) & ~7); i += 8) {
  3491  #if defined(STBI_SSE2)
  3492        // load and perform the vertical filtering pass
  3493        // this uses 3*x + y = 4*x + (y - x)
  3494        __m128i zero  = _mm_setzero_si128();
  3495        __m128i farb  = _mm_loadl_epi64((__m128i *) (in_far + i));
  3496        __m128i nearb = _mm_loadl_epi64((__m128i *) (in_near + i));
  3497        __m128i farw  = _mm_unpacklo_epi8(farb, zero);
  3498        __m128i nearw = _mm_unpacklo_epi8(nearb, zero);
  3499        __m128i diff  = _mm_sub_epi16(farw, nearw);
  3500        __m128i nears = _mm_slli_epi16(nearw, 2);
  3501        __m128i curr  = _mm_add_epi16(nears, diff); // current row
  3502  
  3503        // horizontal filter works the same based on shifted vers of current
  3504        // row. "prev" is current row shifted right by 1 pixel; we need to
  3505        // insert the previous pixel value (from t1).
  3506        // "next" is current row shifted left by 1 pixel, with first pixel
  3507        // of next block of 8 pixels added in.
  3508        __m128i prv0 = _mm_slli_si128(curr, 2);
  3509        __m128i nxt0 = _mm_srli_si128(curr, 2);
  3510        __m128i prev = _mm_insert_epi16(prv0, t1, 0);
  3511        __m128i next = _mm_insert_epi16(nxt0, 3*in_near[i+8] + in_far[i+8], 7);
  3512  
  3513        // horizontal filter, polyphase implementation since it's convenient:
  3514        // even pixels = 3*cur + prev = cur*4 + (prev - cur)
  3515        // odd  pixels = 3*cur + next = cur*4 + (next - cur)
  3516        // note the shared term.
  3517        __m128i bias  = _mm_set1_epi16(8);
  3518        __m128i curs = _mm_slli_epi16(curr, 2);
  3519        __m128i prvd = _mm_sub_epi16(prev, curr);
  3520        __m128i nxtd = _mm_sub_epi16(next, curr);
  3521        __m128i curb = _mm_add_epi16(curs, bias);
  3522        __m128i even = _mm_add_epi16(prvd, curb);
  3523        __m128i odd  = _mm_add_epi16(nxtd, curb);
  3524  
  3525        // interleave even and odd pixels, then undo scaling.
  3526        __m128i int0 = _mm_unpacklo_epi16(even, odd);
  3527        __m128i int1 = _mm_unpackhi_epi16(even, odd);
  3528        __m128i de0  = _mm_srli_epi16(int0, 4);
  3529        __m128i de1  = _mm_srli_epi16(int1, 4);
  3530  
  3531        // pack and write output
  3532        __m128i outv = _mm_packus_epi16(de0, de1);
  3533        _mm_storeu_si128((__m128i *) (out + i*2), outv);
  3534  #elif defined(STBI_NEON)
  3535        // load and perform the vertical filtering pass
  3536        // this uses 3*x + y = 4*x + (y - x)
  3537        uint8x8_t farb  = vld1_u8(in_far + i);
  3538        uint8x8_t nearb = vld1_u8(in_near + i);
  3539        int16x8_t diff  = vreinterpretq_s16_u16(vsubl_u8(farb, nearb));
  3540        int16x8_t nears = vreinterpretq_s16_u16(vshll_n_u8(nearb, 2));
  3541        int16x8_t curr  = vaddq_s16(nears, diff); // current row
  3542  
  3543        // horizontal filter works the same based on shifted vers of current
  3544        // row. "prev" is current row shifted right by 1 pixel; we need to
  3545        // insert the previous pixel value (from t1).
  3546        // "next" is current row shifted left by 1 pixel, with first pixel
  3547        // of next block of 8 pixels added in.
  3548        int16x8_t prv0 = vextq_s16(curr, curr, 7);
  3549        int16x8_t nxt0 = vextq_s16(curr, curr, 1);
  3550        int16x8_t prev = vsetq_lane_s16(t1, prv0, 0);
  3551        int16x8_t next = vsetq_lane_s16(3*in_near[i+8] + in_far[i+8], nxt0, 7);
  3552  
  3553        // horizontal filter, polyphase implementation since it's convenient:
  3554        // even pixels = 3*cur + prev = cur*4 + (prev - cur)
  3555        // odd  pixels = 3*cur + next = cur*4 + (next - cur)
  3556        // note the shared term.
  3557        int16x8_t curs = vshlq_n_s16(curr, 2);
  3558        int16x8_t prvd = vsubq_s16(prev, curr);
  3559        int16x8_t nxtd = vsubq_s16(next, curr);
  3560        int16x8_t even = vaddq_s16(curs, prvd);
  3561        int16x8_t odd  = vaddq_s16(curs, nxtd);
  3562  
  3563        // undo scaling and round, then store with even/odd phases interleaved
  3564        uint8x8x2_t o;
  3565        o.val[0] = vqrshrun_n_s16(even, 4);
  3566        o.val[1] = vqrshrun_n_s16(odd,  4);
  3567        vst2_u8(out + i*2, o);
  3568  #endif
  3569  
  3570        // "previous" value for next iter
  3571        t1 = 3*in_near[i+7] + in_far[i+7];
  3572     }
  3573  
  3574     t0 = t1;
  3575     t1 = 3*in_near[i] + in_far[i];
  3576     out[i*2] = stbi__div16(3*t1 + t0 + 8);
  3577  
  3578     for (++i; i < w; ++i) {
  3579        t0 = t1;
  3580        t1 = 3*in_near[i]+in_far[i];
  3581        out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
  3582        out[i*2  ] = stbi__div16(3*t1 + t0 + 8);
  3583     }
  3584     out[w*2-1] = stbi__div4(t1+2);
  3585  
  3586     STBI_NOTUSED(hs);
  3587  
  3588     return out;
  3589  }
  3590  #endif
  3591  
  3592  static stbi_uc *stbi__resample_row_generic(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
  3593  {
  3594     // resample with nearest-neighbor
  3595     int i,j;
  3596     STBI_NOTUSED(in_far);
  3597     for (i=0; i < w; ++i)
  3598        for (j=0; j < hs; ++j)
  3599           out[i*hs+j] = in_near[i];
  3600     return out;
  3601  }
  3602  
  3603  // this is a reduced-precision calculation of YCbCr-to-RGB introduced
  3604  // to make sure the code produces the same results in both SIMD and scalar
  3605  #define stbi__float2fixed(x)  (((int) ((x) * 4096.0f + 0.5f)) << 8)
  3606  static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
  3607  {
  3608     int i;
  3609     for (i=0; i < count; ++i) {
  3610        int y_fixed = (y[i] << 20) + (1<<19); // rounding
  3611        int r,g,b;
  3612        int cr = pcr[i] - 128;
  3613        int cb = pcb[i] - 128;
  3614        r = y_fixed +  cr* stbi__float2fixed(1.40200f);
  3615        g = y_fixed + (cr*-stbi__float2fixed(0.71414f)) + ((cb*-stbi__float2fixed(0.34414f)) & 0xffff0000);
  3616        b = y_fixed                                     +   cb* stbi__float2fixed(1.77200f);
  3617        r >>= 20;
  3618        g >>= 20;
  3619        b >>= 20;
  3620        if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
  3621        if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
  3622        if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
  3623        out[0] = (stbi_uc)r;
  3624        out[1] = (stbi_uc)g;
  3625        out[2] = (stbi_uc)b;
  3626        out[3] = 255;
  3627        out += step;
  3628     }
  3629  }
  3630  
  3631  #if defined(STBI_SSE2) || defined(STBI_NEON)
  3632  static void stbi__YCbCr_to_RGB_simd(stbi_uc *out, stbi_uc const *y, stbi_uc const *pcb, stbi_uc const *pcr, int count, int step)
  3633  {
  3634     int i = 0;
  3635  
  3636  #ifdef STBI_SSE2
  3637     // step == 3 is pretty ugly on the final interleave, and i'm not convinced
  3638     // it's useful in practice (you wouldn't use it for textures, for example).
  3639     // so just accelerate step == 4 case.
  3640     if (step == 4) {
  3641        // this is a fairly straightforward implementation and not super-optimized.
  3642        __m128i signflip  = _mm_set1_epi8(-0x80);
  3643        __m128i cr_const0 = _mm_set1_epi16(   (short) ( 1.40200f*4096.0f+0.5f));
  3644        __m128i cr_const1 = _mm_set1_epi16( - (short) ( 0.71414f*4096.0f+0.5f));
  3645        __m128i cb_const0 = _mm_set1_epi16( - (short) ( 0.34414f*4096.0f+0.5f));
  3646        __m128i cb_const1 = _mm_set1_epi16(   (short) ( 1.77200f*4096.0f+0.5f));
  3647        __m128i y_bias = _mm_set1_epi8((char) (unsigned char) 128);
  3648        __m128i xw = _mm_set1_epi16(255); // alpha channel
  3649  
  3650        for (; i+7 < count; i += 8) {
  3651           // load
  3652           __m128i y_bytes = _mm_loadl_epi64((__m128i *) (y+i));
  3653           __m128i cr_bytes = _mm_loadl_epi64((__m128i *) (pcr+i));
  3654           __m128i cb_bytes = _mm_loadl_epi64((__m128i *) (pcb+i));
  3655           __m128i cr_biased = _mm_xor_si128(cr_bytes, signflip); // -128
  3656           __m128i cb_biased = _mm_xor_si128(cb_bytes, signflip); // -128
  3657  
  3658           // unpack to short (and left-shift cr, cb by 8)
  3659           __m128i yw  = _mm_unpacklo_epi8(y_bias, y_bytes);
  3660           __m128i crw = _mm_unpacklo_epi8(_mm_setzero_si128(), cr_biased);
  3661           __m128i cbw = _mm_unpacklo_epi8(_mm_setzero_si128(), cb_biased);
  3662  
  3663           // color transform
  3664           __m128i yws = _mm_srli_epi16(yw, 4);
  3665           __m128i cr0 = _mm_mulhi_epi16(cr_const0, crw);
  3666           __m128i cb0 = _mm_mulhi_epi16(cb_const0, cbw);
  3667           __m128i cb1 = _mm_mulhi_epi16(cbw, cb_const1);
  3668           __m128i cr1 = _mm_mulhi_epi16(crw, cr_const1);
  3669           __m128i rws = _mm_add_epi16(cr0, yws);
  3670           __m128i gwt = _mm_add_epi16(cb0, yws);
  3671           __m128i bws = _mm_add_epi16(yws, cb1);
  3672           __m128i gws = _mm_add_epi16(gwt, cr1);
  3673  
  3674           // descale
  3675           __m128i rw = _mm_srai_epi16(rws, 4);
  3676           __m128i bw = _mm_srai_epi16(bws, 4);
  3677           __m128i gw = _mm_srai_epi16(gws, 4);
  3678  
  3679           // back to byte, set up for transpose
  3680           __m128i brb = _mm_packus_epi16(rw, bw);
  3681           __m128i gxb = _mm_packus_epi16(gw, xw);
  3682  
  3683           // transpose to interleave channels
  3684           __m128i t0 = _mm_unpacklo_epi8(brb, gxb);
  3685           __m128i t1 = _mm_unpackhi_epi8(brb, gxb);
  3686           __m128i o0 = _mm_unpacklo_epi16(t0, t1);
  3687           __m128i o1 = _mm_unpackhi_epi16(t0, t1);
  3688  
  3689           // store
  3690           _mm_storeu_si128((__m128i *) (out + 0), o0);
  3691           _mm_storeu_si128((__m128i *) (out + 16), o1);
  3692           out += 32;
  3693        }
  3694     }
  3695  #endif
  3696  
  3697  #ifdef STBI_NEON
  3698     // in this version, step=3 support would be easy to add. but is there demand?
  3699     if (step == 4) {
  3700        // this is a fairly straightforward implementation and not super-optimized.
  3701        uint8x8_t signflip = vdup_n_u8(0x80);
  3702        int16x8_t cr_const0 = vdupq_n_s16(   (short) ( 1.40200f*4096.0f+0.5f));
  3703        int16x8_t cr_const1 = vdupq_n_s16( - (short) ( 0.71414f*4096.0f+0.5f));
  3704        int16x8_t cb_const0 = vdupq_n_s16( - (short) ( 0.34414f*4096.0f+0.5f));
  3705        int16x8_t cb_const1 = vdupq_n_s16(   (short) ( 1.77200f*4096.0f+0.5f));
  3706  
  3707        for (; i+7 < count; i += 8) {
  3708           // load
  3709           uint8x8_t y_bytes  = vld1_u8(y + i);
  3710           uint8x8_t cr_bytes = vld1_u8(pcr + i);
  3711           uint8x8_t cb_bytes = vld1_u8(pcb + i);
  3712           int8x8_t cr_biased = vreinterpret_s8_u8(vsub_u8(cr_bytes, signflip));
  3713           int8x8_t cb_biased = vreinterpret_s8_u8(vsub_u8(cb_bytes, signflip));
  3714  
  3715           // expand to s16
  3716           int16x8_t yws = vreinterpretq_s16_u16(vshll_n_u8(y_bytes, 4));
  3717           int16x8_t crw = vshll_n_s8(cr_biased, 7);
  3718           int16x8_t cbw = vshll_n_s8(cb_biased, 7);
  3719  
  3720           // color transform
  3721           int16x8_t cr0 = vqdmulhq_s16(crw, cr_const0);
  3722           int16x8_t cb0 = vqdmulhq_s16(cbw, cb_const0);
  3723           int16x8_t cr1 = vqdmulhq_s16(crw, cr_const1);
  3724           int16x8_t cb1 = vqdmulhq_s16(cbw, cb_const1);
  3725           int16x8_t rws = vaddq_s16(yws, cr0);
  3726           int16x8_t gws = vaddq_s16(vaddq_s16(yws, cb0), cr1);
  3727           int16x8_t bws = vaddq_s16(yws, cb1);
  3728  
  3729           // undo scaling, round, convert to byte
  3730           uint8x8x4_t o;
  3731           o.val[0] = vqrshrun_n_s16(rws, 4);
  3732           o.val[1] = vqrshrun_n_s16(gws, 4);
  3733           o.val[2] = vqrshrun_n_s16(bws, 4);
  3734           o.val[3] = vdup_n_u8(255);
  3735  
  3736           // store, interleaving r/g/b/a
  3737           vst4_u8(out, o);
  3738           out += 8*4;
  3739        }
  3740     }
  3741  #endif
  3742  
  3743     for (; i < count; ++i) {
  3744        int y_fixed = (y[i] << 20) + (1<<19); // rounding
  3745        int r,g,b;
  3746        int cr = pcr[i] - 128;
  3747        int cb = pcb[i] - 128;
  3748        r = y_fixed + cr* stbi__float2fixed(1.40200f);
  3749        g = y_fixed + cr*-stbi__float2fixed(0.71414f) + ((cb*-stbi__float2fixed(0.34414f)) & 0xffff0000);
  3750        b = y_fixed                                   +   cb* stbi__float2fixed(1.77200f);
  3751        r >>= 20;
  3752        g >>= 20;
  3753        b >>= 20;
  3754        if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
  3755        if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
  3756        if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
  3757        out[0] = (stbi_uc)r;
  3758        out[1] = (stbi_uc)g;
  3759        out[2] = (stbi_uc)b;
  3760        out[3] = 255;
  3761        out += step;
  3762     }
  3763  }
  3764  #endif
  3765  
  3766  // set up the kernels
  3767  static void stbi__setup_jpeg(stbi__jpeg *j)
  3768  {
  3769     j->idct_block_kernel = stbi__idct_block;
  3770     j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_row;
  3771     j->resample_row_hv_2_kernel = stbi__resample_row_hv_2;
  3772  
  3773  #ifdef STBI_SSE2
  3774     if (stbi__sse2_available()) {
  3775        j->idct_block_kernel = stbi__idct_simd;
  3776        j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
  3777        j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
  3778     }
  3779  #endif
  3780  
  3781  #ifdef STBI_NEON
  3782     j->idct_block_kernel = stbi__idct_simd;
  3783     j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
  3784     j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
  3785  #endif
  3786  }
  3787  
  3788  // clean up the temporary component buffers
  3789  static void stbi__cleanup_jpeg(stbi__jpeg *j)
  3790  {
  3791     stbi__free_jpeg_components(j, j->s->img_n, 0);
  3792  }
  3793  
  3794  typedef struct
  3795  {
  3796     resample_row_func resample;
  3797     stbi_uc *line0,*line1;
  3798     int hs,vs;   // expansion factor in each axis
  3799     int w_lores; // horizontal pixels pre-expansion
  3800     int ystep;   // how far through vertical expansion we are
  3801     int ypos;    // which pre-expansion row we're on
  3802  } stbi__resample;
  3803  
  3804  // fast 0..255 * 0..255 => 0..255 rounded multiplication
  3805  static stbi_uc stbi__blinn_8x8(stbi_uc x, stbi_uc y)
  3806  {
  3807     unsigned int t = x*y + 128;
  3808     return (stbi_uc) ((t + (t >>8)) >> 8);
  3809  }
  3810  
  3811  static stbi_uc *load_jpeg_image(stbi__jpeg *z, int *out_x, int *out_y, int *comp, int req_comp)
  3812  {
  3813     int n, decode_n, is_rgb;
  3814     z->s->img_n = 0; // make stbi__cleanup_jpeg safe
  3815  
  3816     // validate req_comp
  3817     if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
  3818  
  3819     // load a jpeg image from whichever source, but leave in YCbCr format
  3820     if (!stbi__decode_jpeg_image(z)) { stbi__cleanup_jpeg(z); return NULL; }
  3821  
  3822     // determine actual number of components to generate
  3823     n = req_comp ? req_comp : z->s->img_n >= 3 ? 3 : 1;
  3824  
  3825     is_rgb = z->s->img_n == 3 && (z->rgb == 3 || (z->app14_color_transform == 0 && !z->jfif));
  3826  
  3827     if (z->s->img_n == 3 && n < 3 && !is_rgb)
  3828        decode_n = 1;
  3829     else
  3830        decode_n = z->s->img_n;
  3831  
  3832     // nothing to do if no components requested; check this now to avoid
  3833     // accessing uninitialized coutput[0] later
  3834     if (decode_n <= 0) { stbi__cleanup_jpeg(z); return NULL; }
  3835  
  3836     // resample and color-convert
  3837     {
  3838        int k;
  3839        unsigned int i,j;
  3840        stbi_uc *output;
  3841        stbi_uc *coutput[4] = { NULL, NULL, NULL, NULL };
  3842  
  3843        stbi__resample res_comp[4];
  3844  
  3845        for (k=0; k < decode_n; ++k) {
  3846           stbi__resample *r = &res_comp[k];
  3847  
  3848           // allocate line buffer big enough for upsampling off the edges
  3849           // with upsample factor of 4
  3850           z->img_comp[k].linebuf = (stbi_uc *) stbi__malloc(z->s->img_x + 3);
  3851           if (!z->img_comp[k].linebuf) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
  3852  
  3853           r->hs      = z->img_h_max / z->img_comp[k].h;
  3854           r->vs      = z->img_v_max / z->img_comp[k].v;
  3855           r->ystep   = r->vs >> 1;
  3856           r->w_lores = (z->s->img_x + r->hs-1) / r->hs;
  3857           r->ypos    = 0;
  3858           r->line0   = r->line1 = z->img_comp[k].data;
  3859  
  3860           if      (r->hs == 1 && r->vs == 1) r->resample = resample_row_1;
  3861           else if (r->hs == 1 && r->vs == 2) r->resample = stbi__resample_row_v_2;
  3862           else if (r->hs == 2 && r->vs == 1) r->resample = stbi__resample_row_h_2;
  3863           else if (r->hs == 2 && r->vs == 2) r->resample = z->resample_row_hv_2_kernel;
  3864           else                               r->resample = stbi__resample_row_generic;
  3865        }
  3866  
  3867        // can't error after this so, this is safe
  3868        output = (stbi_uc *) stbi__malloc_mad3(n, z->s->img_x, z->s->img_y, 1);
  3869        if (!output) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
  3870  
  3871        // now go ahead and resample
  3872        for (j=0; j < z->s->img_y; ++j) {
  3873           stbi_uc *out = output + n * z->s->img_x * j;
  3874           for (k=0; k < decode_n; ++k) {
  3875              stbi__resample *r = &res_comp[k];
  3876              int y_bot = r->ystep >= (r->vs >> 1);
  3877              coutput[k] = r->resample(z->img_comp[k].linebuf,
  3878                                       y_bot ? r->line1 : r->line0,
  3879                                       y_bot ? r->line0 : r->line1,
  3880                                       r->w_lores, r->hs);
  3881              if (++r->ystep >= r->vs) {
  3882                 r->ystep = 0;
  3883                 r->line0 = r->line1;
  3884                 if (++r->ypos < z->img_comp[k].y)
  3885                    r->line1 += z->img_comp[k].w2;
  3886              }
  3887           }
  3888           if (n >= 3) {
  3889              stbi_uc *y = coutput[0];
  3890              if (z->s->img_n == 3) {
  3891                 if (is_rgb) {
  3892                    for (i=0; i < z->s->img_x; ++i) {
  3893                       out[0] = y[i];
  3894                       out[1] = coutput[1][i];
  3895                       out[2] = coutput[2][i];
  3896                       out[3] = 255;
  3897                       out += n;
  3898                    }
  3899                 } else {
  3900                    z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
  3901                 }
  3902              } else if (z->s->img_n == 4) {
  3903                 if (z->app14_color_transform == 0) { // CMYK
  3904                    for (i=0; i < z->s->img_x; ++i) {
  3905                       stbi_uc m = coutput[3][i];
  3906                       out[0] = stbi__blinn_8x8(coutput[0][i], m);
  3907                       out[1] = stbi__blinn_8x8(coutput[1][i], m);
  3908                       out[2] = stbi__blinn_8x8(coutput[2][i], m);
  3909                       out[3] = 255;
  3910                       out += n;
  3911                    }
  3912                 } else if (z->app14_color_transform == 2) { // YCCK
  3913                    z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
  3914                    for (i=0; i < z->s->img_x; ++i) {
  3915                       stbi_uc m = coutput[3][i];
  3916                       out[0] = stbi__blinn_8x8(255 - out[0], m);
  3917                       out[1] = stbi__blinn_8x8(255 - out[1], m);
  3918                       out[2] = stbi__blinn_8x8(255 - out[2], m);
  3919                       out += n;
  3920                    }
  3921                 } else { // YCbCr + alpha?  Ignore the fourth channel for now
  3922                    z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
  3923                 }
  3924              } else
  3925                 for (i=0; i < z->s->img_x; ++i) {
  3926                    out[0] = out[1] = out[2] = y[i];
  3927                    out[3] = 255; // not used if n==3
  3928                    out += n;
  3929                 }
  3930           } else {
  3931              if (is_rgb) {
  3932                 if (n == 1)
  3933                    for (i=0; i < z->s->img_x; ++i)
  3934                       *out++ = stbi__compute_y(coutput[0][i], coutput[1][i], coutput[2][i]);
  3935                 else {
  3936                    for (i=0; i < z->s->img_x; ++i, out += 2) {
  3937                       out[0] = stbi__compute_y(coutput[0][i], coutput[1][i], coutput[2][i]);
  3938                       out[1] = 255;
  3939                    }
  3940                 }
  3941              } else if (z->s->img_n == 4 && z->app14_color_transform == 0) {
  3942                 for (i=0; i < z->s->img_x; ++i) {
  3943                    stbi_uc m = coutput[3][i];
  3944                    stbi_uc r = stbi__blinn_8x8(coutput[0][i], m);
  3945                    stbi_uc g = stbi__blinn_8x8(coutput[1][i], m);
  3946                    stbi_uc b = stbi__blinn_8x8(coutput[2][i], m);
  3947                    out[0] = stbi__compute_y(r, g, b);
  3948                    out[1] = 255;
  3949                    out += n;
  3950                 }
  3951              } else if (z->s->img_n == 4 && z->app14_color_transform == 2) {
  3952                 for (i=0; i < z->s->img_x; ++i) {
  3953                    out[0] = stbi__blinn_8x8(255 - coutput[0][i], coutput[3][i]);
  3954                    out[1] = 255;
  3955                    out += n;
  3956                 }
  3957              } else {
  3958                 stbi_uc *y = coutput[0];
  3959                 if (n == 1)
  3960                    for (i=0; i < z->s->img_x; ++i) out[i] = y[i];
  3961                 else
  3962                    for (i=0; i < z->s->img_x; ++i) { *out++ = y[i]; *out++ = 255; }
  3963              }
  3964           }
  3965        }
  3966        stbi__cleanup_jpeg(z);
  3967        *out_x = z->s->img_x;
  3968        *out_y = z->s->img_y;
  3969        if (comp) *comp = z->s->img_n >= 3 ? 3 : 1; // report original components, not output
  3970        return output;
  3971     }
  3972  }
  3973  
  3974  static void *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
  3975  {
  3976     unsigned char* result;
  3977     stbi__jpeg* j = (stbi__jpeg*) stbi__malloc(sizeof(stbi__jpeg));
  3978     if (!j) return stbi__errpuc("outofmem", "Out of memory");
  3979     STBI_NOTUSED(ri);
  3980     j->s = s;
  3981     stbi__setup_jpeg(j);
  3982     result = load_jpeg_image(j, x,y,comp,req_comp);
  3983     STBI_FREE(j);
  3984     return result;
  3985  }
  3986  
  3987  static int stbi__jpeg_test(stbi__context *s)
  3988  {
  3989     int r;
  3990     stbi__jpeg* j = (stbi__jpeg*)stbi__malloc(sizeof(stbi__jpeg));
  3991     if (!j) return stbi__err("outofmem", "Out of memory");
  3992     j->s = s;
  3993     stbi__setup_jpeg(j);
  3994     r = stbi__decode_jpeg_header(j, STBI__SCAN_type);
  3995     stbi__rewind(s);
  3996     STBI_FREE(j);
  3997     return r;
  3998  }
  3999  
  4000  static int stbi__jpeg_info_raw(stbi__jpeg *j, int *x, int *y, int *comp)
  4001  {
  4002     if (!stbi__decode_jpeg_header(j, STBI__SCAN_header)) {
  4003        stbi__rewind( j->s );
  4004        return 0;
  4005     }
  4006     if (x) *x = j->s->img_x;
  4007     if (y) *y = j->s->img_y;
  4008     if (comp) *comp = j->s->img_n >= 3 ? 3 : 1;
  4009     return 1;
  4010  }
  4011  
  4012  static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp)
  4013  {
  4014     int result;
  4015     stbi__jpeg* j = (stbi__jpeg*) (stbi__malloc(sizeof(stbi__jpeg)));
  4016     if (!j) return stbi__err("outofmem", "Out of memory");
  4017     j->s = s;
  4018     result = stbi__jpeg_info_raw(j, x, y, comp);
  4019     STBI_FREE(j);
  4020     return result;
  4021  }
  4022  #endif
  4023  
  4024  // public domain zlib decode    v0.2  Sean Barrett 2006-11-18
  4025  //    simple implementation
  4026  //      - all input must be provided in an upfront buffer
  4027  //      - all output is written to a single output buffer (can malloc/realloc)
  4028  //    performance
  4029  //      - fast huffman
  4030  
  4031  #ifndef STBI_NO_ZLIB
  4032  
  4033  // fast-way is faster to check than jpeg huffman, but slow way is slower
  4034  #define STBI__ZFAST_BITS  9 // accelerate all cases in default tables
  4035  #define STBI__ZFAST_MASK  ((1 << STBI__ZFAST_BITS) - 1)
  4036  #define STBI__ZNSYMS 288 // number of symbols in literal/length alphabet
  4037  
  4038  // zlib-style huffman encoding
  4039  // (jpegs packs from left, zlib from right, so can't share code)
  4040  typedef struct
  4041  {
  4042     stbi__uint16 fast[1 << STBI__ZFAST_BITS];
  4043     stbi__uint16 firstcode[16];
  4044     int maxcode[17];
  4045     stbi__uint16 firstsymbol[16];
  4046     stbi_uc  size[STBI__ZNSYMS];
  4047     stbi__uint16 value[STBI__ZNSYMS];
  4048  } stbi__zhuffman;
  4049  
  4050  stbi_inline static int stbi__bitreverse16(int n)
  4051  {
  4052    n = ((n & 0xAAAA) >>  1) | ((n & 0x5555) << 1);
  4053    n = ((n & 0xCCCC) >>  2) | ((n & 0x3333) << 2);
  4054    n = ((n & 0xF0F0) >>  4) | ((n & 0x0F0F) << 4);
  4055    n = ((n & 0xFF00) >>  8) | ((n & 0x00FF) << 8);
  4056    return n;
  4057  }
  4058  
  4059  stbi_inline static int stbi__bit_reverse(int v, int bits)
  4060  {
  4061     STBI_ASSERT(bits <= 16);
  4062     // to bit reverse n bits, reverse 16 and shift
  4063     // e.g. 11 bits, bit reverse and shift away 5
  4064     return stbi__bitreverse16(v) >> (16-bits);
  4065  }
  4066  
  4067  static int stbi__zbuild_huffman(stbi__zhuffman *z, const stbi_uc *sizelist, int num)
  4068  {
  4069     int i,k=0;
  4070     int code, next_code[16], sizes[17];
  4071  
  4072     // DEFLATE spec for generating codes
  4073     memset(sizes, 0, sizeof(sizes));
  4074     memset(z->fast, 0, sizeof(z->fast));
  4075     for (i=0; i < num; ++i)
  4076        ++sizes[sizelist[i]];
  4077     sizes[0] = 0;
  4078     for (i=1; i < 16; ++i)
  4079        if (sizes[i] > (1 << i))
  4080           return stbi__err("bad sizes", "Corrupt PNG");
  4081     code = 0;
  4082     for (i=1; i < 16; ++i) {
  4083        next_code[i] = code;
  4084        z->firstcode[i] = (stbi__uint16) code;
  4085        z->firstsymbol[i] = (stbi__uint16) k;
  4086        code = (code + sizes[i]);
  4087        if (sizes[i])
  4088           if (code-1 >= (1 << i)) return stbi__err("bad codelengths","Corrupt PNG");
  4089        z->maxcode[i] = code << (16-i); // preshift for inner loop
  4090        code <<= 1;
  4091        k += sizes[i];
  4092     }
  4093     z->maxcode[16] = 0x10000; // sentinel
  4094     for (i=0; i < num; ++i) {
  4095        int s = sizelist[i];
  4096        if (s) {
  4097           int c = next_code[s] - z->firstcode[s] + z->firstsymbol[s];
  4098           stbi__uint16 fastv = (stbi__uint16) ((s << 9) | i);
  4099           z->size [c] = (stbi_uc     ) s;
  4100           z->value[c] = (stbi__uint16) i;
  4101           if (s <= STBI__ZFAST_BITS) {
  4102              int j = stbi__bit_reverse(next_code[s],s);
  4103              while (j < (1 << STBI__ZFAST_BITS)) {
  4104                 z->fast[j] = fastv;
  4105                 j += (1 << s);
  4106              }
  4107           }
  4108           ++next_code[s];
  4109        }
  4110     }
  4111     return 1;
  4112  }
  4113  
  4114  // zlib-from-memory implementation for PNG reading
  4115  //    because PNG allows splitting the zlib stream arbitrarily,
  4116  //    and it's annoying structurally to have PNG call ZLIB call PNG,
  4117  //    we require PNG read all the IDATs and combine them into a single
  4118  //    memory buffer
  4119  
  4120  typedef struct
  4121  {
  4122     stbi_uc *zbuffer, *zbuffer_end;
  4123     int num_bits;
  4124     stbi__uint32 code_buffer;
  4125  
  4126     char *zout;
  4127     char *zout_start;
  4128     char *zout_end;
  4129     int   z_expandable;
  4130  
  4131     stbi__zhuffman z_length, z_distance;
  4132  } stbi__zbuf;
  4133  
  4134  stbi_inline static int stbi__zeof(stbi__zbuf *z)
  4135  {
  4136     return (z->zbuffer >= z->zbuffer_end);
  4137  }
  4138  
  4139  stbi_inline static stbi_uc stbi__zget8(stbi__zbuf *z)
  4140  {
  4141     return stbi__zeof(z) ? 0 : *z->zbuffer++;
  4142  }
  4143  
  4144  static void stbi__fill_bits(stbi__zbuf *z)
  4145  {
  4146     do {
  4147        if (z->code_buffer >= (1U << z->num_bits)) {
  4148          z->zbuffer = z->zbuffer_end;  /* treat this as EOF so we fail. */
  4149          return;
  4150        }
  4151        z->code_buffer |= (unsigned int) stbi__zget8(z) << z->num_bits;
  4152        z->num_bits += 8;
  4153     } while (z->num_bits <= 24);
  4154  }
  4155  
  4156  stbi_inline static unsigned int stbi__zreceive(stbi__zbuf *z, int n)
  4157  {
  4158     unsigned int k;
  4159     if (z->num_bits < n) stbi__fill_bits(z);
  4160     k = z->code_buffer & ((1 << n) - 1);
  4161     z->code_buffer >>= n;
  4162     z->num_bits -= n;
  4163     return k;
  4164  }
  4165  
  4166  static int stbi__zhuffman_decode_slowpath(stbi__zbuf *a, stbi__zhuffman *z)
  4167  {
  4168     int b,s,k;
  4169     // not resolved by fast table, so compute it the slow way
  4170     // use jpeg approach, which requires MSbits at top
  4171     k = stbi__bit_reverse(a->code_buffer, 16);
  4172     for (s=STBI__ZFAST_BITS+1; ; ++s)
  4173        if (k < z->maxcode[s])
  4174           break;
  4175     if (s >= 16) return -1; // invalid code!
  4176     // code size is s, so:
  4177     b = (k >> (16-s)) - z->firstcode[s] + z->firstsymbol[s];
  4178     if (b >= STBI__ZNSYMS) return -1; // some data was corrupt somewhere!
  4179     if (z->size[b] != s) return -1;  // was originally an assert, but report failure instead.
  4180     a->code_buffer >>= s;
  4181     a->num_bits -= s;
  4182     return z->value[b];
  4183  }
  4184  
  4185  stbi_inline static int stbi__zhuffman_decode(stbi__zbuf *a, stbi__zhuffman *z)
  4186  {
  4187     int b,s;
  4188     if (a->num_bits < 16) {
  4189        if (stbi__zeof(a)) {
  4190           return -1;   /* report error for unexpected end of data. */
  4191        }
  4192        stbi__fill_bits(a);
  4193     }
  4194     b = z->fast[a->code_buffer & STBI__ZFAST_MASK];
  4195     if (b) {
  4196        s = b >> 9;
  4197        a->code_buffer >>= s;
  4198        a->num_bits -= s;
  4199        return b & 511;
  4200     }
  4201     return stbi__zhuffman_decode_slowpath(a, z);
  4202  }
  4203  
  4204  static int stbi__zexpand(stbi__zbuf *z, char *zout, int n)  // need to make room for n bytes
  4205  {
  4206     char *q;
  4207     unsigned int cur, limit, old_limit;
  4208     z->zout = zout;
  4209     if (!z->z_expandable) return stbi__err("output buffer limit","Corrupt PNG");
  4210     cur   = (unsigned int) (z->zout - z->zout_start);
  4211     limit = old_limit = (unsigned) (z->zout_end - z->zout_start);
  4212     if (UINT_MAX - cur < (unsigned) n) return stbi__err("outofmem", "Out of memory");
  4213     while (cur + n > limit) {
  4214        if(limit > UINT_MAX / 2) return stbi__err("outofmem", "Out of memory");
  4215        limit *= 2;
  4216     }
  4217     q = (char *) STBI_REALLOC_SIZED(z->zout_start, old_limit, limit);
  4218     STBI_NOTUSED(old_limit);
  4219     if (q == NULL) return stbi__err("outofmem", "Out of memory");
  4220     z->zout_start = q;
  4221     z->zout       = q + cur;
  4222     z->zout_end   = q + limit;
  4223     return 1;
  4224  }
  4225  
  4226  static const int stbi__zlength_base[31] = {
  4227     3,4,5,6,7,8,9,10,11,13,
  4228     15,17,19,23,27,31,35,43,51,59,
  4229     67,83,99,115,131,163,195,227,258,0,0 };
  4230  
  4231  static const int stbi__zlength_extra[31]=
  4232  { 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 };
  4233  
  4234  static const int stbi__zdist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193,
  4235  257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577,0,0};
  4236  
  4237  static const int stbi__zdist_extra[32] =
  4238  { 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13};
  4239  
  4240  static int stbi__parse_huffman_block(stbi__zbuf *a)
  4241  {
  4242     char *zout = a->zout;
  4243     for(;;) {
  4244        int z = stbi__zhuffman_decode(a, &a->z_length);
  4245        if (z < 256) {
  4246           if (z < 0) return stbi__err("bad huffman code","Corrupt PNG"); // error in huffman codes
  4247           if (zout >= a->zout_end) {
  4248              if (!stbi__zexpand(a, zout, 1)) return 0;
  4249              zout = a->zout;
  4250           }
  4251           *zout++ = (char) z;
  4252        } else {
  4253           stbi_uc *p;
  4254           int len,dist;
  4255           if (z == 256) {
  4256              a->zout = zout;
  4257              return 1;
  4258           }
  4259           z -= 257;
  4260           len = stbi__zlength_base[z];
  4261           if (stbi__zlength_extra[z]) len += stbi__zreceive(a, stbi__zlength_extra[z]);
  4262           z = stbi__zhuffman_decode(a, &a->z_distance);
  4263           if (z < 0) return stbi__err("bad huffman code","Corrupt PNG");
  4264           dist = stbi__zdist_base[z];
  4265           if (stbi__zdist_extra[z]) dist += stbi__zreceive(a, stbi__zdist_extra[z]);
  4266           if (zout - a->zout_start < dist) return stbi__err("bad dist","Corrupt PNG");
  4267           if (zout + len > a->zout_end) {
  4268              if (!stbi__zexpand(a, zout, len)) return 0;
  4269              zout = a->zout;
  4270           }
  4271           p = (stbi_uc *) (zout - dist);
  4272           if (dist == 1) { // run of one byte; common in images.
  4273              stbi_uc v = *p;
  4274              if (len) { do *zout++ = v; while (--len); }
  4275           } else {
  4276              if (len) { do *zout++ = *p++; while (--len); }
  4277           }
  4278        }
  4279     }
  4280  }
  4281  
  4282  static int stbi__compute_huffman_codes(stbi__zbuf *a)
  4283  {
  4284     static const stbi_uc length_dezigzag[19] = { 16,17,18,0,8,7,9,6,10,5,11,4,12,3,13,2,14,1,15 };
  4285     stbi__zhuffman z_codelength;
  4286     stbi_uc lencodes[286+32+137];//padding for maximum single op
  4287     stbi_uc codelength_sizes[19];
  4288     int i,n;
  4289  
  4290     int hlit  = stbi__zreceive(a,5) + 257;
  4291     int hdist = stbi__zreceive(a,5) + 1;
  4292     int hclen = stbi__zreceive(a,4) + 4;
  4293     int ntot  = hlit + hdist;
  4294  
  4295     memset(codelength_sizes, 0, sizeof(codelength_sizes));
  4296     for (i=0; i < hclen; ++i) {
  4297        int s = stbi__zreceive(a,3);
  4298        codelength_sizes[length_dezigzag[i]] = (stbi_uc) s;
  4299     }
  4300     if (!stbi__zbuild_huffman(&z_codelength, codelength_sizes, 19)) return 0;
  4301  
  4302     n = 0;
  4303     while (n < ntot) {
  4304        int c = stbi__zhuffman_decode(a, &z_codelength);
  4305        if (c < 0 || c >= 19) return stbi__err("bad codelengths", "Corrupt PNG");
  4306        if (c < 16)
  4307           lencodes[n++] = (stbi_uc) c;
  4308        else {
  4309           stbi_uc fill = 0;
  4310           if (c == 16) {
  4311              c = stbi__zreceive(a,2)+3;
  4312              if (n == 0) return stbi__err("bad codelengths", "Corrupt PNG");
  4313              fill = lencodes[n-1];
  4314           } else if (c == 17) {
  4315              c = stbi__zreceive(a,3)+3;
  4316           } else if (c == 18) {
  4317              c = stbi__zreceive(a,7)+11;
  4318           } else {
  4319              return stbi__err("bad codelengths", "Corrupt PNG");
  4320           }
  4321           if (ntot - n < c) return stbi__err("bad codelengths", "Corrupt PNG");
  4322           memset(lencodes+n, fill, c);
  4323           n += c;
  4324        }
  4325     }
  4326     if (n != ntot) return stbi__err("bad codelengths","Corrupt PNG");
  4327     if (!stbi__zbuild_huffman(&a->z_length, lencodes, hlit)) return 0;
  4328     if (!stbi__zbuild_huffman(&a->z_distance, lencodes+hlit, hdist)) return 0;
  4329     return 1;
  4330  }
  4331  
  4332  static int stbi__parse_uncompressed_block(stbi__zbuf *a)
  4333  {
  4334     stbi_uc header[4];
  4335     int len,nlen,k;
  4336     if (a->num_bits & 7)
  4337        stbi__zreceive(a, a->num_bits & 7); // discard
  4338     // drain the bit-packed data into header
  4339     k = 0;
  4340     while (a->num_bits > 0) {
  4341        header[k++] = (stbi_uc) (a->code_buffer & 255); // suppress MSVC run-time check
  4342        a->code_buffer >>= 8;
  4343        a->num_bits -= 8;
  4344     }
  4345     if (a->num_bits < 0) return stbi__err("zlib corrupt","Corrupt PNG");
  4346     // now fill header the normal way
  4347     while (k < 4)
  4348        header[k++] = stbi__zget8(a);
  4349     len  = header[1] * 256 + header[0];
  4350     nlen = header[3] * 256 + header[2];
  4351     if (nlen != (len ^ 0xffff)) return stbi__err("zlib corrupt","Corrupt PNG");
  4352     if (a->zbuffer + len > a->zbuffer_end) return stbi__err("read past buffer","Corrupt PNG");
  4353     if (a->zout + len > a->zout_end)
  4354        if (!stbi__zexpand(a, a->zout, len)) return 0;
  4355     memcpy(a->zout, a->zbuffer, len);
  4356     a->zbuffer += len;
  4357     a->zout += len;
  4358     return 1;
  4359  }
  4360  
  4361  static int stbi__parse_zlib_header(stbi__zbuf *a)
  4362  {
  4363     int cmf   = stbi__zget8(a);
  4364     int cm    = cmf & 15;
  4365     /* int cinfo = cmf >> 4; */
  4366     int flg   = stbi__zget8(a);
  4367     if (stbi__zeof(a)) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
  4368     if ((cmf*256+flg) % 31 != 0) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
  4369     if (flg & 32) return stbi__err("no preset dict","Corrupt PNG"); // preset dictionary not allowed in png
  4370     if (cm != 8) return stbi__err("bad compression","Corrupt PNG"); // DEFLATE required for png
  4371     // window = 1 << (8 + cinfo)... but who cares, we fully buffer output
  4372     return 1;
  4373  }
  4374  
  4375  static const stbi_uc stbi__zdefault_length[STBI__ZNSYMS] =
  4376  {
  4377     8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
  4378     8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
  4379     8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
  4380     8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
  4381     8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
  4382     9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
  4383     9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
  4384     9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
  4385     7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7, 7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,8
  4386  };
  4387  static const stbi_uc stbi__zdefault_distance[32] =
  4388  {
  4389     5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5
  4390  };
  4391  /*
  4392  Init algorithm:
  4393  {
  4394     int i;   // use <= to match clearly with spec
  4395     for (i=0; i <= 143; ++i)     stbi__zdefault_length[i]   = 8;
  4396     for (   ; i <= 255; ++i)     stbi__zdefault_length[i]   = 9;
  4397     for (   ; i <= 279; ++i)     stbi__zdefault_length[i]   = 7;
  4398     for (   ; i <= 287; ++i)     stbi__zdefault_length[i]   = 8;
  4399  
  4400     for (i=0; i <=  31; ++i)     stbi__zdefault_distance[i] = 5;
  4401  }
  4402  */
  4403  
  4404  static int stbi__parse_zlib(stbi__zbuf *a, int parse_header)
  4405  {
  4406     int final, type;
  4407     if (parse_header)
  4408        if (!stbi__parse_zlib_header(a)) return 0;
  4409     a->num_bits = 0;
  4410     a->code_buffer = 0;
  4411     do {
  4412        final = stbi__zreceive(a,1);
  4413        type = stbi__zreceive(a,2);
  4414        if (type == 0) {
  4415           if (!stbi__parse_uncompressed_block(a)) return 0;
  4416        } else if (type == 3) {
  4417           return 0;
  4418        } else {
  4419           if (type == 1) {
  4420              // use fixed code lengths
  4421              if (!stbi__zbuild_huffman(&a->z_length  , stbi__zdefault_length  , STBI__ZNSYMS)) return 0;
  4422              if (!stbi__zbuild_huffman(&a->z_distance, stbi__zdefault_distance,  32)) return 0;
  4423           } else {
  4424              if (!stbi__compute_huffman_codes(a)) return 0;
  4425           }
  4426           if (!stbi__parse_huffman_block(a)) return 0;
  4427        }
  4428     } while (!final);
  4429     return 1;
  4430  }
  4431  
  4432  static int stbi__do_zlib(stbi__zbuf *a, char *obuf, int olen, int exp, int parse_header)
  4433  {
  4434     a->zout_start = obuf;
  4435     a->zout       = obuf;
  4436     a->zout_end   = obuf + olen;
  4437     a->z_expandable = exp;
  4438  
  4439     return stbi__parse_zlib(a, parse_header);
  4440  }
  4441  
  4442  STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen)
  4443  {
  4444     stbi__zbuf a;
  4445     char *p = (char *) stbi__malloc(initial_size);
  4446     if (p == NULL) return NULL;
  4447     a.zbuffer = (stbi_uc *) buffer;
  4448     a.zbuffer_end = (stbi_uc *) buffer + len;
  4449     if (stbi__do_zlib(&a, p, initial_size, 1, 1)) {
  4450        if (outlen) *outlen = (int) (a.zout - a.zout_start);
  4451        return a.zout_start;
  4452     } else {
  4453        STBI_FREE(a.zout_start);
  4454        return NULL;
  4455     }
  4456  }
  4457  
  4458  STBIDEF char *stbi_zlib_decode_malloc(char const *buffer, int len, int *outlen)
  4459  {
  4460     return stbi_zlib_decode_malloc_guesssize(buffer, len, 16384, outlen);
  4461  }
  4462  
  4463  STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header)
  4464  {
  4465     stbi__zbuf a;
  4466     char *p = (char *) stbi__malloc(initial_size);
  4467     if (p == NULL) return NULL;
  4468     a.zbuffer = (stbi_uc *) buffer;
  4469     a.zbuffer_end = (stbi_uc *) buffer + len;
  4470     if (stbi__do_zlib(&a, p, initial_size, 1, parse_header)) {
  4471        if (outlen) *outlen = (int) (a.zout - a.zout_start);
  4472        return a.zout_start;
  4473     } else {
  4474        STBI_FREE(a.zout_start);
  4475        return NULL;
  4476     }
  4477  }
  4478  
  4479  STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, char const *ibuffer, int ilen)
  4480  {
  4481     stbi__zbuf a;
  4482     a.zbuffer = (stbi_uc *) ibuffer;
  4483     a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
  4484     if (stbi__do_zlib(&a, obuffer, olen, 0, 1))
  4485        return (int) (a.zout - a.zout_start);
  4486     else
  4487        return -1;
  4488  }
  4489  
  4490  STBIDEF char *stbi_zlib_decode_noheader_malloc(char const *buffer, int len, int *outlen)
  4491  {
  4492     stbi__zbuf a;
  4493     char *p = (char *) stbi__malloc(16384);
  4494     if (p == NULL) return NULL;
  4495     a.zbuffer = (stbi_uc *) buffer;
  4496     a.zbuffer_end = (stbi_uc *) buffer+len;
  4497     if (stbi__do_zlib(&a, p, 16384, 1, 0)) {
  4498        if (outlen) *outlen = (int) (a.zout - a.zout_start);
  4499        return a.zout_start;
  4500     } else {
  4501        STBI_FREE(a.zout_start);
  4502        return NULL;
  4503     }
  4504  }
  4505  
  4506  STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen)
  4507  {
  4508     stbi__zbuf a;
  4509     a.zbuffer = (stbi_uc *) ibuffer;
  4510     a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
  4511     if (stbi__do_zlib(&a, obuffer, olen, 0, 0))
  4512        return (int) (a.zout - a.zout_start);
  4513     else
  4514        return -1;
  4515  }
  4516  #endif
  4517  
  4518  // public domain "baseline" PNG decoder   v0.10  Sean Barrett 2006-11-18
  4519  //    simple implementation
  4520  //      - only 8-bit samples
  4521  //      - no CRC checking
  4522  //      - allocates lots of intermediate memory
  4523  //        - avoids problem of streaming data between subsystems
  4524  //        - avoids explicit window management
  4525  //    performance
  4526  //      - uses stb_zlib, a PD zlib implementation with fast huffman decoding
  4527  
  4528  #ifndef STBI_NO_PNG
  4529  typedef struct
  4530  {
  4531     stbi__uint32 length;
  4532     stbi__uint32 type;
  4533  } stbi__pngchunk;
  4534  
  4535  static stbi__pngchunk stbi__get_chunk_header(stbi__context *s)
  4536  {
  4537     stbi__pngchunk c;
  4538     c.length = stbi__get32be(s);
  4539     c.type   = stbi__get32be(s);
  4540     return c;
  4541  }
  4542  
  4543  static int stbi__check_png_header(stbi__context *s)
  4544  {
  4545     static const stbi_uc png_sig[8] = { 137,80,78,71,13,10,26,10 };
  4546     int i;
  4547     for (i=0; i < 8; ++i)
  4548        if (stbi__get8(s) != png_sig[i]) return stbi__err("bad png sig","Not a PNG");
  4549     return 1;
  4550  }
  4551  
  4552  typedef struct
  4553  {
  4554     stbi__context *s;
  4555     stbi_uc *idata, *expanded, *out;
  4556     int depth;
  4557  } stbi__png;
  4558  
  4559  
  4560  enum {
  4561     STBI__F_none=0,
  4562     STBI__F_sub=1,
  4563     STBI__F_up=2,
  4564     STBI__F_avg=3,
  4565     STBI__F_paeth=4,
  4566     // synthetic filters used for first scanline to avoid needing a dummy row of 0s
  4567     STBI__F_avg_first,
  4568     STBI__F_paeth_first
  4569  };
  4570  
  4571  static stbi_uc first_row_filter[5] =
  4572  {
  4573     STBI__F_none,
  4574     STBI__F_sub,
  4575     STBI__F_none,
  4576     STBI__F_avg_first,
  4577     STBI__F_paeth_first
  4578  };
  4579  
  4580  static int stbi__paeth(int a, int b, int c)
  4581  {
  4582     int p = a + b - c;
  4583     int pa = abs(p-a);
  4584     int pb = abs(p-b);
  4585     int pc = abs(p-c);
  4586     if (pa <= pb && pa <= pc) return a;
  4587     if (pb <= pc) return b;
  4588     return c;
  4589  }
  4590  
  4591  static const stbi_uc stbi__depth_scale_table[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 };
  4592  
  4593  // create the png data from post-deflated data
  4594  static int stbi__create_png_image_raw(stbi__png *a, stbi_uc *raw, stbi__uint32 raw_len, int out_n, stbi__uint32 x, stbi__uint32 y, int depth, int color)
  4595  {
  4596     int bytes = (depth == 16? 2 : 1);
  4597     stbi__context *s = a->s;
  4598     stbi__uint32 i,j,stride = x*out_n*bytes;
  4599     stbi__uint32 img_len, img_width_bytes;
  4600     int k;
  4601     int img_n = s->img_n; // copy it into a local for later
  4602  
  4603     int output_bytes = out_n*bytes;
  4604     int filter_bytes = img_n*bytes;
  4605     int width = x;
  4606  
  4607     STBI_ASSERT(out_n == s->img_n || out_n == s->img_n+1);
  4608     a->out = (stbi_uc *) stbi__malloc_mad3(x, y, output_bytes, 0); // extra bytes to write off the end into
  4609     if (!a->out) return stbi__err("outofmem", "Out of memory");
  4610  
  4611     if (!stbi__mad3sizes_valid(img_n, x, depth, 7)) return stbi__err("too large", "Corrupt PNG");
  4612     img_width_bytes = (((img_n * x * depth) + 7) >> 3);
  4613     img_len = (img_width_bytes + 1) * y;
  4614  
  4615     // we used to check for exact match between raw_len and img_len on non-interlaced PNGs,
  4616     // but issue #276 reported a PNG in the wild that had extra data at the end (all zeros),
  4617     // so just check for raw_len < img_len always.
  4618     if (raw_len < img_len) return stbi__err("not enough pixels","Corrupt PNG");
  4619  
  4620     for (j=0; j < y; ++j) {
  4621        stbi_uc *cur = a->out + stride*j;
  4622        stbi_uc *prior;
  4623        int filter = *raw++;
  4624  
  4625        if (filter > 4)
  4626           return stbi__err("invalid filter","Corrupt PNG");
  4627  
  4628        if (depth < 8) {
  4629           if (img_width_bytes > x) return stbi__err("invalid width","Corrupt PNG");
  4630           cur += x*out_n - img_width_bytes; // store output to the rightmost img_len bytes, so we can decode in place
  4631           filter_bytes = 1;
  4632           width = img_width_bytes;
  4633        }
  4634        prior = cur - stride; // bugfix: need to compute this after 'cur +=' computation above
  4635  
  4636        // if first row, use special filter that doesn't sample previous row
  4637        if (j == 0) filter = first_row_filter[filter];
  4638  
  4639        // handle first byte explicitly
  4640        for (k=0; k < filter_bytes; ++k) {
  4641           switch (filter) {
  4642              case STBI__F_none       : cur[k] = raw[k]; break;
  4643              case STBI__F_sub        : cur[k] = raw[k]; break;
  4644              case STBI__F_up         : cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
  4645              case STBI__F_avg        : cur[k] = STBI__BYTECAST(raw[k] + (prior[k]>>1)); break;
  4646              case STBI__F_paeth      : cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(0,prior[k],0)); break;
  4647              case STBI__F_avg_first  : cur[k] = raw[k]; break;
  4648              case STBI__F_paeth_first: cur[k] = raw[k]; break;
  4649           }
  4650        }
  4651  
  4652        if (depth == 8) {
  4653           if (img_n != out_n)
  4654              cur[img_n] = 255; // first pixel
  4655           raw += img_n;
  4656           cur += out_n;
  4657           prior += out_n;
  4658        } else if (depth == 16) {
  4659           if (img_n != out_n) {
  4660              cur[filter_bytes]   = 255; // first pixel top byte
  4661              cur[filter_bytes+1] = 255; // first pixel bottom byte
  4662           }
  4663           raw += filter_bytes;
  4664           cur += output_bytes;
  4665           prior += output_bytes;
  4666        } else {
  4667           raw += 1;
  4668           cur += 1;
  4669           prior += 1;
  4670        }
  4671  
  4672        // this is a little gross, so that we don't switch per-pixel or per-component
  4673        if (depth < 8 || img_n == out_n) {
  4674           int nk = (width - 1)*filter_bytes;
  4675           #define STBI__CASE(f) \
  4676               case f:     \
  4677                  for (k=0; k < nk; ++k)
  4678           switch (filter) {
  4679              // "none" filter turns into a memcpy here; make that explicit.
  4680              case STBI__F_none:         memcpy(cur, raw, nk); break;
  4681              STBI__CASE(STBI__F_sub)          { cur[k] = STBI__BYTECAST(raw[k] + cur[k-filter_bytes]); } break;
  4682              STBI__CASE(STBI__F_up)           { cur[k] = STBI__BYTECAST(raw[k] + prior[k]); } break;
  4683              STBI__CASE(STBI__F_avg)          { cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-filter_bytes])>>1)); } break;
  4684              STBI__CASE(STBI__F_paeth)        { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],prior[k],prior[k-filter_bytes])); } break;
  4685              STBI__CASE(STBI__F_avg_first)    { cur[k] = STBI__BYTECAST(raw[k] + (cur[k-filter_bytes] >> 1)); } break;
  4686              STBI__CASE(STBI__F_paeth_first)  { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],0,0)); } break;
  4687           }
  4688           #undef STBI__CASE
  4689           raw += nk;
  4690        } else {
  4691           STBI_ASSERT(img_n+1 == out_n);
  4692           #define STBI__CASE(f) \
  4693               case f:     \
  4694                  for (i=x-1; i >= 1; --i, cur[filter_bytes]=255,raw+=filter_bytes,cur+=output_bytes,prior+=output_bytes) \
  4695                     for (k=0; k < filter_bytes; ++k)
  4696           switch (filter) {
  4697              STBI__CASE(STBI__F_none)         { cur[k] = raw[k]; } break;
  4698              STBI__CASE(STBI__F_sub)          { cur[k] = STBI__BYTECAST(raw[k] + cur[k- output_bytes]); } break;
  4699              STBI__CASE(STBI__F_up)           { cur[k] = STBI__BYTECAST(raw[k] + prior[k]); } break;
  4700              STBI__CASE(STBI__F_avg)          { cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k- output_bytes])>>1)); } break;
  4701              STBI__CASE(STBI__F_paeth)        { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k- output_bytes],prior[k],prior[k- output_bytes])); } break;
  4702              STBI__CASE(STBI__F_avg_first)    { cur[k] = STBI__BYTECAST(raw[k] + (cur[k- output_bytes] >> 1)); } break;
  4703              STBI__CASE(STBI__F_paeth_first)  { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k- output_bytes],0,0)); } break;
  4704           }
  4705           #undef STBI__CASE
  4706  
  4707           // the loop above sets the high byte of the pixels' alpha, but for
  4708           // 16 bit png files we also need the low byte set. we'll do that here.
  4709           if (depth == 16) {
  4710              cur = a->out + stride*j; // start at the beginning of the row again
  4711              for (i=0; i < x; ++i,cur+=output_bytes) {
  4712                 cur[filter_bytes+1] = 255;
  4713              }
  4714           }
  4715        }
  4716     }
  4717  
  4718     // we make a separate pass to expand bits to pixels; for performance,
  4719     // this could run two scanlines behind the above code, so it won't
  4720     // intefere with filtering but will still be in the cache.
  4721     if (depth < 8) {
  4722        for (j=0; j < y; ++j) {
  4723           stbi_uc *cur = a->out + stride*j;
  4724           stbi_uc *in  = a->out + stride*j + x*out_n - img_width_bytes;
  4725           // unpack 1/2/4-bit into a 8-bit buffer. allows us to keep the common 8-bit path optimal at minimal cost for 1/2/4-bit
  4726           // png guarante byte alignment, if width is not multiple of 8/4/2 we'll decode dummy trailing data that will be skipped in the later loop
  4727           stbi_uc scale = (color == 0) ? stbi__depth_scale_table[depth] : 1; // scale grayscale values to 0..255 range
  4728  
  4729           // note that the final byte might overshoot and write more data than desired.
  4730           // we can allocate enough data that this never writes out of memory, but it
  4731           // could also overwrite the next scanline. can it overwrite non-empty data
  4732           // on the next scanline? yes, consider 1-pixel-wide scanlines with 1-bit-per-pixel.
  4733           // so we need to explicitly clamp the final ones
  4734  
  4735           if (depth == 4) {
  4736              for (k=x*img_n; k >= 2; k-=2, ++in) {
  4737                 *cur++ = scale * ((*in >> 4)       );
  4738                 *cur++ = scale * ((*in     ) & 0x0f);
  4739              }
  4740              if (k > 0) *cur++ = scale * ((*in >> 4)       );
  4741           } else if (depth == 2) {
  4742              for (k=x*img_n; k >= 4; k-=4, ++in) {
  4743                 *cur++ = scale * ((*in >> 6)       );
  4744                 *cur++ = scale * ((*in >> 4) & 0x03);
  4745                 *cur++ = scale * ((*in >> 2) & 0x03);
  4746                 *cur++ = scale * ((*in     ) & 0x03);
  4747              }
  4748              if (k > 0) *cur++ = scale * ((*in >> 6)       );
  4749              if (k > 1) *cur++ = scale * ((*in >> 4) & 0x03);
  4750              if (k > 2) *cur++ = scale * ((*in >> 2) & 0x03);
  4751           } else if (depth == 1) {
  4752              for (k=x*img_n; k >= 8; k-=8, ++in) {
  4753                 *cur++ = scale * ((*in >> 7)       );
  4754                 *cur++ = scale * ((*in >> 6) & 0x01);
  4755                 *cur++ = scale * ((*in >> 5) & 0x01);
  4756                 *cur++ = scale * ((*in >> 4) & 0x01);
  4757                 *cur++ = scale * ((*in >> 3) & 0x01);
  4758                 *cur++ = scale * ((*in >> 2) & 0x01);
  4759                 *cur++ = scale * ((*in >> 1) & 0x01);
  4760                 *cur++ = scale * ((*in     ) & 0x01);
  4761              }
  4762              if (k > 0) *cur++ = scale * ((*in >> 7)       );
  4763              if (k > 1) *cur++ = scale * ((*in >> 6) & 0x01);
  4764              if (k > 2) *cur++ = scale * ((*in >> 5) & 0x01);
  4765              if (k > 3) *cur++ = scale * ((*in >> 4) & 0x01);
  4766              if (k > 4) *cur++ = scale * ((*in >> 3) & 0x01);
  4767              if (k > 5) *cur++ = scale * ((*in >> 2) & 0x01);
  4768              if (k > 6) *cur++ = scale * ((*in >> 1) & 0x01);
  4769           }
  4770           if (img_n != out_n) {
  4771              int q;
  4772              // insert alpha = 255
  4773              cur = a->out + stride*j;
  4774              if (img_n == 1) {
  4775                 for (q=x-1; q >= 0; --q) {
  4776                    cur[q*2+1] = 255;
  4777                    cur[q*2+0] = cur[q];
  4778                 }
  4779              } else {
  4780                 STBI_ASSERT(img_n == 3);
  4781                 for (q=x-1; q >= 0; --q) {
  4782                    cur[q*4+3] = 255;
  4783                    cur[q*4+2] = cur[q*3+2];
  4784                    cur[q*4+1] = cur[q*3+1];
  4785                    cur[q*4+0] = cur[q*3+0];
  4786                 }
  4787              }
  4788           }
  4789        }
  4790     } else if (depth == 16) {
  4791        // force the image data from big-endian to platform-native.
  4792        // this is done in a separate pass due to the decoding relying
  4793        // on the data being untouched, but could probably be done
  4794        // per-line during decode if care is taken.
  4795        stbi_uc *cur = a->out;
  4796        stbi__uint16 *cur16 = (stbi__uint16*)cur;
  4797  
  4798        for(i=0; i < x*y*out_n; ++i,cur16++,cur+=2) {
  4799           *cur16 = (cur[0] << 8) | cur[1];
  4800        }
  4801     }
  4802  
  4803     return 1;
  4804  }
  4805  
  4806  static int stbi__create_png_image(stbi__png *a, stbi_uc *image_data, stbi__uint32 image_data_len, int out_n, int depth, int color, int interlaced)
  4807  {
  4808     int bytes = (depth == 16 ? 2 : 1);
  4809     int out_bytes = out_n * bytes;
  4810     stbi_uc *final;
  4811     int p;
  4812     if (!interlaced)
  4813        return stbi__create_png_image_raw(a, image_data, image_data_len, out_n, a->s->img_x, a->s->img_y, depth, color);
  4814  
  4815     // de-interlacing
  4816     final = (stbi_uc *) stbi__malloc_mad3(a->s->img_x, a->s->img_y, out_bytes, 0);
  4817     if (!final) return stbi__err("outofmem", "Out of memory");
  4818     for (p=0; p < 7; ++p) {
  4819        int xorig[] = { 0,4,0,2,0,1,0 };
  4820        int yorig[] = { 0,0,4,0,2,0,1 };
  4821        int xspc[]  = { 8,8,4,4,2,2,1 };
  4822        int yspc[]  = { 8,8,8,4,4,2,2 };
  4823        int i,j,x,y;
  4824        // pass1_x[4] = 0, pass1_x[5] = 1, pass1_x[12] = 1
  4825        x = (a->s->img_x - xorig[p] + xspc[p]-1) / xspc[p];
  4826        y = (a->s->img_y - yorig[p] + yspc[p]-1) / yspc[p];
  4827        if (x && y) {
  4828           stbi__uint32 img_len = ((((a->s->img_n * x * depth) + 7) >> 3) + 1) * y;
  4829           if (!stbi__create_png_image_raw(a, image_data, image_data_len, out_n, x, y, depth, color)) {
  4830              STBI_FREE(final);
  4831              return 0;
  4832           }
  4833           for (j=0; j < y; ++j) {
  4834              for (i=0; i < x; ++i) {
  4835                 int out_y = j*yspc[p]+yorig[p];
  4836                 int out_x = i*xspc[p]+xorig[p];
  4837                 memcpy(final + out_y*a->s->img_x*out_bytes + out_x*out_bytes,
  4838                        a->out + (j*x+i)*out_bytes, out_bytes);
  4839              }
  4840           }
  4841           STBI_FREE(a->out);
  4842           image_data += img_len;
  4843           image_data_len -= img_len;
  4844        }
  4845     }
  4846     a->out = final;
  4847  
  4848     return 1;
  4849  }
  4850  
  4851  static int stbi__compute_transparency(stbi__png *z, stbi_uc tc[3], int out_n)
  4852  {
  4853     stbi__context *s = z->s;
  4854     stbi__uint32 i, pixel_count = s->img_x * s->img_y;
  4855     stbi_uc *p = z->out;
  4856  
  4857     // compute color-based transparency, assuming we've
  4858     // already got 255 as the alpha value in the output
  4859     STBI_ASSERT(out_n == 2 || out_n == 4);
  4860  
  4861     if (out_n == 2) {
  4862        for (i=0; i < pixel_count; ++i) {
  4863           p[1] = (p[0] == tc[0] ? 0 : 255);
  4864           p += 2;
  4865        }
  4866     } else {
  4867        for (i=0; i < pixel_count; ++i) {
  4868           if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
  4869              p[3] = 0;
  4870           p += 4;
  4871        }
  4872     }
  4873     return 1;
  4874  }
  4875  
  4876  static int stbi__compute_transparency16(stbi__png *z, stbi__uint16 tc[3], int out_n)
  4877  {
  4878     stbi__context *s = z->s;
  4879     stbi__uint32 i, pixel_count = s->img_x * s->img_y;
  4880     stbi__uint16 *p = (stbi__uint16*) z->out;
  4881  
  4882     // compute color-based transparency, assuming we've
  4883     // already got 65535 as the alpha value in the output
  4884     STBI_ASSERT(out_n == 2 || out_n == 4);
  4885  
  4886     if (out_n == 2) {
  4887        for (i = 0; i < pixel_count; ++i) {
  4888           p[1] = (p[0] == tc[0] ? 0 : 65535);
  4889           p += 2;
  4890        }
  4891     } else {
  4892        for (i = 0; i < pixel_count; ++i) {
  4893           if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
  4894              p[3] = 0;
  4895           p += 4;
  4896        }
  4897     }
  4898     return 1;
  4899  }
  4900  
  4901  static int stbi__expand_png_palette(stbi__png *a, stbi_uc *palette, int len, int pal_img_n)
  4902  {
  4903     stbi__uint32 i, pixel_count = a->s->img_x * a->s->img_y;
  4904     stbi_uc *p, *temp_out, *orig = a->out;
  4905  
  4906     p = (stbi_uc *) stbi__malloc_mad2(pixel_count, pal_img_n, 0);
  4907     if (p == NULL) return stbi__err("outofmem", "Out of memory");
  4908  
  4909     // between here and free(out) below, exitting would leak
  4910     temp_out = p;
  4911  
  4912     if (pal_img_n == 3) {
  4913        for (i=0; i < pixel_count; ++i) {
  4914           int n = orig[i]*4;
  4915           p[0] = palette[n  ];
  4916           p[1] = palette[n+1];
  4917           p[2] = palette[n+2];
  4918           p += 3;
  4919        }
  4920     } else {
  4921        for (i=0; i < pixel_count; ++i) {
  4922           int n = orig[i]*4;
  4923           p[0] = palette[n  ];
  4924           p[1] = palette[n+1];
  4925           p[2] = palette[n+2];
  4926           p[3] = palette[n+3];
  4927           p += 4;
  4928        }
  4929     }
  4930     STBI_FREE(a->out);
  4931     a->out = temp_out;
  4932  
  4933     STBI_NOTUSED(len);
  4934  
  4935     return 1;
  4936  }
  4937  
  4938  static int stbi__unpremultiply_on_load_global = 0;
  4939  static int stbi__de_iphone_flag_global = 0;
  4940  
  4941  STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)
  4942  {
  4943     stbi__unpremultiply_on_load_global = flag_true_if_should_unpremultiply;
  4944  }
  4945  
  4946  STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)
  4947  {
  4948     stbi__de_iphone_flag_global = flag_true_if_should_convert;
  4949  }
  4950  
  4951  #ifndef STBI_THREAD_LOCAL
  4952  #define stbi__unpremultiply_on_load  stbi__unpremultiply_on_load_global
  4953  #define stbi__de_iphone_flag  stbi__de_iphone_flag_global
  4954  #else
  4955  static STBI_THREAD_LOCAL int stbi__unpremultiply_on_load_local, stbi__unpremultiply_on_load_set;
  4956  static STBI_THREAD_LOCAL int stbi__de_iphone_flag_local, stbi__de_iphone_flag_set;
  4957  
  4958  STBIDEF void stbi__unpremultiply_on_load_thread(int flag_true_if_should_unpremultiply)
  4959  {
  4960     stbi__unpremultiply_on_load_local = flag_true_if_should_unpremultiply;
  4961     stbi__unpremultiply_on_load_set = 1;
  4962  }
  4963  
  4964  STBIDEF void stbi_convert_iphone_png_to_rgb_thread(int flag_true_if_should_convert)
  4965  {
  4966     stbi__de_iphone_flag_local = flag_true_if_should_convert;
  4967     stbi__de_iphone_flag_set = 1;
  4968  }
  4969  
  4970  #define stbi__unpremultiply_on_load  (stbi__unpremultiply_on_load_set           \
  4971                                         ? stbi__unpremultiply_on_load_local      \
  4972                                         : stbi__unpremultiply_on_load_global)
  4973  #define stbi__de_iphone_flag  (stbi__de_iphone_flag_set                         \
  4974                                  ? stbi__de_iphone_flag_local                    \
  4975                                  : stbi__de_iphone_flag_global)
  4976  #endif // STBI_THREAD_LOCAL
  4977  
  4978  static void stbi__de_iphone(stbi__png *z)
  4979  {
  4980     stbi__context *s = z->s;
  4981     stbi__uint32 i, pixel_count = s->img_x * s->img_y;
  4982     stbi_uc *p = z->out;
  4983  
  4984     if (s->img_out_n == 3) {  // convert bgr to rgb
  4985        for (i=0; i < pixel_count; ++i) {
  4986           stbi_uc t = p[0];
  4987           p[0] = p[2];
  4988           p[2] = t;
  4989           p += 3;
  4990        }
  4991     } else {
  4992        STBI_ASSERT(s->img_out_n == 4);
  4993        if (stbi__unpremultiply_on_load) {
  4994           // convert bgr to rgb and unpremultiply
  4995           for (i=0; i < pixel_count; ++i) {
  4996              stbi_uc a = p[3];
  4997              stbi_uc t = p[0];
  4998              if (a) {
  4999                 stbi_uc half = a / 2;
  5000                 p[0] = (p[2] * 255 + half) / a;
  5001                 p[1] = (p[1] * 255 + half) / a;
  5002                 p[2] = ( t   * 255 + half) / a;
  5003              } else {
  5004                 p[0] = p[2];
  5005                 p[2] = t;
  5006              }
  5007              p += 4;
  5008           }
  5009        } else {
  5010           // convert bgr to rgb
  5011           for (i=0; i < pixel_count; ++i) {
  5012              stbi_uc t = p[0];
  5013              p[0] = p[2];
  5014              p[2] = t;
  5015              p += 4;
  5016           }
  5017        }
  5018     }
  5019  }
  5020  
  5021  #define STBI__PNG_TYPE(a,b,c,d)  (((unsigned) (a) << 24) + ((unsigned) (b) << 16) + ((unsigned) (c) << 8) + (unsigned) (d))
  5022  
  5023  static int stbi__parse_png_file(stbi__png *z, int scan, int req_comp)
  5024  {
  5025     stbi_uc palette[1024], pal_img_n=0;
  5026     stbi_uc has_trans=0, tc[3]={0};
  5027     stbi__uint16 tc16[3];
  5028     stbi__uint32 ioff=0, idata_limit=0, i, pal_len=0;
  5029     int first=1,k,interlace=0, color=0, is_iphone=0;
  5030     stbi__context *s = z->s;
  5031  
  5032     z->expanded = NULL;
  5033     z->idata = NULL;
  5034     z->out = NULL;
  5035  
  5036     if (!stbi__check_png_header(s)) return 0;
  5037  
  5038     if (scan == STBI__SCAN_type) return 1;
  5039  
  5040     for (;;) {
  5041        stbi__pngchunk c = stbi__get_chunk_header(s);
  5042        switch (c.type) {
  5043           case STBI__PNG_TYPE('C','g','B','I'):
  5044              is_iphone = 1;
  5045              stbi__skip(s, c.length);
  5046              break;
  5047           case STBI__PNG_TYPE('I','H','D','R'): {
  5048              int comp,filter;
  5049              if (!first) return stbi__err("multiple IHDR","Corrupt PNG");
  5050              first = 0;
  5051              if (c.length != 13) return stbi__err("bad IHDR len","Corrupt PNG");
  5052              s->img_x = stbi__get32be(s);
  5053              s->img_y = stbi__get32be(s);
  5054              if (s->img_y > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
  5055              if (s->img_x > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
  5056              z->depth = stbi__get8(s);  if (z->depth != 1 && z->depth != 2 && z->depth != 4 && z->depth != 8 && z->depth != 16)  return stbi__err("1/2/4/8/16-bit only","PNG not supported: 1/2/4/8/16-bit only");
  5057              color = stbi__get8(s);  if (color > 6)         return stbi__err("bad ctype","Corrupt PNG");
  5058              if (color == 3 && z->depth == 16)                  return stbi__err("bad ctype","Corrupt PNG");
  5059              if (color == 3) pal_img_n = 3; else if (color & 1) return stbi__err("bad ctype","Corrupt PNG");
  5060              comp  = stbi__get8(s);  if (comp) return stbi__err("bad comp method","Corrupt PNG");
  5061              filter= stbi__get8(s);  if (filter) return stbi__err("bad filter method","Corrupt PNG");
  5062              interlace = stbi__get8(s); if (interlace>1) return stbi__err("bad interlace method","Corrupt PNG");
  5063              if (!s->img_x || !s->img_y) return stbi__err("0-pixel image","Corrupt PNG");
  5064              if (!pal_img_n) {
  5065                 s->img_n = (color & 2 ? 3 : 1) + (color & 4 ? 1 : 0);
  5066                 if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
  5067                 if (scan == STBI__SCAN_header) return 1;
  5068              } else {
  5069                 // if paletted, then pal_n is our final components, and
  5070                 // img_n is # components to decompress/filter.
  5071                 s->img_n = 1;
  5072                 if ((1 << 30) / s->img_x / 4 < s->img_y) return stbi__err("too large","Corrupt PNG");
  5073                 // if SCAN_header, have to scan to see if we have a tRNS
  5074              }
  5075              break;
  5076           }
  5077  
  5078           case STBI__PNG_TYPE('P','L','T','E'):  {
  5079              if (first) return stbi__err("first not IHDR", "Corrupt PNG");
  5080              if (c.length > 256*3) return stbi__err("invalid PLTE","Corrupt PNG");
  5081              pal_len = c.length / 3;
  5082              if (pal_len * 3 != c.length) return stbi__err("invalid PLTE","Corrupt PNG");
  5083              for (i=0; i < pal_len; ++i) {
  5084                 palette[i*4+0] = stbi__get8(s);
  5085                 palette[i*4+1] = stbi__get8(s);
  5086                 palette[i*4+2] = stbi__get8(s);
  5087                 palette[i*4+3] = 255;
  5088              }
  5089              break;
  5090           }
  5091  
  5092           case STBI__PNG_TYPE('t','R','N','S'): {
  5093              if (first) return stbi__err("first not IHDR", "Corrupt PNG");
  5094              if (z->idata) return stbi__err("tRNS after IDAT","Corrupt PNG");
  5095              if (pal_img_n) {
  5096                 if (scan == STBI__SCAN_header) { s->img_n = 4; return 1; }
  5097                 if (pal_len == 0) return stbi__err("tRNS before PLTE","Corrupt PNG");
  5098                 if (c.length > pal_len) return stbi__err("bad tRNS len","Corrupt PNG");
  5099                 pal_img_n = 4;
  5100                 for (i=0; i < c.length; ++i)
  5101                    palette[i*4+3] = stbi__get8(s);
  5102              } else {
  5103                 if (!(s->img_n & 1)) return stbi__err("tRNS with alpha","Corrupt PNG");
  5104                 if (c.length != (stbi__uint32) s->img_n*2) return stbi__err("bad tRNS len","Corrupt PNG");
  5105                 has_trans = 1;
  5106                 if (z->depth == 16) {
  5107                    for (k = 0; k < s->img_n; ++k) tc16[k] = (stbi__uint16)stbi__get16be(s); // copy the values as-is
  5108                 } else {
  5109                    for (k = 0; k < s->img_n; ++k) tc[k] = (stbi_uc)(stbi__get16be(s) & 255) * stbi__depth_scale_table[z->depth]; // non 8-bit images will be larger
  5110                 }
  5111              }
  5112              break;
  5113           }
  5114  
  5115           case STBI__PNG_TYPE('I','D','A','T'): {
  5116              if (first) return stbi__err("first not IHDR", "Corrupt PNG");
  5117              if (pal_img_n && !pal_len) return stbi__err("no PLTE","Corrupt PNG");
  5118              if (scan == STBI__SCAN_header) { s->img_n = pal_img_n; return 1; }
  5119              if ((int)(ioff + c.length) < (int)ioff) return 0;
  5120              if (ioff + c.length > idata_limit) {
  5121                 stbi__uint32 idata_limit_old = idata_limit;
  5122                 stbi_uc *p;
  5123                 if (idata_limit == 0) idata_limit = c.length > 4096 ? c.length : 4096;
  5124                 while (ioff + c.length > idata_limit)
  5125                    idata_limit *= 2;
  5126                 STBI_NOTUSED(idata_limit_old);
  5127                 p = (stbi_uc *) STBI_REALLOC_SIZED(z->idata, idata_limit_old, idata_limit); if (p == NULL) return stbi__err("outofmem", "Out of memory");
  5128                 z->idata = p;
  5129              }
  5130              if (!stbi__getn(s, z->idata+ioff,c.length)) return stbi__err("outofdata","Corrupt PNG");
  5131              ioff += c.length;
  5132              break;
  5133           }
  5134  
  5135           case STBI__PNG_TYPE('I','E','N','D'): {
  5136              stbi__uint32 raw_len, bpl;
  5137              if (first) return stbi__err("first not IHDR", "Corrupt PNG");
  5138              if (scan != STBI__SCAN_load) return 1;
  5139              if (z->idata == NULL) return stbi__err("no IDAT","Corrupt PNG");
  5140              // initial guess for decoded data size to avoid unnecessary reallocs
  5141              bpl = (s->img_x * z->depth + 7) / 8; // bytes per line, per component
  5142              raw_len = bpl * s->img_y * s->img_n /* pixels */ + s->img_y /* filter mode per row */;
  5143              z->expanded = (stbi_uc *) stbi_zlib_decode_malloc_guesssize_headerflag((char *) z->idata, ioff, raw_len, (int *) &raw_len, !is_iphone);
  5144              if (z->expanded == NULL) return 0; // zlib should set error
  5145              STBI_FREE(z->idata); z->idata = NULL;
  5146              if ((req_comp == s->img_n+1 && req_comp != 3 && !pal_img_n) || has_trans)
  5147                 s->img_out_n = s->img_n+1;
  5148              else
  5149                 s->img_out_n = s->img_n;
  5150              if (!stbi__create_png_image(z, z->expanded, raw_len, s->img_out_n, z->depth, color, interlace)) return 0;
  5151              if (has_trans) {
  5152                 if (z->depth == 16) {
  5153                    if (!stbi__compute_transparency16(z, tc16, s->img_out_n)) return 0;
  5154                 } else {
  5155                    if (!stbi__compute_transparency(z, tc, s->img_out_n)) return 0;
  5156                 }
  5157              }
  5158              if (is_iphone && stbi__de_iphone_flag && s->img_out_n > 2)
  5159                 stbi__de_iphone(z);
  5160              if (pal_img_n) {
  5161                 // pal_img_n == 3 or 4
  5162                 s->img_n = pal_img_n; // record the actual colors we had
  5163                 s->img_out_n = pal_img_n;
  5164                 if (req_comp >= 3) s->img_out_n = req_comp;
  5165                 if (!stbi__expand_png_palette(z, palette, pal_len, s->img_out_n))
  5166                    return 0;
  5167              } else if (has_trans) {
  5168                 // non-paletted image with tRNS -> source image has (constant) alpha
  5169                 ++s->img_n;
  5170              }
  5171              STBI_FREE(z->expanded); z->expanded = NULL;
  5172              // end of PNG chunk, read and skip CRC
  5173              stbi__get32be(s);
  5174              return 1;
  5175           }
  5176  
  5177           default:
  5178              // if critical, fail
  5179              if (first) return stbi__err("first not IHDR", "Corrupt PNG");
  5180              if ((c.type & (1 << 29)) == 0) {
  5181                 #ifndef STBI_NO_FAILURE_STRINGS
  5182                 // not threadsafe
  5183                 static char invalid_chunk[] = "XXXX PNG chunk not known";
  5184                 invalid_chunk[0] = STBI__BYTECAST(c.type >> 24);
  5185                 invalid_chunk[1] = STBI__BYTECAST(c.type >> 16);
  5186                 invalid_chunk[2] = STBI__BYTECAST(c.type >>  8);
  5187                 invalid_chunk[3] = STBI__BYTECAST(c.type >>  0);
  5188                 #endif
  5189                 return stbi__err(invalid_chunk, "PNG not supported: unknown PNG chunk type");
  5190              }
  5191              stbi__skip(s, c.length);
  5192              break;
  5193        }
  5194        // end of PNG chunk, read and skip CRC
  5195        stbi__get32be(s);
  5196     }
  5197  }
  5198  
  5199  static void *stbi__do_png(stbi__png *p, int *x, int *y, int *n, int req_comp, stbi__result_info *ri)
  5200  {
  5201     void *result=NULL;
  5202     if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
  5203     if (stbi__parse_png_file(p, STBI__SCAN_load, req_comp)) {
  5204        if (p->depth <= 8)
  5205           ri->bits_per_channel = 8;
  5206        else if (p->depth == 16)
  5207           ri->bits_per_channel = 16;
  5208        else
  5209           return stbi__errpuc("bad bits_per_channel", "PNG not supported: unsupported color depth");
  5210        result = p->out;
  5211        p->out = NULL;
  5212        if (req_comp && req_comp != p->s->img_out_n) {
  5213           if (ri->bits_per_channel == 8)
  5214              result = stbi__convert_format((unsigned char *) result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
  5215           else
  5216              result = stbi__convert_format16((stbi__uint16 *) result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
  5217           p->s->img_out_n = req_comp;
  5218           if (result == NULL) return result;
  5219        }
  5220        *x = p->s->img_x;
  5221        *y = p->s->img_y;
  5222        if (n) *n = p->s->img_n;
  5223     }
  5224     STBI_FREE(p->out);      p->out      = NULL;
  5225     STBI_FREE(p->expanded); p->expanded = NULL;
  5226     STBI_FREE(p->idata);    p->idata    = NULL;
  5227  
  5228     return result;
  5229  }
  5230  
  5231  static void *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
  5232  {
  5233     stbi__png p;
  5234     p.s = s;
  5235     return stbi__do_png(&p, x,y,comp,req_comp, ri);
  5236  }
  5237  
  5238  static int stbi__png_test(stbi__context *s)
  5239  {
  5240     int r;
  5241     r = stbi__check_png_header(s);
  5242     stbi__rewind(s);
  5243     return r;
  5244  }
  5245  
  5246  static int stbi__png_info_raw(stbi__png *p, int *x, int *y, int *comp)
  5247  {
  5248     if (!stbi__parse_png_file(p, STBI__SCAN_header, 0)) {
  5249        stbi__rewind( p->s );
  5250        return 0;
  5251     }
  5252     if (x) *x = p->s->img_x;
  5253     if (y) *y = p->s->img_y;
  5254     if (comp) *comp = p->s->img_n;
  5255     return 1;
  5256  }
  5257  
  5258  static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp)
  5259  {
  5260     stbi__png p;
  5261     p.s = s;
  5262     return stbi__png_info_raw(&p, x, y, comp);
  5263  }
  5264  
  5265  static int stbi__png_is16(stbi__context *s)
  5266  {
  5267     stbi__png p;
  5268     p.s = s;
  5269     if (!stbi__png_info_raw(&p, NULL, NULL, NULL))
  5270  	   return 0;
  5271     if (p.depth != 16) {
  5272        stbi__rewind(p.s);
  5273        return 0;
  5274     }
  5275     return 1;
  5276  }
  5277  #endif
  5278  
  5279  // Microsoft/Windows BMP image
  5280  
  5281  #ifndef STBI_NO_BMP
  5282  static int stbi__bmp_test_raw(stbi__context *s)
  5283  {
  5284     int r;
  5285     int sz;
  5286     if (stbi__get8(s) != 'B') return 0;
  5287     if (stbi__get8(s) != 'M') return 0;
  5288     stbi__get32le(s); // discard filesize
  5289     stbi__get16le(s); // discard reserved
  5290     stbi__get16le(s); // discard reserved
  5291     stbi__get32le(s); // discard data offset
  5292     sz = stbi__get32le(s);
  5293     r = (sz == 12 || sz == 40 || sz == 56 || sz == 108 || sz == 124);
  5294     return r;
  5295  }
  5296  
  5297  static int stbi__bmp_test(stbi__context *s)
  5298  {
  5299     int r = stbi__bmp_test_raw(s);
  5300     stbi__rewind(s);
  5301     return r;
  5302  }
  5303  
  5304  
  5305  // returns 0..31 for the highest set bit
  5306  static int stbi__high_bit(unsigned int z)
  5307  {
  5308     int n=0;
  5309     if (z == 0) return -1;
  5310     if (z >= 0x10000) { n += 16; z >>= 16; }
  5311     if (z >= 0x00100) { n +=  8; z >>=  8; }
  5312     if (z >= 0x00010) { n +=  4; z >>=  4; }
  5313     if (z >= 0x00004) { n +=  2; z >>=  2; }
  5314     if (z >= 0x00002) { n +=  1;/* >>=  1;*/ }
  5315     return n;
  5316  }
  5317  
  5318  static int stbi__bitcount(unsigned int a)
  5319  {
  5320     a = (a & 0x55555555) + ((a >>  1) & 0x55555555); // max 2
  5321     a = (a & 0x33333333) + ((a >>  2) & 0x33333333); // max 4
  5322     a = (a + (a >> 4)) & 0x0f0f0f0f; // max 8 per 4, now 8 bits
  5323     a = (a + (a >> 8)); // max 16 per 8 bits
  5324     a = (a + (a >> 16)); // max 32 per 8 bits
  5325     return a & 0xff;
  5326  }
  5327  
  5328  // extract an arbitrarily-aligned N-bit value (N=bits)
  5329  // from v, and then make it 8-bits long and fractionally
  5330  // extend it to full full range.
  5331  static int stbi__shiftsigned(unsigned int v, int shift, int bits)
  5332  {
  5333     static unsigned int mul_table[9] = {
  5334        0,
  5335        0xff/*0b11111111*/, 0x55/*0b01010101*/, 0x49/*0b01001001*/, 0x11/*0b00010001*/,
  5336        0x21/*0b00100001*/, 0x41/*0b01000001*/, 0x81/*0b10000001*/, 0x01/*0b00000001*/,
  5337     };
  5338     static unsigned int shift_table[9] = {
  5339        0, 0,0,1,0,2,4,6,0,
  5340     };
  5341     if (shift < 0)
  5342        v <<= -shift;
  5343     else
  5344        v >>= shift;
  5345     STBI_ASSERT(v < 256);
  5346     v >>= (8-bits);
  5347     STBI_ASSERT(bits >= 0 && bits <= 8);
  5348     return (int) ((unsigned) v * mul_table[bits]) >> shift_table[bits];
  5349  }
  5350  
  5351  typedef struct
  5352  {
  5353     int bpp, offset, hsz;
  5354     unsigned int mr,mg,mb,ma, all_a;
  5355     int extra_read;
  5356  } stbi__bmp_data;
  5357  
  5358  static int stbi__bmp_set_mask_defaults(stbi__bmp_data *info, int compress)
  5359  {
  5360     // BI_BITFIELDS specifies masks explicitly, don't override
  5361     if (compress == 3)
  5362        return 1;
  5363  
  5364     if (compress == 0) {
  5365        if (info->bpp == 16) {
  5366           info->mr = 31u << 10;
  5367           info->mg = 31u <<  5;
  5368           info->mb = 31u <<  0;
  5369        } else if (info->bpp == 32) {
  5370           info->mr = 0xffu << 16;
  5371           info->mg = 0xffu <<  8;
  5372           info->mb = 0xffu <<  0;
  5373           info->ma = 0xffu << 24;
  5374           info->all_a = 0; // if all_a is 0 at end, then we loaded alpha channel but it was all 0
  5375        } else {
  5376           // otherwise, use defaults, which is all-0
  5377           info->mr = info->mg = info->mb = info->ma = 0;
  5378        }
  5379        return 1;
  5380     }
  5381     return 0; // error
  5382  }
  5383  
  5384  static void *stbi__bmp_parse_header(stbi__context *s, stbi__bmp_data *info)
  5385  {
  5386     int hsz;
  5387     if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') return stbi__errpuc("not BMP", "Corrupt BMP");
  5388     stbi__get32le(s); // discard filesize
  5389     stbi__get16le(s); // discard reserved
  5390     stbi__get16le(s); // discard reserved
  5391     info->offset = stbi__get32le(s);
  5392     info->hsz = hsz = stbi__get32le(s);
  5393     info->mr = info->mg = info->mb = info->ma = 0;
  5394     info->extra_read = 14;
  5395  
  5396     if (info->offset < 0) return stbi__errpuc("bad BMP", "bad BMP");
  5397  
  5398     if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) return stbi__errpuc("unknown BMP", "BMP type not supported: unknown");
  5399     if (hsz == 12) {
  5400        s->img_x = stbi__get16le(s);
  5401        s->img_y = stbi__get16le(s);
  5402     } else {
  5403        s->img_x = stbi__get32le(s);
  5404        s->img_y = stbi__get32le(s);
  5405     }
  5406     if (stbi__get16le(s) != 1) return stbi__errpuc("bad BMP", "bad BMP");
  5407     info->bpp = stbi__get16le(s);
  5408     if (hsz != 12) {
  5409        int compress = stbi__get32le(s);
  5410        if (compress == 1 || compress == 2) return stbi__errpuc("BMP RLE", "BMP type not supported: RLE");
  5411        if (compress >= 4) return stbi__errpuc("BMP JPEG/PNG", "BMP type not supported: unsupported compression"); // this includes PNG/JPEG modes
  5412        if (compress == 3 && info->bpp != 16 && info->bpp != 32) return stbi__errpuc("bad BMP", "bad BMP"); // bitfields requires 16 or 32 bits/pixel
  5413        stbi__get32le(s); // discard sizeof
  5414        stbi__get32le(s); // discard hres
  5415        stbi__get32le(s); // discard vres
  5416        stbi__get32le(s); // discard colorsused
  5417        stbi__get32le(s); // discard max important
  5418        if (hsz == 40 || hsz == 56) {
  5419           if (hsz == 56) {
  5420              stbi__get32le(s);
  5421              stbi__get32le(s);
  5422              stbi__get32le(s);
  5423              stbi__get32le(s);
  5424           }
  5425           if (info->bpp == 16 || info->bpp == 32) {
  5426              if (compress == 0) {
  5427                 stbi__bmp_set_mask_defaults(info, compress);
  5428              } else if (compress == 3) {
  5429                 info->mr = stbi__get32le(s);
  5430                 info->mg = stbi__get32le(s);
  5431                 info->mb = stbi__get32le(s);
  5432                 info->extra_read += 12;
  5433                 // not documented, but generated by photoshop and handled by mspaint
  5434                 if (info->mr == info->mg && info->mg == info->mb) {
  5435                    // ?!?!?
  5436                    return stbi__errpuc("bad BMP", "bad BMP");
  5437                 }
  5438              } else
  5439                 return stbi__errpuc("bad BMP", "bad BMP");
  5440           }
  5441        } else {
  5442           // V4/V5 header
  5443           int i;
  5444           if (hsz != 108 && hsz != 124)
  5445              return stbi__errpuc("bad BMP", "bad BMP");
  5446           info->mr = stbi__get32le(s);
  5447           info->mg = stbi__get32le(s);
  5448           info->mb = stbi__get32le(s);
  5449           info->ma = stbi__get32le(s);
  5450           if (compress != 3) // override mr/mg/mb unless in BI_BITFIELDS mode, as per docs
  5451              stbi__bmp_set_mask_defaults(info, compress);
  5452           stbi__get32le(s); // discard color space
  5453           for (i=0; i < 12; ++i)
  5454              stbi__get32le(s); // discard color space parameters
  5455           if (hsz == 124) {
  5456              stbi__get32le(s); // discard rendering intent
  5457              stbi__get32le(s); // discard offset of profile data
  5458              stbi__get32le(s); // discard size of profile data
  5459              stbi__get32le(s); // discard reserved
  5460           }
  5461        }
  5462     }
  5463     return (void *) 1;
  5464  }
  5465  
  5466  
  5467  static void *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
  5468  {
  5469     stbi_uc *out;
  5470     unsigned int mr=0,mg=0,mb=0,ma=0, all_a;
  5471     stbi_uc pal[256][4];
  5472     int psize=0,i,j,width;
  5473     int flip_vertically, pad, target;
  5474     stbi__bmp_data info;
  5475     STBI_NOTUSED(ri);
  5476  
  5477     info.all_a = 255;
  5478     if (stbi__bmp_parse_header(s, &info) == NULL)
  5479        return NULL; // error code already set
  5480  
  5481     flip_vertically = ((int) s->img_y) > 0;
  5482     s->img_y = abs((int) s->img_y);
  5483  
  5484     if (s->img_y > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
  5485     if (s->img_x > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
  5486  
  5487     mr = info.mr;
  5488     mg = info.mg;
  5489     mb = info.mb;
  5490     ma = info.ma;
  5491     all_a = info.all_a;
  5492  
  5493     if (info.hsz == 12) {
  5494        if (info.bpp < 24)
  5495           psize = (info.offset - info.extra_read - 24) / 3;
  5496     } else {
  5497        if (info.bpp < 16)
  5498           psize = (info.offset - info.extra_read - info.hsz) >> 2;
  5499     }
  5500     if (psize == 0) {
  5501        if (info.offset != s->callback_already_read + (s->img_buffer - s->img_buffer_original)) {
  5502          return stbi__errpuc("bad offset", "Corrupt BMP");
  5503        }
  5504     }
  5505  
  5506     if (info.bpp == 24 && ma == 0xff000000)
  5507        s->img_n = 3;
  5508     else
  5509        s->img_n = ma ? 4 : 3;
  5510     if (req_comp && req_comp >= 3) // we can directly decode 3 or 4
  5511        target = req_comp;
  5512     else
  5513        target = s->img_n; // if they want monochrome, we'll post-convert
  5514  
  5515     // sanity-check size
  5516     if (!stbi__mad3sizes_valid(target, s->img_x, s->img_y, 0))
  5517        return stbi__errpuc("too large", "Corrupt BMP");
  5518  
  5519     out = (stbi_uc *) stbi__malloc_mad3(target, s->img_x, s->img_y, 0);
  5520     if (!out) return stbi__errpuc("outofmem", "Out of memory");
  5521     if (info.bpp < 16) {
  5522        int z=0;
  5523        if (psize == 0 || psize > 256) { STBI_FREE(out); return stbi__errpuc("invalid", "Corrupt BMP"); }
  5524        for (i=0; i < psize; ++i) {
  5525           pal[i][2] = stbi__get8(s);
  5526           pal[i][1] = stbi__get8(s);
  5527           pal[i][0] = stbi__get8(s);
  5528           if (info.hsz != 12) stbi__get8(s);
  5529           pal[i][3] = 255;
  5530        }
  5531        stbi__skip(s, info.offset - info.extra_read - info.hsz - psize * (info.hsz == 12 ? 3 : 4));
  5532        if (info.bpp == 1) width = (s->img_x + 7) >> 3;
  5533        else if (info.bpp == 4) width = (s->img_x + 1) >> 1;
  5534        else if (info.bpp == 8) width = s->img_x;
  5535        else { STBI_FREE(out); return stbi__errpuc("bad bpp", "Corrupt BMP"); }
  5536        pad = (-width)&3;
  5537        if (info.bpp == 1) {
  5538           for (j=0; j < (int) s->img_y; ++j) {
  5539              int bit_offset = 7, v = stbi__get8(s);
  5540              for (i=0; i < (int) s->img_x; ++i) {
  5541                 int color = (v>>bit_offset)&0x1;
  5542                 out[z++] = pal[color][0];
  5543                 out[z++] = pal[color][1];
  5544                 out[z++] = pal[color][2];
  5545                 if (target == 4) out[z++] = 255;
  5546                 if (i+1 == (int) s->img_x) break;
  5547                 if((--bit_offset) < 0) {
  5548                    bit_offset = 7;
  5549                    v = stbi__get8(s);
  5550                 }
  5551              }
  5552              stbi__skip(s, pad);
  5553           }
  5554        } else {
  5555           for (j=0; j < (int) s->img_y; ++j) {
  5556              for (i=0; i < (int) s->img_x; i += 2) {
  5557                 int v=stbi__get8(s),v2=0;
  5558                 if (info.bpp == 4) {
  5559                    v2 = v & 15;
  5560                    v >>= 4;
  5561                 }
  5562                 out[z++] = pal[v][0];
  5563                 out[z++] = pal[v][1];
  5564                 out[z++] = pal[v][2];
  5565                 if (target == 4) out[z++] = 255;
  5566                 if (i+1 == (int) s->img_x) break;
  5567                 v = (info.bpp == 8) ? stbi__get8(s) : v2;
  5568                 out[z++] = pal[v][0];
  5569                 out[z++] = pal[v][1];
  5570                 out[z++] = pal[v][2];
  5571                 if (target == 4) out[z++] = 255;
  5572              }
  5573              stbi__skip(s, pad);
  5574           }
  5575        }
  5576     } else {
  5577        int rshift=0,gshift=0,bshift=0,ashift=0,rcount=0,gcount=0,bcount=0,acount=0;
  5578        int z = 0;
  5579        int easy=0;
  5580        stbi__skip(s, info.offset - info.extra_read - info.hsz);
  5581        if (info.bpp == 24) width = 3 * s->img_x;
  5582        else if (info.bpp == 16) width = 2*s->img_x;
  5583        else /* bpp = 32 and pad = 0 */ width=0;
  5584        pad = (-width) & 3;
  5585        if (info.bpp == 24) {
  5586           easy = 1;
  5587        } else if (info.bpp == 32) {
  5588           if (mb == 0xff && mg == 0xff00 && mr == 0x00ff0000 && ma == 0xff000000)
  5589              easy = 2;
  5590        }
  5591        if (!easy) {
  5592           if (!mr || !mg || !mb) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
  5593           // right shift amt to put high bit in position #7
  5594           rshift = stbi__high_bit(mr)-7; rcount = stbi__bitcount(mr);
  5595           gshift = stbi__high_bit(mg)-7; gcount = stbi__bitcount(mg);
  5596           bshift = stbi__high_bit(mb)-7; bcount = stbi__bitcount(mb);
  5597           ashift = stbi__high_bit(ma)-7; acount = stbi__bitcount(ma);
  5598           if (rcount > 8 || gcount > 8 || bcount > 8 || acount > 8) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
  5599        }
  5600        for (j=0; j < (int) s->img_y; ++j) {
  5601           if (easy) {
  5602              for (i=0; i < (int) s->img_x; ++i) {
  5603                 unsigned char a;
  5604                 out[z+2] = stbi__get8(s);
  5605                 out[z+1] = stbi__get8(s);
  5606                 out[z+0] = stbi__get8(s);
  5607                 z += 3;
  5608                 a = (easy == 2 ? stbi__get8(s) : 255);
  5609                 all_a |= a;
  5610                 if (target == 4) out[z++] = a;
  5611              }
  5612           } else {
  5613              int bpp = info.bpp;
  5614              for (i=0; i < (int) s->img_x; ++i) {
  5615                 stbi__uint32 v = (bpp == 16 ? (stbi__uint32) stbi__get16le(s) : stbi__get32le(s));
  5616                 unsigned int a;
  5617                 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mr, rshift, rcount));
  5618                 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mg, gshift, gcount));
  5619                 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mb, bshift, bcount));
  5620                 a = (ma ? stbi__shiftsigned(v & ma, ashift, acount) : 255);
  5621                 all_a |= a;
  5622                 if (target == 4) out[z++] = STBI__BYTECAST(a);
  5623              }
  5624           }
  5625           stbi__skip(s, pad);
  5626        }
  5627     }
  5628  
  5629     // if alpha channel is all 0s, replace with all 255s
  5630     if (target == 4 && all_a == 0)
  5631        for (i=4*s->img_x*s->img_y-1; i >= 0; i -= 4)
  5632           out[i] = 255;
  5633  
  5634     if (flip_vertically) {
  5635        stbi_uc t;
  5636        for (j=0; j < (int) s->img_y>>1; ++j) {
  5637           stbi_uc *p1 = out +      j     *s->img_x*target;
  5638           stbi_uc *p2 = out + (s->img_y-1-j)*s->img_x*target;
  5639           for (i=0; i < (int) s->img_x*target; ++i) {
  5640              t = p1[i]; p1[i] = p2[i]; p2[i] = t;
  5641           }
  5642        }
  5643     }
  5644  
  5645     if (req_comp && req_comp != target) {
  5646        out = stbi__convert_format(out, target, req_comp, s->img_x, s->img_y);
  5647        if (out == NULL) return out; // stbi__convert_format frees input on failure
  5648     }
  5649  
  5650     *x = s->img_x;
  5651     *y = s->img_y;
  5652     if (comp) *comp = s->img_n;
  5653     return out;
  5654  }
  5655  #endif
  5656  
  5657  // Targa Truevision - TGA
  5658  // by Jonathan Dummer
  5659  #ifndef STBI_NO_TGA
  5660  // returns STBI_rgb or whatever, 0 on error
  5661  static int stbi__tga_get_comp(int bits_per_pixel, int is_grey, int* is_rgb16)
  5662  {
  5663     // only RGB or RGBA (incl. 16bit) or grey allowed
  5664     if (is_rgb16) *is_rgb16 = 0;
  5665     switch(bits_per_pixel) {
  5666        case 8:  return STBI_grey;
  5667        case 16: if(is_grey) return STBI_grey_alpha;
  5668                 // fallthrough
  5669        case 15: if(is_rgb16) *is_rgb16 = 1;
  5670                 return STBI_rgb;
  5671        case 24: // fallthrough
  5672        case 32: return bits_per_pixel/8;
  5673        default: return 0;
  5674     }
  5675  }
  5676  
  5677  static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp)
  5678  {
  5679      int tga_w, tga_h, tga_comp, tga_image_type, tga_bits_per_pixel, tga_colormap_bpp;
  5680      int sz, tga_colormap_type;
  5681      stbi__get8(s);                   // discard Offset
  5682      tga_colormap_type = stbi__get8(s); // colormap type
  5683      if( tga_colormap_type > 1 ) {
  5684          stbi__rewind(s);
  5685          return 0;      // only RGB or indexed allowed
  5686      }
  5687      tga_image_type = stbi__get8(s); // image type
  5688      if ( tga_colormap_type == 1 ) { // colormapped (paletted) image
  5689          if (tga_image_type != 1 && tga_image_type != 9) {
  5690              stbi__rewind(s);
  5691              return 0;
  5692          }
  5693          stbi__skip(s,4);       // skip index of first colormap entry and number of entries
  5694          sz = stbi__get8(s);    //   check bits per palette color entry
  5695          if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) {
  5696              stbi__rewind(s);
  5697              return 0;
  5698          }
  5699          stbi__skip(s,4);       // skip image x and y origin
  5700          tga_colormap_bpp = sz;
  5701      } else { // "normal" image w/o colormap - only RGB or grey allowed, +/- RLE
  5702          if ( (tga_image_type != 2) && (tga_image_type != 3) && (tga_image_type != 10) && (tga_image_type != 11) ) {
  5703              stbi__rewind(s);
  5704              return 0; // only RGB or grey allowed, +/- RLE
  5705          }
  5706          stbi__skip(s,9); // skip colormap specification and image x/y origin
  5707          tga_colormap_bpp = 0;
  5708      }
  5709      tga_w = stbi__get16le(s);
  5710      if( tga_w < 1 ) {
  5711          stbi__rewind(s);
  5712          return 0;   // test width
  5713      }
  5714      tga_h = stbi__get16le(s);
  5715      if( tga_h < 1 ) {
  5716          stbi__rewind(s);
  5717          return 0;   // test height
  5718      }
  5719      tga_bits_per_pixel = stbi__get8(s); // bits per pixel
  5720      stbi__get8(s); // ignore alpha bits
  5721      if (tga_colormap_bpp != 0) {
  5722          if((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16)) {
  5723              // when using a colormap, tga_bits_per_pixel is the size of the indexes
  5724              // I don't think anything but 8 or 16bit indexes makes sense
  5725              stbi__rewind(s);
  5726              return 0;
  5727          }
  5728          tga_comp = stbi__tga_get_comp(tga_colormap_bpp, 0, NULL);
  5729      } else {
  5730          tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3) || (tga_image_type == 11), NULL);
  5731      }
  5732      if(!tga_comp) {
  5733        stbi__rewind(s);
  5734        return 0;
  5735      }
  5736      if (x) *x = tga_w;
  5737      if (y) *y = tga_h;
  5738      if (comp) *comp = tga_comp;
  5739      return 1;                   // seems to have passed everything
  5740  }
  5741  
  5742  static int stbi__tga_test(stbi__context *s)
  5743  {
  5744     int res = 0;
  5745     int sz, tga_color_type;
  5746     stbi__get8(s);      //   discard Offset
  5747     tga_color_type = stbi__get8(s);   //   color type
  5748     if ( tga_color_type > 1 ) goto errorEnd;   //   only RGB or indexed allowed
  5749     sz = stbi__get8(s);   //   image type
  5750     if ( tga_color_type == 1 ) { // colormapped (paletted) image
  5751        if (sz != 1 && sz != 9) goto errorEnd; // colortype 1 demands image type 1 or 9
  5752        stbi__skip(s,4);       // skip index of first colormap entry and number of entries
  5753        sz = stbi__get8(s);    //   check bits per palette color entry
  5754        if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
  5755        stbi__skip(s,4);       // skip image x and y origin
  5756     } else { // "normal" image w/o colormap
  5757        if ( (sz != 2) && (sz != 3) && (sz != 10) && (sz != 11) ) goto errorEnd; // only RGB or grey allowed, +/- RLE
  5758        stbi__skip(s,9); // skip colormap specification and image x/y origin
  5759     }
  5760     if ( stbi__get16le(s) < 1 ) goto errorEnd;      //   test width
  5761     if ( stbi__get16le(s) < 1 ) goto errorEnd;      //   test height
  5762     sz = stbi__get8(s);   //   bits per pixel
  5763     if ( (tga_color_type == 1) && (sz != 8) && (sz != 16) ) goto errorEnd; // for colormapped images, bpp is size of an index
  5764     if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
  5765  
  5766     res = 1; // if we got this far, everything's good and we can return 1 instead of 0
  5767  
  5768  errorEnd:
  5769     stbi__rewind(s);
  5770     return res;
  5771  }
  5772  
  5773  // read 16bit value and convert to 24bit RGB
  5774  static void stbi__tga_read_rgb16(stbi__context *s, stbi_uc* out)
  5775  {
  5776     stbi__uint16 px = (stbi__uint16)stbi__get16le(s);
  5777     stbi__uint16 fiveBitMask = 31;
  5778     // we have 3 channels with 5bits each
  5779     int r = (px >> 10) & fiveBitMask;
  5780     int g = (px >> 5) & fiveBitMask;
  5781     int b = px & fiveBitMask;
  5782     // Note that this saves the data in RGB(A) order, so it doesn't need to be swapped later
  5783     out[0] = (stbi_uc)((r * 255)/31);
  5784     out[1] = (stbi_uc)((g * 255)/31);
  5785     out[2] = (stbi_uc)((b * 255)/31);
  5786  
  5787     // some people claim that the most significant bit might be used for alpha
  5788     // (possibly if an alpha-bit is set in the "image descriptor byte")
  5789     // but that only made 16bit test images completely translucent..
  5790     // so let's treat all 15 and 16bit TGAs as RGB with no alpha.
  5791  }
  5792  
  5793  static void *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
  5794  {
  5795     //   read in the TGA header stuff
  5796     int tga_offset = stbi__get8(s);
  5797     int tga_indexed = stbi__get8(s);
  5798     int tga_image_type = stbi__get8(s);
  5799     int tga_is_RLE = 0;
  5800     int tga_palette_start = stbi__get16le(s);
  5801     int tga_palette_len = stbi__get16le(s);
  5802     int tga_palette_bits = stbi__get8(s);
  5803     int tga_x_origin = stbi__get16le(s);
  5804     int tga_y_origin = stbi__get16le(s);
  5805     int tga_width = stbi__get16le(s);
  5806     int tga_height = stbi__get16le(s);
  5807     int tga_bits_per_pixel = stbi__get8(s);
  5808     int tga_comp, tga_rgb16=0;
  5809     int tga_inverted = stbi__get8(s);
  5810     // int tga_alpha_bits = tga_inverted & 15; // the 4 lowest bits - unused (useless?)
  5811     //   image data
  5812     unsigned char *tga_data;
  5813     unsigned char *tga_palette = NULL;
  5814     int i, j;
  5815     unsigned char raw_data[4] = {0};
  5816     int RLE_count = 0;
  5817     int RLE_repeating = 0;
  5818     int read_next_pixel = 1;
  5819     STBI_NOTUSED(ri);
  5820     STBI_NOTUSED(tga_x_origin); // @TODO
  5821     STBI_NOTUSED(tga_y_origin); // @TODO
  5822  
  5823     if (tga_height > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
  5824     if (tga_width > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
  5825  
  5826     //   do a tiny bit of precessing
  5827     if ( tga_image_type >= 8 )
  5828     {
  5829        tga_image_type -= 8;
  5830        tga_is_RLE = 1;
  5831     }
  5832     tga_inverted = 1 - ((tga_inverted >> 5) & 1);
  5833  
  5834     //   If I'm paletted, then I'll use the number of bits from the palette
  5835     if ( tga_indexed ) tga_comp = stbi__tga_get_comp(tga_palette_bits, 0, &tga_rgb16);
  5836     else tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3), &tga_rgb16);
  5837  
  5838     if(!tga_comp) // shouldn't really happen, stbi__tga_test() should have ensured basic consistency
  5839        return stbi__errpuc("bad format", "Can't find out TGA pixelformat");
  5840  
  5841     //   tga info
  5842     *x = tga_width;
  5843     *y = tga_height;
  5844     if (comp) *comp = tga_comp;
  5845  
  5846     if (!stbi__mad3sizes_valid(tga_width, tga_height, tga_comp, 0))
  5847        return stbi__errpuc("too large", "Corrupt TGA");
  5848  
  5849     tga_data = (unsigned char*)stbi__malloc_mad3(tga_width, tga_height, tga_comp, 0);
  5850     if (!tga_data) return stbi__errpuc("outofmem", "Out of memory");
  5851  
  5852     // skip to the data's starting position (offset usually = 0)
  5853     stbi__skip(s, tga_offset );
  5854  
  5855     if ( !tga_indexed && !tga_is_RLE && !tga_rgb16 ) {
  5856        for (i=0; i < tga_height; ++i) {
  5857           int row = tga_inverted ? tga_height -i - 1 : i;
  5858           stbi_uc *tga_row = tga_data + row*tga_width*tga_comp;
  5859           stbi__getn(s, tga_row, tga_width * tga_comp);
  5860        }
  5861     } else  {
  5862        //   do I need to load a palette?
  5863        if ( tga_indexed)
  5864        {
  5865           if (tga_palette_len == 0) {  /* you have to have at least one entry! */
  5866              STBI_FREE(tga_data);
  5867              return stbi__errpuc("bad palette", "Corrupt TGA");
  5868           }
  5869  
  5870           //   any data to skip? (offset usually = 0)
  5871           stbi__skip(s, tga_palette_start );
  5872           //   load the palette
  5873           tga_palette = (unsigned char*)stbi__malloc_mad2(tga_palette_len, tga_comp, 0);
  5874           if (!tga_palette) {
  5875              STBI_FREE(tga_data);
  5876              return stbi__errpuc("outofmem", "Out of memory");
  5877           }
  5878           if (tga_rgb16) {
  5879              stbi_uc *pal_entry = tga_palette;
  5880              STBI_ASSERT(tga_comp == STBI_rgb);
  5881              for (i=0; i < tga_palette_len; ++i) {
  5882                 stbi__tga_read_rgb16(s, pal_entry);
  5883                 pal_entry += tga_comp;
  5884              }
  5885           } else if (!stbi__getn(s, tga_palette, tga_palette_len * tga_comp)) {
  5886                 STBI_FREE(tga_data);
  5887                 STBI_FREE(tga_palette);
  5888                 return stbi__errpuc("bad palette", "Corrupt TGA");
  5889           }
  5890        }
  5891        //   load the data
  5892        for (i=0; i < tga_width * tga_height; ++i)
  5893        {
  5894           //   if I'm in RLE mode, do I need to get a RLE stbi__pngchunk?
  5895           if ( tga_is_RLE )
  5896           {
  5897              if ( RLE_count == 0 )
  5898              {
  5899                 //   yep, get the next byte as a RLE command
  5900                 int RLE_cmd = stbi__get8(s);
  5901                 RLE_count = 1 + (RLE_cmd & 127);
  5902                 RLE_repeating = RLE_cmd >> 7;
  5903                 read_next_pixel = 1;
  5904              } else if ( !RLE_repeating )
  5905              {
  5906                 read_next_pixel = 1;
  5907              }
  5908           } else
  5909           {
  5910              read_next_pixel = 1;
  5911           }
  5912           //   OK, if I need to read a pixel, do it now
  5913           if ( read_next_pixel )
  5914           {
  5915              //   load however much data we did have
  5916              if ( tga_indexed )
  5917              {
  5918                 // read in index, then perform the lookup
  5919                 int pal_idx = (tga_bits_per_pixel == 8) ? stbi__get8(s) : stbi__get16le(s);
  5920                 if ( pal_idx >= tga_palette_len ) {
  5921                    // invalid index
  5922                    pal_idx = 0;
  5923                 }
  5924                 pal_idx *= tga_comp;
  5925                 for (j = 0; j < tga_comp; ++j) {
  5926                    raw_data[j] = tga_palette[pal_idx+j];
  5927                 }
  5928              } else if(tga_rgb16) {
  5929                 STBI_ASSERT(tga_comp == STBI_rgb);
  5930                 stbi__tga_read_rgb16(s, raw_data);
  5931              } else {
  5932                 //   read in the data raw
  5933                 for (j = 0; j < tga_comp; ++j) {
  5934                    raw_data[j] = stbi__get8(s);
  5935                 }
  5936              }
  5937              //   clear the reading flag for the next pixel
  5938              read_next_pixel = 0;
  5939           } // end of reading a pixel
  5940  
  5941           // copy data
  5942           for (j = 0; j < tga_comp; ++j)
  5943             tga_data[i*tga_comp+j] = raw_data[j];
  5944  
  5945           //   in case we're in RLE mode, keep counting down
  5946           --RLE_count;
  5947        }
  5948        //   do I need to invert the image?
  5949        if ( tga_inverted )
  5950        {
  5951           for (j = 0; j*2 < tga_height; ++j)
  5952           {
  5953              int index1 = j * tga_width * tga_comp;
  5954              int index2 = (tga_height - 1 - j) * tga_width * tga_comp;
  5955              for (i = tga_width * tga_comp; i > 0; --i)
  5956              {
  5957                 unsigned char temp = tga_data[index1];
  5958                 tga_data[index1] = tga_data[index2];
  5959                 tga_data[index2] = temp;
  5960                 ++index1;
  5961                 ++index2;
  5962              }
  5963           }
  5964        }
  5965        //   clear my palette, if I had one
  5966        if ( tga_palette != NULL )
  5967        {
  5968           STBI_FREE( tga_palette );
  5969        }
  5970     }
  5971  
  5972     // swap RGB - if the source data was RGB16, it already is in the right order
  5973     if (tga_comp >= 3 && !tga_rgb16)
  5974     {
  5975        unsigned char* tga_pixel = tga_data;
  5976        for (i=0; i < tga_width * tga_height; ++i)
  5977        {
  5978           unsigned char temp = tga_pixel[0];
  5979           tga_pixel[0] = tga_pixel[2];
  5980           tga_pixel[2] = temp;
  5981           tga_pixel += tga_comp;
  5982        }
  5983     }
  5984  
  5985     // convert to target component count
  5986     if (req_comp && req_comp != tga_comp)
  5987        tga_data = stbi__convert_format(tga_data, tga_comp, req_comp, tga_width, tga_height);
  5988  
  5989     //   the things I do to get rid of an error message, and yet keep
  5990     //   Microsoft's C compilers happy... [8^(
  5991     tga_palette_start = tga_palette_len = tga_palette_bits =
  5992           tga_x_origin = tga_y_origin = 0;
  5993     STBI_NOTUSED(tga_palette_start);
  5994     //   OK, done
  5995     return tga_data;
  5996  }
  5997  #endif
  5998  
  5999  // *************************************************************************************************
  6000  // Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicolas Schulz, tweaked by STB
  6001  
  6002  #ifndef STBI_NO_PSD
  6003  static int stbi__psd_test(stbi__context *s)
  6004  {
  6005     int r = (stbi__get32be(s) == 0x38425053);
  6006     stbi__rewind(s);
  6007     return r;
  6008  }
  6009  
  6010  static int stbi__psd_decode_rle(stbi__context *s, stbi_uc *p, int pixelCount)
  6011  {
  6012     int count, nleft, len;
  6013  
  6014     count = 0;
  6015     while ((nleft = pixelCount - count) > 0) {
  6016        len = stbi__get8(s);
  6017        if (len == 128) {
  6018           // No-op.
  6019        } else if (len < 128) {
  6020           // Copy next len+1 bytes literally.
  6021           len++;
  6022           if (len > nleft) return 0; // corrupt data
  6023           count += len;
  6024           while (len) {
  6025              *p = stbi__get8(s);
  6026              p += 4;
  6027              len--;
  6028           }
  6029        } else if (len > 128) {
  6030           stbi_uc   val;
  6031           // Next -len+1 bytes in the dest are replicated from next source byte.
  6032           // (Interpret len as a negative 8-bit int.)
  6033           len = 257 - len;
  6034           if (len > nleft) return 0; // corrupt data
  6035           val = stbi__get8(s);
  6036           count += len;
  6037           while (len) {
  6038              *p = val;
  6039              p += 4;
  6040              len--;
  6041           }
  6042        }
  6043     }
  6044  
  6045     return 1;
  6046  }
  6047  
  6048  static void *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri, int bpc)
  6049  {
  6050     int pixelCount;
  6051     int channelCount, compression;
  6052     int channel, i;
  6053     int bitdepth;
  6054     int w,h;
  6055     stbi_uc *out;
  6056     STBI_NOTUSED(ri);
  6057  
  6058     // Check identifier
  6059     if (stbi__get32be(s) != 0x38425053)   // "8BPS"
  6060        return stbi__errpuc("not PSD", "Corrupt PSD image");
  6061  
  6062     // Check file type version.
  6063     if (stbi__get16be(s) != 1)
  6064        return stbi__errpuc("wrong version", "Unsupported version of PSD image");
  6065  
  6066     // Skip 6 reserved bytes.
  6067     stbi__skip(s, 6 );
  6068  
  6069     // Read the number of channels (R, G, B, A, etc).
  6070     channelCount = stbi__get16be(s);
  6071     if (channelCount < 0 || channelCount > 16)
  6072        return stbi__errpuc("wrong channel count", "Unsupported number of channels in PSD image");
  6073  
  6074     // Read the rows and columns of the image.
  6075     h = stbi__get32be(s);
  6076     w = stbi__get32be(s);
  6077  
  6078     if (h > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
  6079     if (w > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
  6080  
  6081     // Make sure the depth is 8 bits.
  6082     bitdepth = stbi__get16be(s);
  6083     if (bitdepth != 8 && bitdepth != 16)
  6084        return stbi__errpuc("unsupported bit depth", "PSD bit depth is not 8 or 16 bit");
  6085  
  6086     // Make sure the color mode is RGB.
  6087     // Valid options are:
  6088     //   0: Bitmap
  6089     //   1: Grayscale
  6090     //   2: Indexed color
  6091     //   3: RGB color
  6092     //   4: CMYK color
  6093     //   7: Multichannel
  6094     //   8: Duotone
  6095     //   9: Lab color
  6096     if (stbi__get16be(s) != 3)
  6097        return stbi__errpuc("wrong color format", "PSD is not in RGB color format");
  6098  
  6099     // Skip the Mode Data.  (It's the palette for indexed color; other info for other modes.)
  6100     stbi__skip(s,stbi__get32be(s) );
  6101  
  6102     // Skip the image resources.  (resolution, pen tool paths, etc)
  6103     stbi__skip(s, stbi__get32be(s) );
  6104  
  6105     // Skip the reserved data.
  6106     stbi__skip(s, stbi__get32be(s) );
  6107  
  6108     // Find out if the data is compressed.
  6109     // Known values:
  6110     //   0: no compression
  6111     //   1: RLE compressed
  6112     compression = stbi__get16be(s);
  6113     if (compression > 1)
  6114        return stbi__errpuc("bad compression", "PSD has an unknown compression format");
  6115  
  6116     // Check size
  6117     if (!stbi__mad3sizes_valid(4, w, h, 0))
  6118        return stbi__errpuc("too large", "Corrupt PSD");
  6119  
  6120     // Create the destination image.
  6121  
  6122     if (!compression && bitdepth == 16 && bpc == 16) {
  6123        out = (stbi_uc *) stbi__malloc_mad3(8, w, h, 0);
  6124        ri->bits_per_channel = 16;
  6125     } else
  6126        out = (stbi_uc *) stbi__malloc(4 * w*h);
  6127  
  6128     if (!out) return stbi__errpuc("outofmem", "Out of memory");
  6129     pixelCount = w*h;
  6130  
  6131     // Initialize the data to zero.
  6132     //memset( out, 0, pixelCount * 4 );
  6133  
  6134     // Finally, the image data.
  6135     if (compression) {
  6136        // RLE as used by .PSD and .TIFF
  6137        // Loop until you get the number of unpacked bytes you are expecting:
  6138        //     Read the next source byte into n.
  6139        //     If n is between 0 and 127 inclusive, copy the next n+1 bytes literally.
  6140        //     Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times.
  6141        //     Else if n is 128, noop.
  6142        // Endloop
  6143  
  6144        // The RLE-compressed data is preceded by a 2-byte data count for each row in the data,
  6145        // which we're going to just skip.
  6146        stbi__skip(s, h * channelCount * 2 );
  6147  
  6148        // Read the RLE data by channel.
  6149        for (channel = 0; channel < 4; channel++) {
  6150           stbi_uc *p;
  6151  
  6152           p = out+channel;
  6153           if (channel >= channelCount) {
  6154              // Fill this channel with default data.
  6155              for (i = 0; i < pixelCount; i++, p += 4)
  6156                 *p = (channel == 3 ? 255 : 0);
  6157           } else {
  6158              // Read the RLE data.
  6159              if (!stbi__psd_decode_rle(s, p, pixelCount)) {
  6160                 STBI_FREE(out);
  6161                 return stbi__errpuc("corrupt", "bad RLE data");
  6162              }
  6163           }
  6164        }
  6165  
  6166     } else {
  6167        // We're at the raw image data.  It's each channel in order (Red, Green, Blue, Alpha, ...)
  6168        // where each channel consists of an 8-bit (or 16-bit) value for each pixel in the image.
  6169  
  6170        // Read the data by channel.
  6171        for (channel = 0; channel < 4; channel++) {
  6172           if (channel >= channelCount) {
  6173              // Fill this channel with default data.
  6174              if (bitdepth == 16 && bpc == 16) {
  6175                 stbi__uint16 *q = ((stbi__uint16 *) out) + channel;
  6176                 stbi__uint16 val = channel == 3 ? 65535 : 0;
  6177                 for (i = 0; i < pixelCount; i++, q += 4)
  6178                    *q = val;
  6179              } else {
  6180                 stbi_uc *p = out+channel;
  6181                 stbi_uc val = channel == 3 ? 255 : 0;
  6182                 for (i = 0; i < pixelCount; i++, p += 4)
  6183                    *p = val;
  6184              }
  6185           } else {
  6186              if (ri->bits_per_channel == 16) {    // output bpc
  6187                 stbi__uint16 *q = ((stbi__uint16 *) out) + channel;
  6188                 for (i = 0; i < pixelCount; i++, q += 4)
  6189                    *q = (stbi__uint16) stbi__get16be(s);
  6190              } else {
  6191                 stbi_uc *p = out+channel;
  6192                 if (bitdepth == 16) {  // input bpc
  6193                    for (i = 0; i < pixelCount; i++, p += 4)
  6194                       *p = (stbi_uc) (stbi__get16be(s) >> 8);
  6195                 } else {
  6196                    for (i = 0; i < pixelCount; i++, p += 4)
  6197                       *p = stbi__get8(s);
  6198                 }
  6199              }
  6200           }
  6201        }
  6202     }
  6203  
  6204     // remove weird white matte from PSD
  6205     if (channelCount >= 4) {
  6206        if (ri->bits_per_channel == 16) {
  6207           for (i=0; i < w*h; ++i) {
  6208              stbi__uint16 *pixel = (stbi__uint16 *) out + 4*i;
  6209              if (pixel[3] != 0 && pixel[3] != 65535) {
  6210                 float a = pixel[3] / 65535.0f;
  6211                 float ra = 1.0f / a;
  6212                 float inv_a = 65535.0f * (1 - ra);
  6213                 pixel[0] = (stbi__uint16) (pixel[0]*ra + inv_a);
  6214                 pixel[1] = (stbi__uint16) (pixel[1]*ra + inv_a);
  6215                 pixel[2] = (stbi__uint16) (pixel[2]*ra + inv_a);
  6216              }
  6217           }
  6218        } else {
  6219           for (i=0; i < w*h; ++i) {
  6220              unsigned char *pixel = out + 4*i;
  6221              if (pixel[3] != 0 && pixel[3] != 255) {
  6222                 float a = pixel[3] / 255.0f;
  6223                 float ra = 1.0f / a;
  6224                 float inv_a = 255.0f * (1 - ra);
  6225                 pixel[0] = (unsigned char) (pixel[0]*ra + inv_a);
  6226                 pixel[1] = (unsigned char) (pixel[1]*ra + inv_a);
  6227                 pixel[2] = (unsigned char) (pixel[2]*ra + inv_a);
  6228              }
  6229           }
  6230        }
  6231     }
  6232  
  6233     // convert to desired output format
  6234     if (req_comp && req_comp != 4) {
  6235        if (ri->bits_per_channel == 16)
  6236           out = (stbi_uc *) stbi__convert_format16((stbi__uint16 *) out, 4, req_comp, w, h);
  6237        else
  6238           out = stbi__convert_format(out, 4, req_comp, w, h);
  6239        if (out == NULL) return out; // stbi__convert_format frees input on failure
  6240     }
  6241  
  6242     if (comp) *comp = 4;
  6243     *y = h;
  6244     *x = w;
  6245  
  6246     return out;
  6247  }
  6248  #endif
  6249  
  6250  // *************************************************************************************************
  6251  // Softimage PIC loader
  6252  // by Tom Seddon
  6253  //
  6254  // See http://softimage.wiki.softimage.com/index.php/INFO:_PIC_file_format
  6255  // See http://ozviz.wasp.uwa.edu.au/~pbourke/dataformats/softimagepic/
  6256  
  6257  #ifndef STBI_NO_PIC
  6258  static int stbi__pic_is4(stbi__context *s,const char *str)
  6259  {
  6260     int i;
  6261     for (i=0; i<4; ++i)
  6262        if (stbi__get8(s) != (stbi_uc)str[i])
  6263           return 0;
  6264  
  6265     return 1;
  6266  }
  6267  
  6268  static int stbi__pic_test_core(stbi__context *s)
  6269  {
  6270     int i;
  6271  
  6272     if (!stbi__pic_is4(s,"\x53\x80\xF6\x34"))
  6273        return 0;
  6274  
  6275     for(i=0;i<84;++i)
  6276        stbi__get8(s);
  6277  
  6278     if (!stbi__pic_is4(s,"PICT"))
  6279        return 0;
  6280  
  6281     return 1;
  6282  }
  6283  
  6284  typedef struct
  6285  {
  6286     stbi_uc size,type,channel;
  6287  } stbi__pic_packet;
  6288  
  6289  static stbi_uc *stbi__readval(stbi__context *s, int channel, stbi_uc *dest)
  6290  {
  6291     int mask=0x80, i;
  6292  
  6293     for (i=0; i<4; ++i, mask>>=1) {
  6294        if (channel & mask) {
  6295           if (stbi__at_eof(s)) return stbi__errpuc("bad file","PIC file too short");
  6296           dest[i]=stbi__get8(s);
  6297        }
  6298     }
  6299  
  6300     return dest;
  6301  }
  6302  
  6303  static void stbi__copyval(int channel,stbi_uc *dest,const stbi_uc *src)
  6304  {
  6305     int mask=0x80,i;
  6306  
  6307     for (i=0;i<4; ++i, mask>>=1)
  6308        if (channel&mask)
  6309           dest[i]=src[i];
  6310  }
  6311  
  6312  static stbi_uc *stbi__pic_load_core(stbi__context *s,int width,int height,int *comp, stbi_uc *result)
  6313  {
  6314     int act_comp=0,num_packets=0,y,chained;
  6315     stbi__pic_packet packets[10];
  6316  
  6317     // this will (should...) cater for even some bizarre stuff like having data
  6318      // for the same channel in multiple packets.
  6319     do {
  6320        stbi__pic_packet *packet;
  6321  
  6322        if (num_packets==sizeof(packets)/sizeof(packets[0]))
  6323           return stbi__errpuc("bad format","too many packets");
  6324  
  6325        packet = &packets[num_packets++];
  6326  
  6327        chained = stbi__get8(s);
  6328        packet->size    = stbi__get8(s);
  6329        packet->type    = stbi__get8(s);
  6330        packet->channel = stbi__get8(s);
  6331  
  6332        act_comp |= packet->channel;
  6333  
  6334        if (stbi__at_eof(s))          return stbi__errpuc("bad file","file too short (reading packets)");
  6335        if (packet->size != 8)  return stbi__errpuc("bad format","packet isn't 8bpp");
  6336     } while (chained);
  6337  
  6338     *comp = (act_comp & 0x10 ? 4 : 3); // has alpha channel?
  6339  
  6340     for(y=0; y<height; ++y) {
  6341        int packet_idx;
  6342  
  6343        for(packet_idx=0; packet_idx < num_packets; ++packet_idx) {
  6344           stbi__pic_packet *packet = &packets[packet_idx];
  6345           stbi_uc *dest = result+y*width*4;
  6346  
  6347           switch (packet->type) {
  6348              default:
  6349                 return stbi__errpuc("bad format","packet has bad compression type");
  6350  
  6351              case 0: {//uncompressed
  6352                 int x;
  6353  
  6354                 for(x=0;x<width;++x, dest+=4)
  6355                    if (!stbi__readval(s,packet->channel,dest))
  6356                       return 0;
  6357                 break;
  6358              }
  6359  
  6360              case 1://Pure RLE
  6361                 {
  6362                    int left=width, i;
  6363  
  6364                    while (left>0) {
  6365                       stbi_uc count,value[4];
  6366  
  6367                       count=stbi__get8(s);
  6368                       if (stbi__at_eof(s))   return stbi__errpuc("bad file","file too short (pure read count)");
  6369  
  6370                       if (count > left)
  6371                          count = (stbi_uc) left;
  6372  
  6373                       if (!stbi__readval(s,packet->channel,value))  return 0;
  6374  
  6375                       for(i=0; i<count; ++i,dest+=4)
  6376                          stbi__copyval(packet->channel,dest,value);
  6377                       left -= count;
  6378                    }
  6379                 }
  6380                 break;
  6381  
  6382              case 2: {//Mixed RLE
  6383                 int left=width;
  6384                 while (left>0) {
  6385                    int count = stbi__get8(s), i;
  6386                    if (stbi__at_eof(s))  return stbi__errpuc("bad file","file too short (mixed read count)");
  6387  
  6388                    if (count >= 128) { // Repeated
  6389                       stbi_uc value[4];
  6390  
  6391                       if (count==128)
  6392                          count = stbi__get16be(s);
  6393                       else
  6394                          count -= 127;
  6395                       if (count > left)
  6396                          return stbi__errpuc("bad file","scanline overrun");
  6397  
  6398                       if (!stbi__readval(s,packet->channel,value))
  6399                          return 0;
  6400  
  6401                       for(i=0;i<count;++i, dest += 4)
  6402                          stbi__copyval(packet->channel,dest,value);
  6403                    } else { // Raw
  6404                       ++count;
  6405                       if (count>left) return stbi__errpuc("bad file","scanline overrun");
  6406  
  6407                       for(i=0;i<count;++i, dest+=4)
  6408                          if (!stbi__readval(s,packet->channel,dest))
  6409                             return 0;
  6410                    }
  6411                    left-=count;
  6412                 }
  6413                 break;
  6414              }
  6415           }
  6416        }
  6417     }
  6418  
  6419     return result;
  6420  }
  6421  
  6422  static void *stbi__pic_load(stbi__context *s,int *px,int *py,int *comp,int req_comp, stbi__result_info *ri)
  6423  {
  6424     stbi_uc *result;
  6425     int i, x,y, internal_comp;
  6426     STBI_NOTUSED(ri);
  6427  
  6428     if (!comp) comp = &internal_comp;
  6429  
  6430     for (i=0; i<92; ++i)
  6431        stbi__get8(s);
  6432  
  6433     x = stbi__get16be(s);
  6434     y = stbi__get16be(s);
  6435  
  6436     if (y > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
  6437     if (x > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
  6438  
  6439     if (stbi__at_eof(s))  return stbi__errpuc("bad file","file too short (pic header)");
  6440     if (!stbi__mad3sizes_valid(x, y, 4, 0)) return stbi__errpuc("too large", "PIC image too large to decode");
  6441  
  6442     stbi__get32be(s); //skip `ratio'
  6443     stbi__get16be(s); //skip `fields'
  6444     stbi__get16be(s); //skip `pad'
  6445  
  6446     // intermediate buffer is RGBA
  6447     result = (stbi_uc *) stbi__malloc_mad3(x, y, 4, 0);
  6448     if (!result) return stbi__errpuc("outofmem", "Out of memory");
  6449     memset(result, 0xff, x*y*4);
  6450  
  6451     if (!stbi__pic_load_core(s,x,y,comp, result)) {
  6452        STBI_FREE(result);
  6453        result=0;
  6454     }
  6455     *px = x;
  6456     *py = y;
  6457     if (req_comp == 0) req_comp = *comp;
  6458     result=stbi__convert_format(result,4,req_comp,x,y);
  6459  
  6460     return result;
  6461  }
  6462  
  6463  static int stbi__pic_test(stbi__context *s)
  6464  {
  6465     int r = stbi__pic_test_core(s);
  6466     stbi__rewind(s);
  6467     return r;
  6468  }
  6469  #endif
  6470  
  6471  // *************************************************************************************************
  6472  // GIF loader -- public domain by Jean-Marc Lienher -- simplified/shrunk by stb
  6473  
  6474  #ifndef STBI_NO_GIF
  6475  typedef struct
  6476  {
  6477     stbi__int16 prefix;
  6478     stbi_uc first;
  6479     stbi_uc suffix;
  6480  } stbi__gif_lzw;
  6481  
  6482  typedef struct
  6483  {
  6484     int w,h;
  6485     stbi_uc *out;                 // output buffer (always 4 components)
  6486     stbi_uc *background;          // The current "background" as far as a gif is concerned
  6487     stbi_uc *history;
  6488     int flags, bgindex, ratio, transparent, eflags;
  6489     stbi_uc  pal[256][4];
  6490     stbi_uc lpal[256][4];
  6491     stbi__gif_lzw codes[8192];
  6492     stbi_uc *color_table;
  6493     int parse, step;
  6494     int lflags;
  6495     int start_x, start_y;
  6496     int max_x, max_y;
  6497     int cur_x, cur_y;
  6498     int line_size;
  6499     int delay;
  6500  } stbi__gif;
  6501  
  6502  static int stbi__gif_test_raw(stbi__context *s)
  6503  {
  6504     int sz;
  6505     if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8') return 0;
  6506     sz = stbi__get8(s);
  6507     if (sz != '9' && sz != '7') return 0;
  6508     if (stbi__get8(s) != 'a') return 0;
  6509     return 1;
  6510  }
  6511  
  6512  static int stbi__gif_test(stbi__context *s)
  6513  {
  6514     int r = stbi__gif_test_raw(s);
  6515     stbi__rewind(s);
  6516     return r;
  6517  }
  6518  
  6519  static void stbi__gif_parse_colortable(stbi__context *s, stbi_uc pal[256][4], int num_entries, int transp)
  6520  {
  6521     int i;
  6522     for (i=0; i < num_entries; ++i) {
  6523        pal[i][2] = stbi__get8(s);
  6524        pal[i][1] = stbi__get8(s);
  6525        pal[i][0] = stbi__get8(s);
  6526        pal[i][3] = transp == i ? 0 : 255;
  6527     }
  6528  }
  6529  
  6530  static int stbi__gif_header(stbi__context *s, stbi__gif *g, int *comp, int is_info)
  6531  {
  6532     stbi_uc version;
  6533     if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8')
  6534        return stbi__err("not GIF", "Corrupt GIF");
  6535  
  6536     version = stbi__get8(s);
  6537     if (version != '7' && version != '9')    return stbi__err("not GIF", "Corrupt GIF");
  6538     if (stbi__get8(s) != 'a')                return stbi__err("not GIF", "Corrupt GIF");
  6539  
  6540     stbi__g_failure_reason = "";
  6541     g->w = stbi__get16le(s);
  6542     g->h = stbi__get16le(s);
  6543     g->flags = stbi__get8(s);
  6544     g->bgindex = stbi__get8(s);
  6545     g->ratio = stbi__get8(s);
  6546     g->transparent = -1;
  6547  
  6548     if (g->w > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
  6549     if (g->h > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
  6550  
  6551     if (comp != 0) *comp = 4;  // can't actually tell whether it's 3 or 4 until we parse the comments
  6552  
  6553     if (is_info) return 1;
  6554  
  6555     if (g->flags & 0x80)
  6556        stbi__gif_parse_colortable(s,g->pal, 2 << (g->flags & 7), -1);
  6557  
  6558     return 1;
  6559  }
  6560  
  6561  static int stbi__gif_info_raw(stbi__context *s, int *x, int *y, int *comp)
  6562  {
  6563     stbi__gif* g = (stbi__gif*) stbi__malloc(sizeof(stbi__gif));
  6564     if (!g) return stbi__err("outofmem", "Out of memory");
  6565     if (!stbi__gif_header(s, g, comp, 1)) {
  6566        STBI_FREE(g);
  6567        stbi__rewind( s );
  6568        return 0;
  6569     }
  6570     if (x) *x = g->w;
  6571     if (y) *y = g->h;
  6572     STBI_FREE(g);
  6573     return 1;
  6574  }
  6575  
  6576  static void stbi__out_gif_code(stbi__gif *g, stbi__uint16 code)
  6577  {
  6578     stbi_uc *p, *c;
  6579     int idx;
  6580  
  6581     // recurse to decode the prefixes, since the linked-list is backwards,
  6582     // and working backwards through an interleaved image would be nasty
  6583     if (g->codes[code].prefix >= 0)
  6584        stbi__out_gif_code(g, g->codes[code].prefix);
  6585  
  6586     if (g->cur_y >= g->max_y) return;
  6587  
  6588     idx = g->cur_x + g->cur_y;
  6589     p = &g->out[idx];
  6590     g->history[idx / 4] = 1;
  6591  
  6592     c = &g->color_table[g->codes[code].suffix * 4];
  6593     if (c[3] > 128) { // don't render transparent pixels;
  6594        p[0] = c[2];
  6595        p[1] = c[1];
  6596        p[2] = c[0];
  6597        p[3] = c[3];
  6598     }
  6599     g->cur_x += 4;
  6600  
  6601     if (g->cur_x >= g->max_x) {
  6602        g->cur_x = g->start_x;
  6603        g->cur_y += g->step;
  6604  
  6605        while (g->cur_y >= g->max_y && g->parse > 0) {
  6606           g->step = (1 << g->parse) * g->line_size;
  6607           g->cur_y = g->start_y + (g->step >> 1);
  6608           --g->parse;
  6609        }
  6610     }
  6611  }
  6612  
  6613  static stbi_uc *stbi__process_gif_raster(stbi__context *s, stbi__gif *g)
  6614  {
  6615     stbi_uc lzw_cs;
  6616     stbi__int32 len, init_code;
  6617     stbi__uint32 first;
  6618     stbi__int32 codesize, codemask, avail, oldcode, bits, valid_bits, clear;
  6619     stbi__gif_lzw *p;
  6620  
  6621     lzw_cs = stbi__get8(s);
  6622     if (lzw_cs > 12) return NULL;
  6623     clear = 1 << lzw_cs;
  6624     first = 1;
  6625     codesize = lzw_cs + 1;
  6626     codemask = (1 << codesize) - 1;
  6627     bits = 0;
  6628     valid_bits = 0;
  6629     for (init_code = 0; init_code < clear; init_code++) {
  6630        g->codes[init_code].prefix = -1;
  6631        g->codes[init_code].first = (stbi_uc) init_code;
  6632        g->codes[init_code].suffix = (stbi_uc) init_code;
  6633     }
  6634  
  6635     // support no starting clear code
  6636     avail = clear+2;
  6637     oldcode = -1;
  6638  
  6639     len = 0;
  6640     for(;;) {
  6641        if (valid_bits < codesize) {
  6642           if (len == 0) {
  6643              len = stbi__get8(s); // start new block
  6644              if (len == 0)
  6645                 return g->out;
  6646           }
  6647           --len;
  6648           bits |= (stbi__int32) stbi__get8(s) << valid_bits;
  6649           valid_bits += 8;
  6650        } else {
  6651           stbi__int32 code = bits & codemask;
  6652           bits >>= codesize;
  6653           valid_bits -= codesize;
  6654           // @OPTIMIZE: is there some way we can accelerate the non-clear path?
  6655           if (code == clear) {  // clear code
  6656              codesize = lzw_cs + 1;
  6657              codemask = (1 << codesize) - 1;
  6658              avail = clear + 2;
  6659              oldcode = -1;
  6660              first = 0;
  6661           } else if (code == clear + 1) { // end of stream code
  6662              stbi__skip(s, len);
  6663              while ((len = stbi__get8(s)) > 0)
  6664                 stbi__skip(s,len);
  6665              return g->out;
  6666           } else if (code <= avail) {
  6667              if (first) {
  6668                 return stbi__errpuc("no clear code", "Corrupt GIF");
  6669              }
  6670  
  6671              if (oldcode >= 0) {
  6672                 p = &g->codes[avail++];
  6673                 if (avail > 8192) {
  6674                    return stbi__errpuc("too many codes", "Corrupt GIF");
  6675                 }
  6676  
  6677                 p->prefix = (stbi__int16) oldcode;
  6678                 p->first = g->codes[oldcode].first;
  6679                 p->suffix = (code == avail) ? p->first : g->codes[code].first;
  6680              } else if (code == avail)
  6681                 return stbi__errpuc("illegal code in raster", "Corrupt GIF");
  6682  
  6683              stbi__out_gif_code(g, (stbi__uint16) code);
  6684  
  6685              if ((avail & codemask) == 0 && avail <= 0x0FFF) {
  6686                 codesize++;
  6687                 codemask = (1 << codesize) - 1;
  6688              }
  6689  
  6690              oldcode = code;
  6691           } else {
  6692              return stbi__errpuc("illegal code in raster", "Corrupt GIF");
  6693           }
  6694        }
  6695     }
  6696  }
  6697  
  6698  // this function is designed to support animated gifs, although stb_image doesn't support it
  6699  // two back is the image from two frames ago, used for a very specific disposal format
  6700  static stbi_uc *stbi__gif_load_next(stbi__context *s, stbi__gif *g, int *comp, int req_comp, stbi_uc *two_back)
  6701  {
  6702     int dispose;
  6703     int first_frame;
  6704     int pi;
  6705     int pcount;
  6706     STBI_NOTUSED(req_comp);
  6707  
  6708     // on first frame, any non-written pixels get the background colour (non-transparent)
  6709     first_frame = 0;
  6710     if (g->out == 0) {
  6711        if (!stbi__gif_header(s, g, comp,0)) return 0; // stbi__g_failure_reason set by stbi__gif_header
  6712        if (!stbi__mad3sizes_valid(4, g->w, g->h, 0))
  6713           return stbi__errpuc("too large", "GIF image is too large");
  6714        pcount = g->w * g->h;
  6715        g->out = (stbi_uc *) stbi__malloc(4 * pcount);
  6716        g->background = (stbi_uc *) stbi__malloc(4 * pcount);
  6717        g->history = (stbi_uc *) stbi__malloc(pcount);
  6718        if (!g->out || !g->background || !g->history)
  6719           return stbi__errpuc("outofmem", "Out of memory");
  6720  
  6721        // image is treated as "transparent" at the start - ie, nothing overwrites the current background;
  6722        // background colour is only used for pixels that are not rendered first frame, after that "background"
  6723        // color refers to the color that was there the previous frame.
  6724        memset(g->out, 0x00, 4 * pcount);
  6725        memset(g->background, 0x00, 4 * pcount); // state of the background (starts transparent)
  6726        memset(g->history, 0x00, pcount);        // pixels that were affected previous frame
  6727        first_frame = 1;
  6728     } else {
  6729        // second frame - how do we dispose of the previous one?
  6730        dispose = (g->eflags & 0x1C) >> 2;
  6731        pcount = g->w * g->h;
  6732  
  6733        if ((dispose == 3) && (two_back == 0)) {
  6734           dispose = 2; // if I don't have an image to revert back to, default to the old background
  6735        }
  6736  
  6737        if (dispose == 3) { // use previous graphic
  6738           for (pi = 0; pi < pcount; ++pi) {
  6739              if (g->history[pi]) {
  6740                 memcpy( &g->out[pi * 4], &two_back[pi * 4], 4 );
  6741              }
  6742           }
  6743        } else if (dispose == 2) {
  6744           // restore what was changed last frame to background before that frame;
  6745           for (pi = 0; pi < pcount; ++pi) {
  6746              if (g->history[pi]) {
  6747                 memcpy( &g->out[pi * 4], &g->background[pi * 4], 4 );
  6748              }
  6749           }
  6750        } else {
  6751           // This is a non-disposal case eithe way, so just
  6752           // leave the pixels as is, and they will become the new background
  6753           // 1: do not dispose
  6754           // 0:  not specified.
  6755        }
  6756  
  6757        // background is what out is after the undoing of the previou frame;
  6758        memcpy( g->background, g->out, 4 * g->w * g->h );
  6759     }
  6760  
  6761     // clear my history;
  6762     memset( g->history, 0x00, g->w * g->h );        // pixels that were affected previous frame
  6763  
  6764     for (;;) {
  6765        int tag = stbi__get8(s);
  6766        switch (tag) {
  6767           case 0x2C: /* Image Descriptor */
  6768           {
  6769              stbi__int32 x, y, w, h;
  6770              stbi_uc *o;
  6771  
  6772              x = stbi__get16le(s);
  6773              y = stbi__get16le(s);
  6774              w = stbi__get16le(s);
  6775              h = stbi__get16le(s);
  6776              if (((x + w) > (g->w)) || ((y + h) > (g->h)))
  6777                 return stbi__errpuc("bad Image Descriptor", "Corrupt GIF");
  6778  
  6779              g->line_size = g->w * 4;
  6780              g->start_x = x * 4;
  6781              g->start_y = y * g->line_size;
  6782              g->max_x   = g->start_x + w * 4;
  6783              g->max_y   = g->start_y + h * g->line_size;
  6784              g->cur_x   = g->start_x;
  6785              g->cur_y   = g->start_y;
  6786  
  6787              // if the width of the specified rectangle is 0, that means
  6788              // we may not see *any* pixels or the image is malformed;
  6789              // to make sure this is caught, move the current y down to
  6790              // max_y (which is what out_gif_code checks).
  6791              if (w == 0)
  6792                 g->cur_y = g->max_y;
  6793  
  6794              g->lflags = stbi__get8(s);
  6795  
  6796              if (g->lflags & 0x40) {
  6797                 g->step = 8 * g->line_size; // first interlaced spacing
  6798                 g->parse = 3;
  6799              } else {
  6800                 g->step = g->line_size;
  6801                 g->parse = 0;
  6802              }
  6803  
  6804              if (g->lflags & 0x80) {
  6805                 stbi__gif_parse_colortable(s,g->lpal, 2 << (g->lflags & 7), g->eflags & 0x01 ? g->transparent : -1);
  6806                 g->color_table = (stbi_uc *) g->lpal;
  6807              } else if (g->flags & 0x80) {
  6808                 g->color_table = (stbi_uc *) g->pal;
  6809              } else
  6810                 return stbi__errpuc("missing color table", "Corrupt GIF");
  6811  
  6812              o = stbi__process_gif_raster(s, g);
  6813              if (!o) return NULL;
  6814  
  6815              // if this was the first frame,
  6816              pcount = g->w * g->h;
  6817              if (first_frame && (g->bgindex > 0)) {
  6818                 // if first frame, any pixel not drawn to gets the background color
  6819                 for (pi = 0; pi < pcount; ++pi) {
  6820                    if (g->history[pi] == 0) {
  6821                       g->pal[g->bgindex][3] = 255; // just in case it was made transparent, undo that; It will be reset next frame if need be;
  6822                       memcpy( &g->out[pi * 4], &g->pal[g->bgindex], 4 );
  6823                    }
  6824                 }
  6825              }
  6826  
  6827              return o;
  6828           }
  6829  
  6830           case 0x21: // Comment Extension.
  6831           {
  6832              int len;
  6833              int ext = stbi__get8(s);
  6834              if (ext == 0xF9) { // Graphic Control Extension.
  6835                 len = stbi__get8(s);
  6836                 if (len == 4) {
  6837                    g->eflags = stbi__get8(s);
  6838                    g->delay = 10 * stbi__get16le(s); // delay - 1/100th of a second, saving as 1/1000ths.
  6839  
  6840                    // unset old transparent
  6841                    if (g->transparent >= 0) {
  6842                       g->pal[g->transparent][3] = 255;
  6843                    }
  6844                    if (g->eflags & 0x01) {
  6845                       g->transparent = stbi__get8(s);
  6846                       if (g->transparent >= 0) {
  6847                          g->pal[g->transparent][3] = 0;
  6848                       }
  6849                    } else {
  6850                       // don't need transparent
  6851                       stbi__skip(s, 1);
  6852                       g->transparent = -1;
  6853                    }
  6854                 } else {
  6855                    stbi__skip(s, len);
  6856                    break;
  6857                 }
  6858              }
  6859              while ((len = stbi__get8(s)) != 0) {
  6860                 stbi__skip(s, len);
  6861              }
  6862              break;
  6863           }
  6864  
  6865           case 0x3B: // gif stream termination code
  6866              return (stbi_uc *) s; // using '1' causes warning on some compilers
  6867  
  6868           default:
  6869              return stbi__errpuc("unknown code", "Corrupt GIF");
  6870        }
  6871     }
  6872  }
  6873  
  6874  static void *stbi__load_gif_main_outofmem(stbi__gif *g, stbi_uc *out, int **delays)
  6875  {
  6876     STBI_FREE(g->out);
  6877     STBI_FREE(g->history);
  6878     STBI_FREE(g->background);
  6879  
  6880     if (out) STBI_FREE(out);
  6881     if (delays && *delays) STBI_FREE(*delays);
  6882     return stbi__errpuc("outofmem", "Out of memory");
  6883  }
  6884  
  6885  static void *stbi__load_gif_main(stbi__context *s, int **delays, int *x, int *y, int *z, int *comp, int req_comp)
  6886  {
  6887     if (stbi__gif_test(s)) {
  6888        int layers = 0;
  6889        stbi_uc *u = 0;
  6890        stbi_uc *out = 0;
  6891        stbi_uc *two_back = 0;
  6892        stbi__gif g;
  6893        int stride;
  6894        int out_size = 0;
  6895        int delays_size = 0;
  6896  
  6897        STBI_NOTUSED(out_size);
  6898        STBI_NOTUSED(delays_size);
  6899  
  6900        memset(&g, 0, sizeof(g));
  6901        if (delays) {
  6902           *delays = 0;
  6903        }
  6904  
  6905        do {
  6906           u = stbi__gif_load_next(s, &g, comp, req_comp, two_back);
  6907           if (u == (stbi_uc *) s) u = 0;  // end of animated gif marker
  6908  
  6909           if (u) {
  6910              *x = g.w;
  6911              *y = g.h;
  6912              ++layers;
  6913              stride = g.w * g.h * 4;
  6914  
  6915              if (out) {
  6916                 void *tmp = (stbi_uc*) STBI_REALLOC_SIZED( out, out_size, layers * stride );
  6917                 if (!tmp)
  6918                    return stbi__load_gif_main_outofmem(&g, out, delays);
  6919                 else {
  6920                     out = (stbi_uc*) tmp;
  6921                     out_size = layers * stride;
  6922                 }
  6923  
  6924                 if (delays) {
  6925                    int *new_delays = (int*) STBI_REALLOC_SIZED( *delays, delays_size, sizeof(int) * layers );
  6926                    if (!new_delays)
  6927                       return stbi__load_gif_main_outofmem(&g, out, delays);
  6928                    *delays = new_delays;
  6929                    delays_size = layers * sizeof(int);
  6930                 }
  6931              } else {
  6932                 out = (stbi_uc*)stbi__malloc( layers * stride );
  6933                 if (!out)
  6934                    return stbi__load_gif_main_outofmem(&g, out, delays);
  6935                 out_size = layers * stride;
  6936                 if (delays) {
  6937                    *delays = (int*) stbi__malloc( layers * sizeof(int) );
  6938                    if (!*delays)
  6939                       return stbi__load_gif_main_outofmem(&g, out, delays);
  6940                    delays_size = layers * sizeof(int);
  6941                 }
  6942              }
  6943              memcpy( out + ((layers - 1) * stride), u, stride );
  6944              if (layers >= 2) {
  6945                 two_back = out - 2 * stride;
  6946              }
  6947  
  6948              if (delays) {
  6949                 (*delays)[layers - 1U] = g.delay;
  6950              }
  6951           }
  6952        } while (u != 0);
  6953  
  6954        // free temp buffer;
  6955        STBI_FREE(g.out);
  6956        STBI_FREE(g.history);
  6957        STBI_FREE(g.background);
  6958  
  6959        // do the final conversion after loading everything;
  6960        if (req_comp && req_comp != 4)
  6961           out = stbi__convert_format(out, 4, req_comp, layers * g.w, g.h);
  6962  
  6963        *z = layers;
  6964        return out;
  6965     } else {
  6966        return stbi__errpuc("not GIF", "Image was not as a gif type.");
  6967     }
  6968  }
  6969  
  6970  static void *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
  6971  {
  6972     stbi_uc *u = 0;
  6973     stbi__gif g;
  6974     memset(&g, 0, sizeof(g));
  6975     STBI_NOTUSED(ri);
  6976  
  6977     u = stbi__gif_load_next(s, &g, comp, req_comp, 0);
  6978     if (u == (stbi_uc *) s) u = 0;  // end of animated gif marker
  6979     if (u) {
  6980        *x = g.w;
  6981        *y = g.h;
  6982  
  6983        // moved conversion to after successful load so that the same
  6984        // can be done for multiple frames.
  6985        if (req_comp && req_comp != 4)
  6986           u = stbi__convert_format(u, 4, req_comp, g.w, g.h);
  6987     } else if (g.out) {
  6988        // if there was an error and we allocated an image buffer, free it!
  6989        STBI_FREE(g.out);
  6990     }
  6991  
  6992     // free buffers needed for multiple frame loading;
  6993     STBI_FREE(g.history);
  6994     STBI_FREE(g.background);
  6995  
  6996     return u;
  6997  }
  6998  
  6999  static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp)
  7000  {
  7001     return stbi__gif_info_raw(s,x,y,comp);
  7002  }
  7003  #endif
  7004  
  7005  // *************************************************************************************************
  7006  // Radiance RGBE HDR loader
  7007  // originally by Nicolas Schulz
  7008  #ifndef STBI_NO_HDR
  7009  static int stbi__hdr_test_core(stbi__context *s, const char *signature)
  7010  {
  7011     int i;
  7012     for (i=0; signature[i]; ++i)
  7013        if (stbi__get8(s) != signature[i])
  7014            return 0;
  7015     stbi__rewind(s);
  7016     return 1;
  7017  }
  7018  
  7019  static int stbi__hdr_test(stbi__context* s)
  7020  {
  7021     int r = stbi__hdr_test_core(s, "#?RADIANCE\n");
  7022     stbi__rewind(s);
  7023     if(!r) {
  7024         r = stbi__hdr_test_core(s, "#?RGBE\n");
  7025         stbi__rewind(s);
  7026     }
  7027     return r;
  7028  }
  7029  
  7030  #define STBI__HDR_BUFLEN  1024
  7031  static char *stbi__hdr_gettoken(stbi__context *z, char *buffer)
  7032  {
  7033     int len=0;
  7034     char c = '\0';
  7035  
  7036     c = (char) stbi__get8(z);
  7037  
  7038     while (!stbi__at_eof(z) && c != '\n') {
  7039        buffer[len++] = c;
  7040        if (len == STBI__HDR_BUFLEN-1) {
  7041           // flush to end of line
  7042           while (!stbi__at_eof(z) && stbi__get8(z) != '\n')
  7043              ;
  7044           break;
  7045        }
  7046        c = (char) stbi__get8(z);
  7047     }
  7048  
  7049     buffer[len] = 0;
  7050     return buffer;
  7051  }
  7052  
  7053  static void stbi__hdr_convert(float *output, stbi_uc *input, int req_comp)
  7054  {
  7055     if ( input[3] != 0 ) {
  7056        float f1;
  7057        // Exponent
  7058        f1 = (float) ldexp(1.0f, input[3] - (int)(128 + 8));
  7059        if (req_comp <= 2)
  7060           output[0] = (input[0] + input[1] + input[2]) * f1 / 3;
  7061        else {
  7062           output[0] = input[0] * f1;
  7063           output[1] = input[1] * f1;
  7064           output[2] = input[2] * f1;
  7065        }
  7066        if (req_comp == 2) output[1] = 1;
  7067        if (req_comp == 4) output[3] = 1;
  7068     } else {
  7069        switch (req_comp) {
  7070           case 4: output[3] = 1; /* fallthrough */
  7071           case 3: output[0] = output[1] = output[2] = 0;
  7072                   break;
  7073           case 2: output[1] = 1; /* fallthrough */
  7074           case 1: output[0] = 0;
  7075                   break;
  7076        }
  7077     }
  7078  }
  7079  
  7080  static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
  7081  {
  7082     char buffer[STBI__HDR_BUFLEN];
  7083     char *token;
  7084     int valid = 0;
  7085     int width, height;
  7086     stbi_uc *scanline;
  7087     float *hdr_data;
  7088     int len;
  7089     unsigned char count, value;
  7090     int i, j, k, c1,c2, z;
  7091     const char *headerToken;
  7092     STBI_NOTUSED(ri);
  7093  
  7094     // Check identifier
  7095     headerToken = stbi__hdr_gettoken(s,buffer);
  7096     if (strcmp(headerToken, "#?RADIANCE") != 0 && strcmp(headerToken, "#?RGBE") != 0)
  7097        return stbi__errpf("not HDR", "Corrupt HDR image");
  7098  
  7099     // Parse header
  7100     for(;;) {
  7101        token = stbi__hdr_gettoken(s,buffer);
  7102        if (token[0] == 0) break;
  7103        if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
  7104     }
  7105  
  7106     if (!valid)    return stbi__errpf("unsupported format", "Unsupported HDR format");
  7107  
  7108     // Parse width and height
  7109     // can't use sscanf() if we're not using stdio!
  7110     token = stbi__hdr_gettoken(s,buffer);
  7111     if (strncmp(token, "-Y ", 3))  return stbi__errpf("unsupported data layout", "Unsupported HDR format");
  7112     token += 3;
  7113     height = (int) strtol(token, &token, 10);
  7114     while (*token == ' ') ++token;
  7115     if (strncmp(token, "+X ", 3))  return stbi__errpf("unsupported data layout", "Unsupported HDR format");
  7116     token += 3;
  7117     width = (int) strtol(token, NULL, 10);
  7118  
  7119     if (height > STBI_MAX_DIMENSIONS) return stbi__errpf("too large","Very large image (corrupt?)");
  7120     if (width > STBI_MAX_DIMENSIONS) return stbi__errpf("too large","Very large image (corrupt?)");
  7121  
  7122     *x = width;
  7123     *y = height;
  7124  
  7125     if (comp) *comp = 3;
  7126     if (req_comp == 0) req_comp = 3;
  7127  
  7128     if (!stbi__mad4sizes_valid(width, height, req_comp, sizeof(float), 0))
  7129        return stbi__errpf("too large", "HDR image is too large");
  7130  
  7131     // Read data
  7132     hdr_data = (float *) stbi__malloc_mad4(width, height, req_comp, sizeof(float), 0);
  7133     if (!hdr_data)
  7134        return stbi__errpf("outofmem", "Out of memory");
  7135  
  7136     // Load image data
  7137     // image data is stored as some number of sca
  7138     if ( width < 8 || width >= 32768) {
  7139        // Read flat data
  7140        for (j=0; j < height; ++j) {
  7141           for (i=0; i < width; ++i) {
  7142              stbi_uc rgbe[4];
  7143             main_decode_loop:
  7144              stbi__getn(s, rgbe, 4);
  7145              stbi__hdr_convert(hdr_data + j * width * req_comp + i * req_comp, rgbe, req_comp);
  7146           }
  7147        }
  7148     } else {
  7149        // Read RLE-encoded data
  7150        scanline = NULL;
  7151  
  7152        for (j = 0; j < height; ++j) {
  7153           c1 = stbi__get8(s);
  7154           c2 = stbi__get8(s);
  7155           len = stbi__get8(s);
  7156           if (c1 != 2 || c2 != 2 || (len & 0x80)) {
  7157              // not run-length encoded, so we have to actually use THIS data as a decoded
  7158              // pixel (note this can't be a valid pixel--one of RGB must be >= 128)
  7159              stbi_uc rgbe[4];
  7160              rgbe[0] = (stbi_uc) c1;
  7161              rgbe[1] = (stbi_uc) c2;
  7162              rgbe[2] = (stbi_uc) len;
  7163              rgbe[3] = (stbi_uc) stbi__get8(s);
  7164              stbi__hdr_convert(hdr_data, rgbe, req_comp);
  7165              i = 1;
  7166              j = 0;
  7167              STBI_FREE(scanline);
  7168              goto main_decode_loop; // yes, this makes no sense
  7169           }
  7170           len <<= 8;
  7171           len |= stbi__get8(s);
  7172           if (len != width) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("invalid decoded scanline length", "corrupt HDR"); }
  7173           if (scanline == NULL) {
  7174              scanline = (stbi_uc *) stbi__malloc_mad2(width, 4, 0);
  7175              if (!scanline) {
  7176                 STBI_FREE(hdr_data);
  7177                 return stbi__errpf("outofmem", "Out of memory");
  7178              }
  7179           }
  7180  
  7181           for (k = 0; k < 4; ++k) {
  7182              int nleft;
  7183              i = 0;
  7184              while ((nleft = width - i) > 0) {
  7185                 count = stbi__get8(s);
  7186                 if (count > 128) {
  7187                    // Run
  7188                    value = stbi__get8(s);
  7189                    count -= 128;
  7190                    if (count > nleft) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); }
  7191                    for (z = 0; z < count; ++z)
  7192                       scanline[i++ * 4 + k] = value;
  7193                 } else {
  7194                    // Dump
  7195                    if (count > nleft) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); }
  7196                    for (z = 0; z < count; ++z)
  7197                       scanline[i++ * 4 + k] = stbi__get8(s);
  7198                 }
  7199              }
  7200           }
  7201           for (i=0; i < width; ++i)
  7202              stbi__hdr_convert(hdr_data+(j*width + i)*req_comp, scanline + i*4, req_comp);
  7203        }
  7204        if (scanline)
  7205           STBI_FREE(scanline);
  7206     }
  7207  
  7208     return hdr_data;
  7209  }
  7210  
  7211  static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp)
  7212  {
  7213     char buffer[STBI__HDR_BUFLEN];
  7214     char *token;
  7215     int valid = 0;
  7216     int dummy;
  7217  
  7218     if (!x) x = &dummy;
  7219     if (!y) y = &dummy;
  7220     if (!comp) comp = &dummy;
  7221  
  7222     if (stbi__hdr_test(s) == 0) {
  7223         stbi__rewind( s );
  7224         return 0;
  7225     }
  7226  
  7227     for(;;) {
  7228        token = stbi__hdr_gettoken(s,buffer);
  7229        if (token[0] == 0) break;
  7230        if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
  7231     }
  7232  
  7233     if (!valid) {
  7234         stbi__rewind( s );
  7235         return 0;
  7236     }
  7237     token = stbi__hdr_gettoken(s,buffer);
  7238     if (strncmp(token, "-Y ", 3)) {
  7239         stbi__rewind( s );
  7240         return 0;
  7241     }
  7242     token += 3;
  7243     *y = (int) strtol(token, &token, 10);
  7244     while (*token == ' ') ++token;
  7245     if (strncmp(token, "+X ", 3)) {
  7246         stbi__rewind( s );
  7247         return 0;
  7248     }
  7249     token += 3;
  7250     *x = (int) strtol(token, NULL, 10);
  7251     *comp = 3;
  7252     return 1;
  7253  }
  7254  #endif // STBI_NO_HDR
  7255  
  7256  #ifndef STBI_NO_BMP
  7257  static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp)
  7258  {
  7259     void *p;
  7260     stbi__bmp_data info;
  7261  
  7262     info.all_a = 255;
  7263     p = stbi__bmp_parse_header(s, &info);
  7264     if (p == NULL) {
  7265        stbi__rewind( s );
  7266        return 0;
  7267     }
  7268     if (x) *x = s->img_x;
  7269     if (y) *y = s->img_y;
  7270     if (comp) {
  7271        if (info.bpp == 24 && info.ma == 0xff000000)
  7272           *comp = 3;
  7273        else
  7274           *comp = info.ma ? 4 : 3;
  7275     }
  7276     return 1;
  7277  }
  7278  #endif
  7279  
  7280  #ifndef STBI_NO_PSD
  7281  static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp)
  7282  {
  7283     int channelCount, dummy, depth;
  7284     if (!x) x = &dummy;
  7285     if (!y) y = &dummy;
  7286     if (!comp) comp = &dummy;
  7287     if (stbi__get32be(s) != 0x38425053) {
  7288         stbi__rewind( s );
  7289         return 0;
  7290     }
  7291     if (stbi__get16be(s) != 1) {
  7292         stbi__rewind( s );
  7293         return 0;
  7294     }
  7295     stbi__skip(s, 6);
  7296     channelCount = stbi__get16be(s);
  7297     if (channelCount < 0 || channelCount > 16) {
  7298         stbi__rewind( s );
  7299         return 0;
  7300     }
  7301     *y = stbi__get32be(s);
  7302     *x = stbi__get32be(s);
  7303     depth = stbi__get16be(s);
  7304     if (depth != 8 && depth != 16) {
  7305         stbi__rewind( s );
  7306         return 0;
  7307     }
  7308     if (stbi__get16be(s) != 3) {
  7309         stbi__rewind( s );
  7310         return 0;
  7311     }
  7312     *comp = 4;
  7313     return 1;
  7314  }
  7315  
  7316  static int stbi__psd_is16(stbi__context *s)
  7317  {
  7318     int channelCount, depth;
  7319     if (stbi__get32be(s) != 0x38425053) {
  7320         stbi__rewind( s );
  7321         return 0;
  7322     }
  7323     if (stbi__get16be(s) != 1) {
  7324         stbi__rewind( s );
  7325         return 0;
  7326     }
  7327     stbi__skip(s, 6);
  7328     channelCount = stbi__get16be(s);
  7329     if (channelCount < 0 || channelCount > 16) {
  7330         stbi__rewind( s );
  7331         return 0;
  7332     }
  7333     STBI_NOTUSED(stbi__get32be(s));
  7334     STBI_NOTUSED(stbi__get32be(s));
  7335     depth = stbi__get16be(s);
  7336     if (depth != 16) {
  7337         stbi__rewind( s );
  7338         return 0;
  7339     }
  7340     return 1;
  7341  }
  7342  #endif
  7343  
  7344  #ifndef STBI_NO_PIC
  7345  static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp)
  7346  {
  7347     int act_comp=0,num_packets=0,chained,dummy;
  7348     stbi__pic_packet packets[10];
  7349  
  7350     if (!x) x = &dummy;
  7351     if (!y) y = &dummy;
  7352     if (!comp) comp = &dummy;
  7353  
  7354     if (!stbi__pic_is4(s,"\x53\x80\xF6\x34")) {
  7355        stbi__rewind(s);
  7356        return 0;
  7357     }
  7358  
  7359     stbi__skip(s, 88);
  7360  
  7361     *x = stbi__get16be(s);
  7362     *y = stbi__get16be(s);
  7363     if (stbi__at_eof(s)) {
  7364        stbi__rewind( s);
  7365        return 0;
  7366     }
  7367     if ( (*x) != 0 && (1 << 28) / (*x) < (*y)) {
  7368        stbi__rewind( s );
  7369        return 0;
  7370     }
  7371  
  7372     stbi__skip(s, 8);
  7373  
  7374     do {
  7375        stbi__pic_packet *packet;
  7376  
  7377        if (num_packets==sizeof(packets)/sizeof(packets[0]))
  7378           return 0;
  7379  
  7380        packet = &packets[num_packets++];
  7381        chained = stbi__get8(s);
  7382        packet->size    = stbi__get8(s);
  7383        packet->type    = stbi__get8(s);
  7384        packet->channel = stbi__get8(s);
  7385        act_comp |= packet->channel;
  7386  
  7387        if (stbi__at_eof(s)) {
  7388            stbi__rewind( s );
  7389            return 0;
  7390        }
  7391        if (packet->size != 8) {
  7392            stbi__rewind( s );
  7393            return 0;
  7394        }
  7395     } while (chained);
  7396  
  7397     *comp = (act_comp & 0x10 ? 4 : 3);
  7398  
  7399     return 1;
  7400  }
  7401  #endif
  7402  
  7403  // *************************************************************************************************
  7404  // Portable Gray Map and Portable Pixel Map loader
  7405  // by Ken Miller
  7406  //
  7407  // PGM: http://netpbm.sourceforge.net/doc/pgm.html
  7408  // PPM: http://netpbm.sourceforge.net/doc/ppm.html
  7409  //
  7410  // Known limitations:
  7411  //    Does not support comments in the header section
  7412  //    Does not support ASCII image data (formats P2 and P3)
  7413  
  7414  #ifndef STBI_NO_PNM
  7415  
  7416  static int      stbi__pnm_test(stbi__context *s)
  7417  {
  7418     char p, t;
  7419     p = (char) stbi__get8(s);
  7420     t = (char) stbi__get8(s);
  7421     if (p != 'P' || (t != '5' && t != '6')) {
  7422         stbi__rewind( s );
  7423         return 0;
  7424     }
  7425     return 1;
  7426  }
  7427  
  7428  static void *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
  7429  {
  7430     stbi_uc *out;
  7431     STBI_NOTUSED(ri);
  7432  
  7433     ri->bits_per_channel = stbi__pnm_info(s, (int *)&s->img_x, (int *)&s->img_y, (int *)&s->img_n);
  7434     if (ri->bits_per_channel == 0)
  7435        return 0;
  7436  
  7437     if (s->img_y > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
  7438     if (s->img_x > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
  7439  
  7440     *x = s->img_x;
  7441     *y = s->img_y;
  7442     if (comp) *comp = s->img_n;
  7443  
  7444     if (!stbi__mad4sizes_valid(s->img_n, s->img_x, s->img_y, ri->bits_per_channel / 8, 0))
  7445        return stbi__errpuc("too large", "PNM too large");
  7446  
  7447     out = (stbi_uc *) stbi__malloc_mad4(s->img_n, s->img_x, s->img_y, ri->bits_per_channel / 8, 0);
  7448     if (!out) return stbi__errpuc("outofmem", "Out of memory");
  7449     stbi__getn(s, out, s->img_n * s->img_x * s->img_y * (ri->bits_per_channel / 8));
  7450  
  7451     if (req_comp && req_comp != s->img_n) {
  7452        out = stbi__convert_format(out, s->img_n, req_comp, s->img_x, s->img_y);
  7453        if (out == NULL) return out; // stbi__convert_format frees input on failure
  7454     }
  7455     return out;
  7456  }
  7457  
  7458  static int      stbi__pnm_isspace(char c)
  7459  {
  7460     return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
  7461  }
  7462  
  7463  static void     stbi__pnm_skip_whitespace(stbi__context *s, char *c)
  7464  {
  7465     for (;;) {
  7466        while (!stbi__at_eof(s) && stbi__pnm_isspace(*c))
  7467           *c = (char) stbi__get8(s);
  7468  
  7469        if (stbi__at_eof(s) || *c != '#')
  7470           break;
  7471  
  7472        while (!stbi__at_eof(s) && *c != '\n' && *c != '\r' )
  7473           *c = (char) stbi__get8(s);
  7474     }
  7475  }
  7476  
  7477  static int      stbi__pnm_isdigit(char c)
  7478  {
  7479     return c >= '0' && c <= '9';
  7480  }
  7481  
  7482  static int      stbi__pnm_getinteger(stbi__context *s, char *c)
  7483  {
  7484     int value = 0;
  7485  
  7486     while (!stbi__at_eof(s) && stbi__pnm_isdigit(*c)) {
  7487        value = value*10 + (*c - '0');
  7488        *c = (char) stbi__get8(s);
  7489     }
  7490  
  7491     return value;
  7492  }
  7493  
  7494  static int      stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp)
  7495  {
  7496     int maxv, dummy;
  7497     char c, p, t;
  7498  
  7499     if (!x) x = &dummy;
  7500     if (!y) y = &dummy;
  7501     if (!comp) comp = &dummy;
  7502  
  7503     stbi__rewind(s);
  7504  
  7505     // Get identifier
  7506     p = (char) stbi__get8(s);
  7507     t = (char) stbi__get8(s);
  7508     if (p != 'P' || (t != '5' && t != '6')) {
  7509         stbi__rewind(s);
  7510         return 0;
  7511     }
  7512  
  7513     *comp = (t == '6') ? 3 : 1;  // '5' is 1-component .pgm; '6' is 3-component .ppm
  7514  
  7515     c = (char) stbi__get8(s);
  7516     stbi__pnm_skip_whitespace(s, &c);
  7517  
  7518     *x = stbi__pnm_getinteger(s, &c); // read width
  7519     stbi__pnm_skip_whitespace(s, &c);
  7520  
  7521     *y = stbi__pnm_getinteger(s, &c); // read height
  7522     stbi__pnm_skip_whitespace(s, &c);
  7523  
  7524     maxv = stbi__pnm_getinteger(s, &c);  // read max value
  7525     if (maxv > 65535)
  7526        return stbi__err("max value > 65535", "PPM image supports only 8-bit and 16-bit images");
  7527     else if (maxv > 255)
  7528        return 16;
  7529     else
  7530        return 8;
  7531  }
  7532  
  7533  static int stbi__pnm_is16(stbi__context *s)
  7534  {
  7535     if (stbi__pnm_info(s, NULL, NULL, NULL) == 16)
  7536  	   return 1;
  7537     return 0;
  7538  }
  7539  #endif
  7540  
  7541  static int stbi__info_main(stbi__context *s, int *x, int *y, int *comp)
  7542  {
  7543     #ifndef STBI_NO_JPEG
  7544     if (stbi__jpeg_info(s, x, y, comp)) return 1;
  7545     #endif
  7546  
  7547     #ifndef STBI_NO_PNG
  7548     if (stbi__png_info(s, x, y, comp))  return 1;
  7549     #endif
  7550  
  7551     #ifndef STBI_NO_GIF
  7552     if (stbi__gif_info(s, x, y, comp))  return 1;
  7553     #endif
  7554  
  7555     #ifndef STBI_NO_BMP
  7556     if (stbi__bmp_info(s, x, y, comp))  return 1;
  7557     #endif
  7558  
  7559     #ifndef STBI_NO_PSD
  7560     if (stbi__psd_info(s, x, y, comp))  return 1;
  7561     #endif
  7562  
  7563     #ifndef STBI_NO_PIC
  7564     if (stbi__pic_info(s, x, y, comp))  return 1;
  7565     #endif
  7566  
  7567     #ifndef STBI_NO_PNM
  7568     if (stbi__pnm_info(s, x, y, comp))  return 1;
  7569     #endif
  7570  
  7571     #ifndef STBI_NO_HDR
  7572     if (stbi__hdr_info(s, x, y, comp))  return 1;
  7573     #endif
  7574  
  7575     // test tga last because it's a crappy test!
  7576     #ifndef STBI_NO_TGA
  7577     if (stbi__tga_info(s, x, y, comp))
  7578         return 1;
  7579     #endif
  7580     return stbi__err("unknown image type", "Image not of any known type, or corrupt");
  7581  }
  7582  
  7583  static int stbi__is_16_main(stbi__context *s)
  7584  {
  7585     #ifndef STBI_NO_PNG
  7586     if (stbi__png_is16(s))  return 1;
  7587     #endif
  7588  
  7589     #ifndef STBI_NO_PSD
  7590     if (stbi__psd_is16(s))  return 1;
  7591     #endif
  7592  
  7593     #ifndef STBI_NO_PNM
  7594     if (stbi__pnm_is16(s))  return 1;
  7595     #endif
  7596     return 0;
  7597  }
  7598  
  7599  #ifndef STBI_NO_STDIO
  7600  STBIDEF int stbi_info(char const *filename, int *x, int *y, int *comp)
  7601  {
  7602      FILE *f = stbi__fopen(filename, "rb");
  7603      int result;
  7604      if (!f) return stbi__err("can't fopen", "Unable to open file");
  7605      result = stbi_info_from_file(f, x, y, comp);
  7606      fclose(f);
  7607      return result;
  7608  }
  7609  
  7610  STBIDEF int stbi_info_from_file(FILE *f, int *x, int *y, int *comp)
  7611  {
  7612     int r;
  7613     stbi__context s;
  7614     long pos = ftell(f);
  7615     stbi__start_file(&s, f);
  7616     r = stbi__info_main(&s,x,y,comp);
  7617     fseek(f,pos,SEEK_SET);
  7618     return r;
  7619  }
  7620  
  7621  STBIDEF int stbi_is_16_bit(char const *filename)
  7622  {
  7623      FILE *f = stbi__fopen(filename, "rb");
  7624      int result;
  7625      if (!f) return stbi__err("can't fopen", "Unable to open file");
  7626      result = stbi_is_16_bit_from_file(f);
  7627      fclose(f);
  7628      return result;
  7629  }
  7630  
  7631  STBIDEF int stbi_is_16_bit_from_file(FILE *f)
  7632  {
  7633     int r;
  7634     stbi__context s;
  7635     long pos = ftell(f);
  7636     stbi__start_file(&s, f);
  7637     r = stbi__is_16_main(&s);
  7638     fseek(f,pos,SEEK_SET);
  7639     return r;
  7640  }
  7641  #endif // !STBI_NO_STDIO
  7642  
  7643  STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
  7644  {
  7645     stbi__context s;
  7646     stbi__start_mem(&s,buffer,len);
  7647     return stbi__info_main(&s,x,y,comp);
  7648  }
  7649  
  7650  STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *c, void *user, int *x, int *y, int *comp)
  7651  {
  7652     stbi__context s;
  7653     stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
  7654     return stbi__info_main(&s,x,y,comp);
  7655  }
  7656  
  7657  STBIDEF int stbi_is_16_bit_from_memory(stbi_uc const *buffer, int len)
  7658  {
  7659     stbi__context s;
  7660     stbi__start_mem(&s,buffer,len);
  7661     return stbi__is_16_main(&s);
  7662  }
  7663  
  7664  STBIDEF int stbi_is_16_bit_from_callbacks(stbi_io_callbacks const *c, void *user)
  7665  {
  7666     stbi__context s;
  7667     stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
  7668     return stbi__is_16_main(&s);
  7669  }
  7670  
  7671  #endif // STB_IMAGE_IMPLEMENTATION
  7672  
  7673  /*
  7674     revision history:
  7675        2.20  (2019-02-07) support utf8 filenames in Windows; fix warnings and platform ifdefs
  7676        2.19  (2018-02-11) fix warning
  7677        2.18  (2018-01-30) fix warnings
  7678        2.17  (2018-01-29) change sbti__shiftsigned to avoid clang -O2 bug
  7679                           1-bit BMP
  7680                           *_is_16_bit api
  7681                           avoid warnings
  7682        2.16  (2017-07-23) all functions have 16-bit variants;
  7683                           STBI_NO_STDIO works again;
  7684                           compilation fixes;
  7685                           fix rounding in unpremultiply;
  7686                           optimize vertical flip;
  7687                           disable raw_len validation;
  7688                           documentation fixes
  7689        2.15  (2017-03-18) fix png-1,2,4 bug; now all Imagenet JPGs decode;
  7690                           warning fixes; disable run-time SSE detection on gcc;
  7691                           uniform handling of optional "return" values;
  7692                           thread-safe initialization of zlib tables
  7693        2.14  (2017-03-03) remove deprecated STBI_JPEG_OLD; fixes for Imagenet JPGs
  7694        2.13  (2016-11-29) add 16-bit API, only supported for PNG right now
  7695        2.12  (2016-04-02) fix typo in 2.11 PSD fix that caused crashes
  7696        2.11  (2016-04-02) allocate large structures on the stack
  7697                           remove white matting for transparent PSD
  7698                           fix reported channel count for PNG & BMP
  7699                           re-enable SSE2 in non-gcc 64-bit
  7700                           support RGB-formatted JPEG
  7701                           read 16-bit PNGs (only as 8-bit)
  7702        2.10  (2016-01-22) avoid warning introduced in 2.09 by STBI_REALLOC_SIZED
  7703        2.09  (2016-01-16) allow comments in PNM files
  7704                           16-bit-per-pixel TGA (not bit-per-component)
  7705                           info() for TGA could break due to .hdr handling
  7706                           info() for BMP to shares code instead of sloppy parse
  7707                           can use STBI_REALLOC_SIZED if allocator doesn't support realloc
  7708                           code cleanup
  7709        2.08  (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
  7710        2.07  (2015-09-13) fix compiler warnings
  7711                           partial animated GIF support
  7712                           limited 16-bpc PSD support
  7713                           #ifdef unused functions
  7714                           bug with < 92 byte PIC,PNM,HDR,TGA
  7715        2.06  (2015-04-19) fix bug where PSD returns wrong '*comp' value
  7716        2.05  (2015-04-19) fix bug in progressive JPEG handling, fix warning
  7717        2.04  (2015-04-15) try to re-enable SIMD on MinGW 64-bit
  7718        2.03  (2015-04-12) extra corruption checking (mmozeiko)
  7719                           stbi_set_flip_vertically_on_load (nguillemot)
  7720                           fix NEON support; fix mingw support
  7721        2.02  (2015-01-19) fix incorrect assert, fix warning
  7722        2.01  (2015-01-17) fix various warnings; suppress SIMD on gcc 32-bit without -msse2
  7723        2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
  7724        2.00  (2014-12-25) optimize JPG, including x86 SSE2 & NEON SIMD (ryg)
  7725                           progressive JPEG (stb)
  7726                           PGM/PPM support (Ken Miller)
  7727                           STBI_MALLOC,STBI_REALLOC,STBI_FREE
  7728                           GIF bugfix -- seemingly never worked
  7729                           STBI_NO_*, STBI_ONLY_*
  7730        1.48  (2014-12-14) fix incorrectly-named assert()
  7731        1.47  (2014-12-14) 1/2/4-bit PNG support, both direct and paletted (Omar Cornut & stb)
  7732                           optimize PNG (ryg)
  7733                           fix bug in interlaced PNG with user-specified channel count (stb)
  7734        1.46  (2014-08-26)
  7735                fix broken tRNS chunk (colorkey-style transparency) in non-paletted PNG
  7736        1.45  (2014-08-16)
  7737                fix MSVC-ARM internal compiler error by wrapping malloc
  7738        1.44  (2014-08-07)
  7739                various warning fixes from Ronny Chevalier
  7740        1.43  (2014-07-15)
  7741                fix MSVC-only compiler problem in code changed in 1.42
  7742        1.42  (2014-07-09)
  7743                don't define _CRT_SECURE_NO_WARNINGS (affects user code)
  7744                fixes to stbi__cleanup_jpeg path
  7745                added STBI_ASSERT to avoid requiring assert.h
  7746        1.41  (2014-06-25)
  7747                fix search&replace from 1.36 that messed up comments/error messages
  7748        1.40  (2014-06-22)
  7749                fix gcc struct-initialization warning
  7750        1.39  (2014-06-15)
  7751                fix to TGA optimization when req_comp != number of components in TGA;
  7752                fix to GIF loading because BMP wasn't rewinding (whoops, no GIFs in my test suite)
  7753                add support for BMP version 5 (more ignored fields)
  7754        1.38  (2014-06-06)
  7755                suppress MSVC warnings on integer casts truncating values
  7756                fix accidental rename of 'skip' field of I/O
  7757        1.37  (2014-06-04)
  7758                remove duplicate typedef
  7759        1.36  (2014-06-03)
  7760                convert to header file single-file library
  7761                if de-iphone isn't set, load iphone images color-swapped instead of returning NULL
  7762        1.35  (2014-05-27)
  7763                various warnings
  7764                fix broken STBI_SIMD path
  7765                fix bug where stbi_load_from_file no longer left file pointer in correct place
  7766                fix broken non-easy path for 32-bit BMP (possibly never used)
  7767                TGA optimization by Arseny Kapoulkine
  7768        1.34  (unknown)
  7769                use STBI_NOTUSED in stbi__resample_row_generic(), fix one more leak in tga failure case
  7770        1.33  (2011-07-14)
  7771                make stbi_is_hdr work in STBI_NO_HDR (as specified), minor compiler-friendly improvements
  7772        1.32  (2011-07-13)
  7773                support for "info" function for all supported filetypes (SpartanJ)
  7774        1.31  (2011-06-20)
  7775                a few more leak fixes, bug in PNG handling (SpartanJ)
  7776        1.30  (2011-06-11)
  7777                added ability to load files via callbacks to accomidate custom input streams (Ben Wenger)
  7778                removed deprecated format-specific test/load functions
  7779                removed support for installable file formats (stbi_loader) -- would have been broken for IO callbacks anyway
  7780                error cases in bmp and tga give messages and don't leak (Raymond Barbiero, grisha)
  7781                fix inefficiency in decoding 32-bit BMP (David Woo)
  7782        1.29  (2010-08-16)
  7783                various warning fixes from Aurelien Pocheville
  7784        1.28  (2010-08-01)
  7785                fix bug in GIF palette transparency (SpartanJ)
  7786        1.27  (2010-08-01)
  7787                cast-to-stbi_uc to fix warnings
  7788        1.26  (2010-07-24)
  7789                fix bug in file buffering for PNG reported by SpartanJ
  7790        1.25  (2010-07-17)
  7791                refix trans_data warning (Won Chun)
  7792        1.24  (2010-07-12)
  7793                perf improvements reading from files on platforms with lock-heavy fgetc()
  7794                minor perf improvements for jpeg
  7795                deprecated type-specific functions so we'll get feedback if they're needed
  7796                attempt to fix trans_data warning (Won Chun)
  7797        1.23    fixed bug in iPhone support
  7798        1.22  (2010-07-10)
  7799                removed image *writing* support
  7800                stbi_info support from Jetro Lauha
  7801                GIF support from Jean-Marc Lienher
  7802                iPhone PNG-extensions from James Brown
  7803                warning-fixes from Nicolas Schulz and Janez Zemva (i.stbi__err. Janez (U+017D)emva)
  7804        1.21    fix use of 'stbi_uc' in header (reported by jon blow)
  7805        1.20    added support for Softimage PIC, by Tom Seddon
  7806        1.19    bug in interlaced PNG corruption check (found by ryg)
  7807        1.18  (2008-08-02)
  7808                fix a threading bug (local mutable static)
  7809        1.17    support interlaced PNG
  7810        1.16    major bugfix - stbi__convert_format converted one too many pixels
  7811        1.15    initialize some fields for thread safety
  7812        1.14    fix threadsafe conversion bug
  7813                header-file-only version (#define STBI_HEADER_FILE_ONLY before including)
  7814        1.13    threadsafe
  7815        1.12    const qualifiers in the API
  7816        1.11    Support installable IDCT, colorspace conversion routines
  7817        1.10    Fixes for 64-bit (don't use "unsigned long")
  7818                optimized upsampling by Fabian "ryg" Giesen
  7819        1.09    Fix format-conversion for PSD code (bad global variables!)
  7820        1.08    Thatcher Ulrich's PSD code integrated by Nicolas Schulz
  7821        1.07    attempt to fix C++ warning/errors again
  7822        1.06    attempt to fix C++ warning/errors again
  7823        1.05    fix TGA loading to return correct *comp and use good luminance calc
  7824        1.04    default float alpha is 1, not 255; use 'void *' for stbi_image_free
  7825        1.03    bugfixes to STBI_NO_STDIO, STBI_NO_HDR
  7826        1.02    support for (subset of) HDR files, float interface for preferred access to them
  7827        1.01    fix bug: possible bug in handling right-side up bmps... not sure
  7828                fix bug: the stbi__bmp_load() and stbi__tga_load() functions didn't work at all
  7829        1.00    interface to zlib that skips zlib header
  7830        0.99    correct handling of alpha in palette
  7831        0.98    TGA loader by lonesock; dynamically add loaders (untested)
  7832        0.97    jpeg errors on too large a file; also catch another malloc failure
  7833        0.96    fix detection of invalid v value - particleman@mollyrocket forum
  7834        0.95    during header scan, seek to markers in case of padding
  7835        0.94    STBI_NO_STDIO to disable stdio usage; rename all #defines the same
  7836        0.93    handle jpegtran output; verbose errors
  7837        0.92    read 4,8,16,24,32-bit BMP files of several formats
  7838        0.91    output 24-bit Windows 3.0 BMP files
  7839        0.90    fix a few more warnings; bump version number to approach 1.0
  7840        0.61    bugfixes due to Marc LeBlanc, Christopher Lloyd
  7841        0.60    fix compiling as c++
  7842        0.59    fix warnings: merge Dave Moore's -Wall fixes
  7843        0.58    fix bug: zlib uncompressed mode len/nlen was wrong endian
  7844        0.57    fix bug: jpg last huffman symbol before marker was >9 bits but less than 16 available
  7845        0.56    fix bug: zlib uncompressed mode len vs. nlen
  7846        0.55    fix bug: restart_interval not initialized to 0
  7847        0.54    allow NULL for 'int *comp'
  7848        0.53    fix bug in png 3->4; speedup png decoding
  7849        0.52    png handles req_comp=3,4 directly; minor cleanup; jpeg comments
  7850        0.51    obey req_comp requests, 1-component jpegs return as 1-component,
  7851                on 'test' only check type, not whether we support this variant
  7852        0.50  (2006-11-19)
  7853                first released version
  7854  */
  7855  
  7856  
  7857  /*
  7858  ------------------------------------------------------------------------------
  7859  This software is available under 2 licenses -- choose whichever you prefer.
  7860  ------------------------------------------------------------------------------
  7861  ALTERNATIVE A - MIT License
  7862  Copyright (c) 2017 Sean Barrett
  7863  Permission is hereby granted, free of charge, to any person obtaining a copy of
  7864  this software and associated documentation files (the "Software"), to deal in
  7865  the Software without restriction, including without limitation the rights to
  7866  use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
  7867  of the Software, and to permit persons to whom the Software is furnished to do
  7868  so, subject to the following conditions:
  7869  The above copyright notice and this permission notice shall be included in all
  7870  copies or substantial portions of the Software.
  7871  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
  7872  IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
  7873  FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
  7874  AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
  7875  LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
  7876  OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
  7877  SOFTWARE.
  7878  ------------------------------------------------------------------------------
  7879  ALTERNATIVE B - Public Domain (www.unlicense.org)
  7880  This is free and unencumbered software released into the public domain.
  7881  Anyone is free to copy, modify, publish, use, compile, sell, or distribute this
  7882  software, either in source code form or as a compiled binary, for any purpose,
  7883  commercial or non-commercial, and by any means.
  7884  In jurisdictions that recognize copyright laws, the author or authors of this
  7885  software dedicate any and all copyright interest in the software to the public
  7886  domain. We make this dedication for the benefit of the public at large and to
  7887  the detriment of our heirs and successors. We intend this dedication to be an
  7888  overt act of relinquishment in perpetuity of all present and future rights to
  7889  this software under copyright law.
  7890  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
  7891  IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
  7892  FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
  7893  AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
  7894  ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
  7895  WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
  7896  ------------------------------------------------------------------------------
  7897  */