github.com/Konstantin8105/c4go@v0.0.0-20240505174241-768bb1c65a51/tests/raylib/external/dr_flac.h (about) 1 /* 2 FLAC audio decoder. Choice of public domain or MIT-0. See license statements at the end of this file. 3 dr_flac - v0.12.31 - 2021-08-16 4 5 David Reid - mackron@gmail.com 6 7 GitHub: https://github.com/mackron/dr_libs 8 */ 9 10 /* 11 RELEASE NOTES - v0.12.0 12 ======================= 13 Version 0.12.0 has breaking API changes including changes to the existing API and the removal of deprecated APIs. 14 15 16 Improved Client-Defined Memory Allocation 17 ----------------------------------------- 18 The main change with this release is the addition of a more flexible way of implementing custom memory allocation routines. The 19 existing system of DRFLAC_MALLOC, DRFLAC_REALLOC and DRFLAC_FREE are still in place and will be used by default when no custom 20 allocation callbacks are specified. 21 22 To use the new system, you pass in a pointer to a drflac_allocation_callbacks object to drflac_open() and family, like this: 23 24 void* my_malloc(size_t sz, void* pUserData) 25 { 26 return malloc(sz); 27 } 28 void* my_realloc(void* p, size_t sz, void* pUserData) 29 { 30 return realloc(p, sz); 31 } 32 void my_free(void* p, void* pUserData) 33 { 34 free(p); 35 } 36 37 ... 38 39 drflac_allocation_callbacks allocationCallbacks; 40 allocationCallbacks.pUserData = &myData; 41 allocationCallbacks.onMalloc = my_malloc; 42 allocationCallbacks.onRealloc = my_realloc; 43 allocationCallbacks.onFree = my_free; 44 drflac* pFlac = drflac_open_file("my_file.flac", &allocationCallbacks); 45 46 The advantage of this new system is that it allows you to specify user data which will be passed in to the allocation routines. 47 48 Passing in null for the allocation callbacks object will cause dr_flac to use defaults which is the same as DRFLAC_MALLOC, 49 DRFLAC_REALLOC and DRFLAC_FREE and the equivalent of how it worked in previous versions. 50 51 Every API that opens a drflac object now takes this extra parameter. These include the following: 52 53 drflac_open() 54 drflac_open_relaxed() 55 drflac_open_with_metadata() 56 drflac_open_with_metadata_relaxed() 57 drflac_open_file() 58 drflac_open_file_with_metadata() 59 drflac_open_memory() 60 drflac_open_memory_with_metadata() 61 drflac_open_and_read_pcm_frames_s32() 62 drflac_open_and_read_pcm_frames_s16() 63 drflac_open_and_read_pcm_frames_f32() 64 drflac_open_file_and_read_pcm_frames_s32() 65 drflac_open_file_and_read_pcm_frames_s16() 66 drflac_open_file_and_read_pcm_frames_f32() 67 drflac_open_memory_and_read_pcm_frames_s32() 68 drflac_open_memory_and_read_pcm_frames_s16() 69 drflac_open_memory_and_read_pcm_frames_f32() 70 71 72 73 Optimizations 74 ------------- 75 Seeking performance has been greatly improved. A new binary search based seeking algorithm has been introduced which significantly 76 improves performance over the brute force method which was used when no seek table was present. Seek table based seeking also takes 77 advantage of the new binary search seeking system to further improve performance there as well. Note that this depends on CRC which 78 means it will be disabled when DR_FLAC_NO_CRC is used. 79 80 The SSE4.1 pipeline has been cleaned up and optimized. You should see some improvements with decoding speed of 24-bit files in 81 particular. 16-bit streams should also see some improvement. 82 83 drflac_read_pcm_frames_s16() has been optimized. Previously this sat on top of drflac_read_pcm_frames_s32() and performed it's s32 84 to s16 conversion in a second pass. This is now all done in a single pass. This includes SSE2 and ARM NEON optimized paths. 85 86 A minor optimization has been implemented for drflac_read_pcm_frames_s32(). This will now use an SSE2 optimized pipeline for stereo 87 channel reconstruction which is the last part of the decoding process. 88 89 The ARM build has seen a few improvements. The CLZ (count leading zeroes) and REV (byte swap) instructions are now used when 90 compiling with GCC and Clang which is achieved using inline assembly. The CLZ instruction requires ARM architecture version 5 at 91 compile time and the REV instruction requires ARM architecture version 6. 92 93 An ARM NEON optimized pipeline has been implemented. To enable this you'll need to add -mfpu=neon to the command line when compiling. 94 95 96 Removed APIs 97 ------------ 98 The following APIs were deprecated in version 0.11.0 and have been completely removed in version 0.12.0: 99 100 drflac_read_s32() -> drflac_read_pcm_frames_s32() 101 drflac_read_s16() -> drflac_read_pcm_frames_s16() 102 drflac_read_f32() -> drflac_read_pcm_frames_f32() 103 drflac_seek_to_sample() -> drflac_seek_to_pcm_frame() 104 drflac_open_and_decode_s32() -> drflac_open_and_read_pcm_frames_s32() 105 drflac_open_and_decode_s16() -> drflac_open_and_read_pcm_frames_s16() 106 drflac_open_and_decode_f32() -> drflac_open_and_read_pcm_frames_f32() 107 drflac_open_and_decode_file_s32() -> drflac_open_file_and_read_pcm_frames_s32() 108 drflac_open_and_decode_file_s16() -> drflac_open_file_and_read_pcm_frames_s16() 109 drflac_open_and_decode_file_f32() -> drflac_open_file_and_read_pcm_frames_f32() 110 drflac_open_and_decode_memory_s32() -> drflac_open_memory_and_read_pcm_frames_s32() 111 drflac_open_and_decode_memory_s16() -> drflac_open_memory_and_read_pcm_frames_s16() 112 drflac_open_and_decode_memory_f32() -> drflac_open_memroy_and_read_pcm_frames_f32() 113 114 Prior versions of dr_flac operated on a per-sample basis whereas now it operates on PCM frames. The removed APIs all relate 115 to the old per-sample APIs. You now need to use the "pcm_frame" versions. 116 */ 117 118 119 /* 120 Introduction 121 ============ 122 dr_flac is a single file library. To use it, do something like the following in one .c file. 123 124 ```c 125 #define DR_FLAC_IMPLEMENTATION 126 #include "dr_flac.h" 127 ``` 128 129 You can then #include this file in other parts of the program as you would with any other header file. To decode audio data, do something like the following: 130 131 ```c 132 drflac* pFlac = drflac_open_file("MySong.flac", NULL); 133 if (pFlac == NULL) { 134 // Failed to open FLAC file 135 } 136 137 drflac_int32* pSamples = malloc(pFlac->totalPCMFrameCount * pFlac->channels * sizeof(drflac_int32)); 138 drflac_uint64 numberOfInterleavedSamplesActuallyRead = drflac_read_pcm_frames_s32(pFlac, pFlac->totalPCMFrameCount, pSamples); 139 ``` 140 141 The drflac object represents the decoder. It is a transparent type so all the information you need, such as the number of channels and the bits per sample, 142 should be directly accessible - just make sure you don't change their values. Samples are always output as interleaved signed 32-bit PCM. In the example above 143 a native FLAC stream was opened, however dr_flac has seamless support for Ogg encapsulated FLAC streams as well. 144 145 You do not need to decode the entire stream in one go - you just specify how many samples you'd like at any given time and the decoder will give you as many 146 samples as it can, up to the amount requested. Later on when you need the next batch of samples, just call it again. Example: 147 148 ```c 149 while (drflac_read_pcm_frames_s32(pFlac, chunkSizeInPCMFrames, pChunkSamples) > 0) { 150 do_something(); 151 } 152 ``` 153 154 You can seek to a specific PCM frame with `drflac_seek_to_pcm_frame()`. 155 156 If you just want to quickly decode an entire FLAC file in one go you can do something like this: 157 158 ```c 159 unsigned int channels; 160 unsigned int sampleRate; 161 drflac_uint64 totalPCMFrameCount; 162 drflac_int32* pSampleData = drflac_open_file_and_read_pcm_frames_s32("MySong.flac", &channels, &sampleRate, &totalPCMFrameCount, NULL); 163 if (pSampleData == NULL) { 164 // Failed to open and decode FLAC file. 165 } 166 167 ... 168 169 drflac_free(pSampleData, NULL); 170 ``` 171 172 You can read samples as signed 16-bit integer and 32-bit floating-point PCM with the *_s16() and *_f32() family of APIs respectively, but note that these 173 should be considered lossy. 174 175 176 If you need access to metadata (album art, etc.), use `drflac_open_with_metadata()`, `drflac_open_file_with_metdata()` or `drflac_open_memory_with_metadata()`. 177 The rationale for keeping these APIs separate is that they're slightly slower than the normal versions and also just a little bit harder to use. dr_flac 178 reports metadata to the application through the use of a callback, and every metadata block is reported before `drflac_open_with_metdata()` returns. 179 180 The main opening APIs (`drflac_open()`, etc.) will fail if the header is not present. The presents a problem in certain scenarios such as broadcast style 181 streams or internet radio where the header may not be present because the user has started playback mid-stream. To handle this, use the relaxed APIs: 182 183 `drflac_open_relaxed()` 184 `drflac_open_with_metadata_relaxed()` 185 186 It is not recommended to use these APIs for file based streams because a missing header would usually indicate a corrupt or perverse file. In addition, these 187 APIs can take a long time to initialize because they may need to spend a lot of time finding the first frame. 188 189 190 191 Build Options 192 ============= 193 #define these options before including this file. 194 195 #define DR_FLAC_NO_STDIO 196 Disable `drflac_open_file()` and family. 197 198 #define DR_FLAC_NO_OGG 199 Disables support for Ogg/FLAC streams. 200 201 #define DR_FLAC_BUFFER_SIZE <number> 202 Defines the size of the internal buffer to store data from onRead(). This buffer is used to reduce the number of calls back to the client for more data. 203 Larger values means more memory, but better performance. My tests show diminishing returns after about 4KB (which is the default). Consider reducing this if 204 you have a very efficient implementation of onRead(), or increase it if it's very inefficient. Must be a multiple of 8. 205 206 #define DR_FLAC_NO_CRC 207 Disables CRC checks. This will offer a performance boost when CRC is unnecessary. This will disable binary search seeking. When seeking, the seek table will 208 be used if available. Otherwise the seek will be performed using brute force. 209 210 #define DR_FLAC_NO_SIMD 211 Disables SIMD optimizations (SSE on x86/x64 architectures, NEON on ARM architectures). Use this if you are having compatibility issues with your compiler. 212 213 214 215 Notes 216 ===== 217 - dr_flac does not support changing the sample rate nor channel count mid stream. 218 - dr_flac is not thread-safe, but its APIs can be called from any thread so long as you do your own synchronization. 219 - When using Ogg encapsulation, a corrupted metadata block will result in `drflac_open_with_metadata()` and `drflac_open()` returning inconsistent samples due 220 to differences in corrupted stream recorvery logic between the two APIs. 221 */ 222 223 #ifndef dr_flac_h 224 #define dr_flac_h 225 226 #ifdef __cplusplus 227 extern "C" { 228 #endif 229 230 #define DRFLAC_STRINGIFY(x) #x 231 #define DRFLAC_XSTRINGIFY(x) DRFLAC_STRINGIFY(x) 232 233 #define DRFLAC_VERSION_MAJOR 0 234 #define DRFLAC_VERSION_MINOR 12 235 #define DRFLAC_VERSION_REVISION 31 236 #define DRFLAC_VERSION_STRING DRFLAC_XSTRINGIFY(DRFLAC_VERSION_MAJOR) "." DRFLAC_XSTRINGIFY(DRFLAC_VERSION_MINOR) "." DRFLAC_XSTRINGIFY(DRFLAC_VERSION_REVISION) 237 238 #include <stddef.h> /* For size_t. */ 239 240 /* Sized types. */ 241 typedef signed char drflac_int8; 242 typedef unsigned char drflac_uint8; 243 typedef signed short drflac_int16; 244 typedef unsigned short drflac_uint16; 245 typedef signed int drflac_int32; 246 typedef unsigned int drflac_uint32; 247 #if defined(_MSC_VER) 248 typedef signed __int64 drflac_int64; 249 typedef unsigned __int64 drflac_uint64; 250 #else 251 #if defined(__clang__) || (defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6))) 252 #pragma GCC diagnostic push 253 #pragma GCC diagnostic ignored "-Wlong-long" 254 #if defined(__clang__) 255 #pragma GCC diagnostic ignored "-Wc++11-long-long" 256 #endif 257 #endif 258 typedef signed long long drflac_int64; 259 typedef unsigned long long drflac_uint64; 260 #if defined(__clang__) || (defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6))) 261 #pragma GCC diagnostic pop 262 #endif 263 #endif 264 #if defined(__LP64__) || defined(_WIN64) || (defined(__x86_64__) && !defined(__ILP32__)) || defined(_M_X64) || defined(__ia64) || defined (_M_IA64) || defined(__aarch64__) || defined(_M_ARM64) || defined(__powerpc64__) 265 typedef drflac_uint64 drflac_uintptr; 266 #else 267 typedef drflac_uint32 drflac_uintptr; 268 #endif 269 typedef drflac_uint8 drflac_bool8; 270 typedef drflac_uint32 drflac_bool32; 271 #define DRFLAC_TRUE 1 272 #define DRFLAC_FALSE 0 273 274 #if !defined(DRFLAC_API) 275 #if defined(DRFLAC_DLL) 276 #if defined(_WIN32) 277 #define DRFLAC_DLL_IMPORT __declspec(dllimport) 278 #define DRFLAC_DLL_EXPORT __declspec(dllexport) 279 #define DRFLAC_DLL_PRIVATE static 280 #else 281 #if defined(__GNUC__) && __GNUC__ >= 4 282 #define DRFLAC_DLL_IMPORT __attribute__((visibility("default"))) 283 #define DRFLAC_DLL_EXPORT __attribute__((visibility("default"))) 284 #define DRFLAC_DLL_PRIVATE __attribute__((visibility("hidden"))) 285 #else 286 #define DRFLAC_DLL_IMPORT 287 #define DRFLAC_DLL_EXPORT 288 #define DRFLAC_DLL_PRIVATE static 289 #endif 290 #endif 291 292 #if defined(DR_FLAC_IMPLEMENTATION) || defined(DRFLAC_IMPLEMENTATION) 293 #define DRFLAC_API DRFLAC_DLL_EXPORT 294 #else 295 #define DRFLAC_API DRFLAC_DLL_IMPORT 296 #endif 297 #define DRFLAC_PRIVATE DRFLAC_DLL_PRIVATE 298 #else 299 #define DRFLAC_API extern 300 #define DRFLAC_PRIVATE static 301 #endif 302 #endif 303 304 #if defined(_MSC_VER) && _MSC_VER >= 1700 /* Visual Studio 2012 */ 305 #define DRFLAC_DEPRECATED __declspec(deprecated) 306 #elif (defined(__GNUC__) && __GNUC__ >= 4) /* GCC 4 */ 307 #define DRFLAC_DEPRECATED __attribute__((deprecated)) 308 #elif defined(__has_feature) /* Clang */ 309 #if __has_feature(attribute_deprecated) 310 #define DRFLAC_DEPRECATED __attribute__((deprecated)) 311 #else 312 #define DRFLAC_DEPRECATED 313 #endif 314 #else 315 #define DRFLAC_DEPRECATED 316 #endif 317 318 DRFLAC_API void drflac_version(drflac_uint32* pMajor, drflac_uint32* pMinor, drflac_uint32* pRevision); 319 DRFLAC_API const char* drflac_version_string(void); 320 321 /* 322 As data is read from the client it is placed into an internal buffer for fast access. This controls the size of that buffer. Larger values means more speed, 323 but also more memory. In my testing there is diminishing returns after about 4KB, but you can fiddle with this to suit your own needs. Must be a multiple of 8. 324 */ 325 #ifndef DR_FLAC_BUFFER_SIZE 326 #define DR_FLAC_BUFFER_SIZE 4096 327 #endif 328 329 /* Check if we can enable 64-bit optimizations. */ 330 #if defined(_WIN64) || defined(_LP64) || defined(__LP64__) 331 #define DRFLAC_64BIT 332 #endif 333 334 #ifdef DRFLAC_64BIT 335 typedef drflac_uint64 drflac_cache_t; 336 #else 337 typedef drflac_uint32 drflac_cache_t; 338 #endif 339 340 /* The various metadata block types. */ 341 #define DRFLAC_METADATA_BLOCK_TYPE_STREAMINFO 0 342 #define DRFLAC_METADATA_BLOCK_TYPE_PADDING 1 343 #define DRFLAC_METADATA_BLOCK_TYPE_APPLICATION 2 344 #define DRFLAC_METADATA_BLOCK_TYPE_SEEKTABLE 3 345 #define DRFLAC_METADATA_BLOCK_TYPE_VORBIS_COMMENT 4 346 #define DRFLAC_METADATA_BLOCK_TYPE_CUESHEET 5 347 #define DRFLAC_METADATA_BLOCK_TYPE_PICTURE 6 348 #define DRFLAC_METADATA_BLOCK_TYPE_INVALID 127 349 350 /* The various picture types specified in the PICTURE block. */ 351 #define DRFLAC_PICTURE_TYPE_OTHER 0 352 #define DRFLAC_PICTURE_TYPE_FILE_ICON 1 353 #define DRFLAC_PICTURE_TYPE_OTHER_FILE_ICON 2 354 #define DRFLAC_PICTURE_TYPE_COVER_FRONT 3 355 #define DRFLAC_PICTURE_TYPE_COVER_BACK 4 356 #define DRFLAC_PICTURE_TYPE_LEAFLET_PAGE 5 357 #define DRFLAC_PICTURE_TYPE_MEDIA 6 358 #define DRFLAC_PICTURE_TYPE_LEAD_ARTIST 7 359 #define DRFLAC_PICTURE_TYPE_ARTIST 8 360 #define DRFLAC_PICTURE_TYPE_CONDUCTOR 9 361 #define DRFLAC_PICTURE_TYPE_BAND 10 362 #define DRFLAC_PICTURE_TYPE_COMPOSER 11 363 #define DRFLAC_PICTURE_TYPE_LYRICIST 12 364 #define DRFLAC_PICTURE_TYPE_RECORDING_LOCATION 13 365 #define DRFLAC_PICTURE_TYPE_DURING_RECORDING 14 366 #define DRFLAC_PICTURE_TYPE_DURING_PERFORMANCE 15 367 #define DRFLAC_PICTURE_TYPE_SCREEN_CAPTURE 16 368 #define DRFLAC_PICTURE_TYPE_BRIGHT_COLORED_FISH 17 369 #define DRFLAC_PICTURE_TYPE_ILLUSTRATION 18 370 #define DRFLAC_PICTURE_TYPE_BAND_LOGOTYPE 19 371 #define DRFLAC_PICTURE_TYPE_PUBLISHER_LOGOTYPE 20 372 373 typedef enum 374 { 375 drflac_container_native, 376 drflac_container_ogg, 377 drflac_container_unknown 378 } drflac_container; 379 380 typedef enum 381 { 382 drflac_seek_origin_start, 383 drflac_seek_origin_current 384 } drflac_seek_origin; 385 386 /* Packing is important on this structure because we map this directly to the raw data within the SEEKTABLE metadata block. */ 387 #pragma pack(2) 388 typedef struct 389 { 390 drflac_uint64 firstPCMFrame; 391 drflac_uint64 flacFrameOffset; /* The offset from the first byte of the header of the first frame. */ 392 drflac_uint16 pcmFrameCount; 393 } drflac_seekpoint; 394 #pragma pack() 395 396 typedef struct 397 { 398 drflac_uint16 minBlockSizeInPCMFrames; 399 drflac_uint16 maxBlockSizeInPCMFrames; 400 drflac_uint32 minFrameSizeInPCMFrames; 401 drflac_uint32 maxFrameSizeInPCMFrames; 402 drflac_uint32 sampleRate; 403 drflac_uint8 channels; 404 drflac_uint8 bitsPerSample; 405 drflac_uint64 totalPCMFrameCount; 406 drflac_uint8 md5[16]; 407 } drflac_streaminfo; 408 409 typedef struct 410 { 411 /* 412 The metadata type. Use this to know how to interpret the data below. Will be set to one of the 413 DRFLAC_METADATA_BLOCK_TYPE_* tokens. 414 */ 415 drflac_uint32 type; 416 417 /* 418 A pointer to the raw data. This points to a temporary buffer so don't hold on to it. It's best to 419 not modify the contents of this buffer. Use the structures below for more meaningful and structured 420 information about the metadata. It's possible for this to be null. 421 */ 422 const void* pRawData; 423 424 /* The size in bytes of the block and the buffer pointed to by pRawData if it's non-NULL. */ 425 drflac_uint32 rawDataSize; 426 427 union 428 { 429 drflac_streaminfo streaminfo; 430 431 struct 432 { 433 int unused; 434 } padding; 435 436 struct 437 { 438 drflac_uint32 id; 439 const void* pData; 440 drflac_uint32 dataSize; 441 } application; 442 443 struct 444 { 445 drflac_uint32 seekpointCount; 446 const drflac_seekpoint* pSeekpoints; 447 } seektable; 448 449 struct 450 { 451 drflac_uint32 vendorLength; 452 const char* vendor; 453 drflac_uint32 commentCount; 454 const void* pComments; 455 } vorbis_comment; 456 457 struct 458 { 459 char catalog[128]; 460 drflac_uint64 leadInSampleCount; 461 drflac_bool32 isCD; 462 drflac_uint8 trackCount; 463 const void* pTrackData; 464 } cuesheet; 465 466 struct 467 { 468 drflac_uint32 type; 469 drflac_uint32 mimeLength; 470 const char* mime; 471 drflac_uint32 descriptionLength; 472 const char* description; 473 drflac_uint32 width; 474 drflac_uint32 height; 475 drflac_uint32 colorDepth; 476 drflac_uint32 indexColorCount; 477 drflac_uint32 pictureDataSize; 478 const drflac_uint8* pPictureData; 479 } picture; 480 } data; 481 } drflac_metadata; 482 483 484 /* 485 Callback for when data needs to be read from the client. 486 487 488 Parameters 489 ---------- 490 pUserData (in) 491 The user data that was passed to drflac_open() and family. 492 493 pBufferOut (out) 494 The output buffer. 495 496 bytesToRead (in) 497 The number of bytes to read. 498 499 500 Return Value 501 ------------ 502 The number of bytes actually read. 503 504 505 Remarks 506 ------- 507 A return value of less than bytesToRead indicates the end of the stream. Do _not_ return from this callback until either the entire bytesToRead is filled or 508 you have reached the end of the stream. 509 */ 510 typedef size_t (* drflac_read_proc)(void* pUserData, void* pBufferOut, size_t bytesToRead); 511 512 /* 513 Callback for when data needs to be seeked. 514 515 516 Parameters 517 ---------- 518 pUserData (in) 519 The user data that was passed to drflac_open() and family. 520 521 offset (in) 522 The number of bytes to move, relative to the origin. Will never be negative. 523 524 origin (in) 525 The origin of the seek - the current position or the start of the stream. 526 527 528 Return Value 529 ------------ 530 Whether or not the seek was successful. 531 532 533 Remarks 534 ------- 535 The offset will never be negative. Whether or not it is relative to the beginning or current position is determined by the "origin" parameter which will be 536 either drflac_seek_origin_start or drflac_seek_origin_current. 537 538 When seeking to a PCM frame using drflac_seek_to_pcm_frame(), dr_flac may call this with an offset beyond the end of the FLAC stream. This needs to be detected 539 and handled by returning DRFLAC_FALSE. 540 */ 541 typedef drflac_bool32 (* drflac_seek_proc)(void* pUserData, int offset, drflac_seek_origin origin); 542 543 /* 544 Callback for when a metadata block is read. 545 546 547 Parameters 548 ---------- 549 pUserData (in) 550 The user data that was passed to drflac_open() and family. 551 552 pMetadata (in) 553 A pointer to a structure containing the data of the metadata block. 554 555 556 Remarks 557 ------- 558 Use pMetadata->type to determine which metadata block is being handled and how to read the data. This 559 will be set to one of the DRFLAC_METADATA_BLOCK_TYPE_* tokens. 560 */ 561 typedef void (* drflac_meta_proc)(void* pUserData, drflac_metadata* pMetadata); 562 563 564 typedef struct 565 { 566 void* pUserData; 567 void* (* onMalloc)(size_t sz, void* pUserData); 568 void* (* onRealloc)(void* p, size_t sz, void* pUserData); 569 void (* onFree)(void* p, void* pUserData); 570 } drflac_allocation_callbacks; 571 572 /* Structure for internal use. Only used for decoders opened with drflac_open_memory. */ 573 typedef struct 574 { 575 const drflac_uint8* data; 576 size_t dataSize; 577 size_t currentReadPos; 578 } drflac__memory_stream; 579 580 /* Structure for internal use. Used for bit streaming. */ 581 typedef struct 582 { 583 /* The function to call when more data needs to be read. */ 584 drflac_read_proc onRead; 585 586 /* The function to call when the current read position needs to be moved. */ 587 drflac_seek_proc onSeek; 588 589 /* The user data to pass around to onRead and onSeek. */ 590 void* pUserData; 591 592 593 /* 594 The number of unaligned bytes in the L2 cache. This will always be 0 until the end of the stream is hit. At the end of the 595 stream there will be a number of bytes that don't cleanly fit in an L1 cache line, so we use this variable to know whether 596 or not the bistreamer needs to run on a slower path to read those last bytes. This will never be more than sizeof(drflac_cache_t). 597 */ 598 size_t unalignedByteCount; 599 600 /* The content of the unaligned bytes. */ 601 drflac_cache_t unalignedCache; 602 603 /* The index of the next valid cache line in the "L2" cache. */ 604 drflac_uint32 nextL2Line; 605 606 /* The number of bits that have been consumed by the cache. This is used to determine how many valid bits are remaining. */ 607 drflac_uint32 consumedBits; 608 609 /* 610 The cached data which was most recently read from the client. There are two levels of cache. Data flows as such: 611 Client -> L2 -> L1. The L2 -> L1 movement is aligned and runs on a fast path in just a few instructions. 612 */ 613 drflac_cache_t cacheL2[DR_FLAC_BUFFER_SIZE/sizeof(drflac_cache_t)]; 614 drflac_cache_t cache; 615 616 /* 617 CRC-16. This is updated whenever bits are read from the bit stream. Manually set this to 0 to reset the CRC. For FLAC, this 618 is reset to 0 at the beginning of each frame. 619 */ 620 drflac_uint16 crc16; 621 drflac_cache_t crc16Cache; /* A cache for optimizing CRC calculations. This is filled when when the L1 cache is reloaded. */ 622 drflac_uint32 crc16CacheIgnoredBytes; /* The number of bytes to ignore when updating the CRC-16 from the CRC-16 cache. */ 623 } drflac_bs; 624 625 typedef struct 626 { 627 /* The type of the subframe: SUBFRAME_CONSTANT, SUBFRAME_VERBATIM, SUBFRAME_FIXED or SUBFRAME_LPC. */ 628 drflac_uint8 subframeType; 629 630 /* The number of wasted bits per sample as specified by the sub-frame header. */ 631 drflac_uint8 wastedBitsPerSample; 632 633 /* The order to use for the prediction stage for SUBFRAME_FIXED and SUBFRAME_LPC. */ 634 drflac_uint8 lpcOrder; 635 636 /* A pointer to the buffer containing the decoded samples in the subframe. This pointer is an offset from drflac::pExtraData. */ 637 drflac_int32* pSamplesS32; 638 } drflac_subframe; 639 640 typedef struct 641 { 642 /* 643 If the stream uses variable block sizes, this will be set to the index of the first PCM frame. If fixed block sizes are used, this will 644 always be set to 0. This is 64-bit because the decoded PCM frame number will be 36 bits. 645 */ 646 drflac_uint64 pcmFrameNumber; 647 648 /* 649 If the stream uses fixed block sizes, this will be set to the frame number. If variable block sizes are used, this will always be 0. This 650 is 32-bit because in fixed block sizes, the maximum frame number will be 31 bits. 651 */ 652 drflac_uint32 flacFrameNumber; 653 654 /* The sample rate of this frame. */ 655 drflac_uint32 sampleRate; 656 657 /* The number of PCM frames in each sub-frame within this frame. */ 658 drflac_uint16 blockSizeInPCMFrames; 659 660 /* 661 The channel assignment of this frame. This is not always set to the channel count. If interchannel decorrelation is being used this 662 will be set to DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE, DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE or DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE. 663 */ 664 drflac_uint8 channelAssignment; 665 666 /* The number of bits per sample within this frame. */ 667 drflac_uint8 bitsPerSample; 668 669 /* The frame's CRC. */ 670 drflac_uint8 crc8; 671 } drflac_frame_header; 672 673 typedef struct 674 { 675 /* The header. */ 676 drflac_frame_header header; 677 678 /* 679 The number of PCM frames left to be read in this FLAC frame. This is initially set to the block size. As PCM frames are read, 680 this will be decremented. When it reaches 0, the decoder will see this frame as fully consumed and load the next frame. 681 */ 682 drflac_uint32 pcmFramesRemaining; 683 684 /* The list of sub-frames within the frame. There is one sub-frame for each channel, and there's a maximum of 8 channels. */ 685 drflac_subframe subframes[8]; 686 } drflac_frame; 687 688 typedef struct 689 { 690 /* The function to call when a metadata block is read. */ 691 drflac_meta_proc onMeta; 692 693 /* The user data posted to the metadata callback function. */ 694 void* pUserDataMD; 695 696 /* Memory allocation callbacks. */ 697 drflac_allocation_callbacks allocationCallbacks; 698 699 700 /* The sample rate. Will be set to something like 44100. */ 701 drflac_uint32 sampleRate; 702 703 /* 704 The number of channels. This will be set to 1 for monaural streams, 2 for stereo, etc. Maximum 8. This is set based on the 705 value specified in the STREAMINFO block. 706 */ 707 drflac_uint8 channels; 708 709 /* The bits per sample. Will be set to something like 16, 24, etc. */ 710 drflac_uint8 bitsPerSample; 711 712 /* The maximum block size, in samples. This number represents the number of samples in each channel (not combined). */ 713 drflac_uint16 maxBlockSizeInPCMFrames; 714 715 /* 716 The total number of PCM Frames making up the stream. Can be 0 in which case it's still a valid stream, but just means 717 the total PCM frame count is unknown. Likely the case with streams like internet radio. 718 */ 719 drflac_uint64 totalPCMFrameCount; 720 721 722 /* The container type. This is set based on whether or not the decoder was opened from a native or Ogg stream. */ 723 drflac_container container; 724 725 /* The number of seekpoints in the seektable. */ 726 drflac_uint32 seekpointCount; 727 728 729 /* Information about the frame the decoder is currently sitting on. */ 730 drflac_frame currentFLACFrame; 731 732 733 /* The index of the PCM frame the decoder is currently sitting on. This is only used for seeking. */ 734 drflac_uint64 currentPCMFrame; 735 736 /* The position of the first FLAC frame in the stream. This is only ever used for seeking. */ 737 drflac_uint64 firstFLACFramePosInBytes; 738 739 740 /* A hack to avoid a malloc() when opening a decoder with drflac_open_memory(). */ 741 drflac__memory_stream memoryStream; 742 743 744 /* A pointer to the decoded sample data. This is an offset of pExtraData. */ 745 drflac_int32* pDecodedSamples; 746 747 /* A pointer to the seek table. This is an offset of pExtraData, or NULL if there is no seek table. */ 748 drflac_seekpoint* pSeekpoints; 749 750 /* Internal use only. Only used with Ogg containers. Points to a drflac_oggbs object. This is an offset of pExtraData. */ 751 void* _oggbs; 752 753 /* Internal use only. Used for profiling and testing different seeking modes. */ 754 drflac_bool32 _noSeekTableSeek : 1; 755 drflac_bool32 _noBinarySearchSeek : 1; 756 drflac_bool32 _noBruteForceSeek : 1; 757 758 /* The bit streamer. The raw FLAC data is fed through this object. */ 759 drflac_bs bs; 760 761 /* Variable length extra data. We attach this to the end of the object so we can avoid unnecessary mallocs. */ 762 drflac_uint8 pExtraData[1]; 763 } drflac; 764 765 766 /* 767 Opens a FLAC decoder. 768 769 770 Parameters 771 ---------- 772 onRead (in) 773 The function to call when data needs to be read from the client. 774 775 onSeek (in) 776 The function to call when the read position of the client data needs to move. 777 778 pUserData (in, optional) 779 A pointer to application defined data that will be passed to onRead and onSeek. 780 781 pAllocationCallbacks (in, optional) 782 A pointer to application defined callbacks for managing memory allocations. 783 784 785 Return Value 786 ------------ 787 Returns a pointer to an object representing the decoder. 788 789 790 Remarks 791 ------- 792 Close the decoder with `drflac_close()`. 793 794 `pAllocationCallbacks` can be NULL in which case it will use `DRFLAC_MALLOC`, `DRFLAC_REALLOC` and `DRFLAC_FREE`. 795 796 This function will automatically detect whether or not you are attempting to open a native or Ogg encapsulated FLAC, both of which should work seamlessly 797 without any manual intervention. Ogg encapsulation also works with multiplexed streams which basically means it can play FLAC encoded audio tracks in videos. 798 799 This is the lowest level function for opening a FLAC stream. You can also use `drflac_open_file()` and `drflac_open_memory()` to open the stream from a file or 800 from a block of memory respectively. 801 802 The STREAMINFO block must be present for this to succeed. Use `drflac_open_relaxed()` to open a FLAC stream where the header may not be present. 803 804 Use `drflac_open_with_metadata()` if you need access to metadata. 805 806 807 Seek Also 808 --------- 809 drflac_open_file() 810 drflac_open_memory() 811 drflac_open_with_metadata() 812 drflac_close() 813 */ 814 DRFLAC_API drflac* drflac_open(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks); 815 816 /* 817 Opens a FLAC stream with relaxed validation of the header block. 818 819 820 Parameters 821 ---------- 822 onRead (in) 823 The function to call when data needs to be read from the client. 824 825 onSeek (in) 826 The function to call when the read position of the client data needs to move. 827 828 container (in) 829 Whether or not the FLAC stream is encapsulated using standard FLAC encapsulation or Ogg encapsulation. 830 831 pUserData (in, optional) 832 A pointer to application defined data that will be passed to onRead and onSeek. 833 834 pAllocationCallbacks (in, optional) 835 A pointer to application defined callbacks for managing memory allocations. 836 837 838 Return Value 839 ------------ 840 A pointer to an object representing the decoder. 841 842 843 Remarks 844 ------- 845 The same as drflac_open(), except attempts to open the stream even when a header block is not present. 846 847 Because the header is not necessarily available, the caller must explicitly define the container (Native or Ogg). Do not set this to `drflac_container_unknown` 848 as that is for internal use only. 849 850 Opening in relaxed mode will continue reading data from onRead until it finds a valid frame. If a frame is never found it will continue forever. To abort, 851 force your `onRead` callback to return 0, which dr_flac will use as an indicator that the end of the stream was found. 852 853 Use `drflac_open_with_metadata_relaxed()` if you need access to metadata. 854 */ 855 DRFLAC_API drflac* drflac_open_relaxed(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_container container, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks); 856 857 /* 858 Opens a FLAC decoder and notifies the caller of the metadata chunks (album art, etc.). 859 860 861 Parameters 862 ---------- 863 onRead (in) 864 The function to call when data needs to be read from the client. 865 866 onSeek (in) 867 The function to call when the read position of the client data needs to move. 868 869 onMeta (in) 870 The function to call for every metadata block. 871 872 pUserData (in, optional) 873 A pointer to application defined data that will be passed to onRead, onSeek and onMeta. 874 875 pAllocationCallbacks (in, optional) 876 A pointer to application defined callbacks for managing memory allocations. 877 878 879 Return Value 880 ------------ 881 A pointer to an object representing the decoder. 882 883 884 Remarks 885 ------- 886 Close the decoder with `drflac_close()`. 887 888 `pAllocationCallbacks` can be NULL in which case it will use `DRFLAC_MALLOC`, `DRFLAC_REALLOC` and `DRFLAC_FREE`. 889 890 This is slower than `drflac_open()`, so avoid this one if you don't need metadata. Internally, this will allocate and free memory on the heap for every 891 metadata block except for STREAMINFO and PADDING blocks. 892 893 The caller is notified of the metadata via the `onMeta` callback. All metadata blocks will be handled before the function returns. This callback takes a 894 pointer to a `drflac_metadata` object which is a union containing the data of all relevant metadata blocks. Use the `type` member to discriminate against 895 the different metadata types. 896 897 The STREAMINFO block must be present for this to succeed. Use `drflac_open_with_metadata_relaxed()` to open a FLAC stream where the header may not be present. 898 899 Note that this will behave inconsistently with `drflac_open()` if the stream is an Ogg encapsulated stream and a metadata block is corrupted. This is due to 900 the way the Ogg stream recovers from corrupted pages. When `drflac_open_with_metadata()` is being used, the open routine will try to read the contents of the 901 metadata block, whereas `drflac_open()` will simply seek past it (for the sake of efficiency). This inconsistency can result in different samples being 902 returned depending on whether or not the stream is being opened with metadata. 903 904 905 Seek Also 906 --------- 907 drflac_open_file_with_metadata() 908 drflac_open_memory_with_metadata() 909 drflac_open() 910 drflac_close() 911 */ 912 DRFLAC_API drflac* drflac_open_with_metadata(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks); 913 914 /* 915 The same as drflac_open_with_metadata(), except attempts to open the stream even when a header block is not present. 916 917 See Also 918 -------- 919 drflac_open_with_metadata() 920 drflac_open_relaxed() 921 */ 922 DRFLAC_API drflac* drflac_open_with_metadata_relaxed(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, drflac_container container, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks); 923 924 /* 925 Closes the given FLAC decoder. 926 927 928 Parameters 929 ---------- 930 pFlac (in) 931 The decoder to close. 932 933 934 Remarks 935 ------- 936 This will destroy the decoder object. 937 938 939 See Also 940 -------- 941 drflac_open() 942 drflac_open_with_metadata() 943 drflac_open_file() 944 drflac_open_file_w() 945 drflac_open_file_with_metadata() 946 drflac_open_file_with_metadata_w() 947 drflac_open_memory() 948 drflac_open_memory_with_metadata() 949 */ 950 DRFLAC_API void drflac_close(drflac* pFlac); 951 952 953 /* 954 Reads sample data from the given FLAC decoder, output as interleaved signed 32-bit PCM. 955 956 957 Parameters 958 ---------- 959 pFlac (in) 960 The decoder. 961 962 framesToRead (in) 963 The number of PCM frames to read. 964 965 pBufferOut (out, optional) 966 A pointer to the buffer that will receive the decoded samples. 967 968 969 Return Value 970 ------------ 971 Returns the number of PCM frames actually read. If the return value is less than `framesToRead` it has reached the end. 972 973 974 Remarks 975 ------- 976 pBufferOut can be null, in which case the call will act as a seek, and the return value will be the number of frames seeked. 977 */ 978 DRFLAC_API drflac_uint64 drflac_read_pcm_frames_s32(drflac* pFlac, drflac_uint64 framesToRead, drflac_int32* pBufferOut); 979 980 981 /* 982 Reads sample data from the given FLAC decoder, output as interleaved signed 16-bit PCM. 983 984 985 Parameters 986 ---------- 987 pFlac (in) 988 The decoder. 989 990 framesToRead (in) 991 The number of PCM frames to read. 992 993 pBufferOut (out, optional) 994 A pointer to the buffer that will receive the decoded samples. 995 996 997 Return Value 998 ------------ 999 Returns the number of PCM frames actually read. If the return value is less than `framesToRead` it has reached the end. 1000 1001 1002 Remarks 1003 ------- 1004 pBufferOut can be null, in which case the call will act as a seek, and the return value will be the number of frames seeked. 1005 1006 Note that this is lossy for streams where the bits per sample is larger than 16. 1007 */ 1008 DRFLAC_API drflac_uint64 drflac_read_pcm_frames_s16(drflac* pFlac, drflac_uint64 framesToRead, drflac_int16* pBufferOut); 1009 1010 /* 1011 Reads sample data from the given FLAC decoder, output as interleaved 32-bit floating point PCM. 1012 1013 1014 Parameters 1015 ---------- 1016 pFlac (in) 1017 The decoder. 1018 1019 framesToRead (in) 1020 The number of PCM frames to read. 1021 1022 pBufferOut (out, optional) 1023 A pointer to the buffer that will receive the decoded samples. 1024 1025 1026 Return Value 1027 ------------ 1028 Returns the number of PCM frames actually read. If the return value is less than `framesToRead` it has reached the end. 1029 1030 1031 Remarks 1032 ------- 1033 pBufferOut can be null, in which case the call will act as a seek, and the return value will be the number of frames seeked. 1034 1035 Note that this should be considered lossy due to the nature of floating point numbers not being able to exactly represent every possible number. 1036 */ 1037 DRFLAC_API drflac_uint64 drflac_read_pcm_frames_f32(drflac* pFlac, drflac_uint64 framesToRead, float* pBufferOut); 1038 1039 /* 1040 Seeks to the PCM frame at the given index. 1041 1042 1043 Parameters 1044 ---------- 1045 pFlac (in) 1046 The decoder. 1047 1048 pcmFrameIndex (in) 1049 The index of the PCM frame to seek to. See notes below. 1050 1051 1052 Return Value 1053 ------------- 1054 `DRFLAC_TRUE` if successful; `DRFLAC_FALSE` otherwise. 1055 */ 1056 DRFLAC_API drflac_bool32 drflac_seek_to_pcm_frame(drflac* pFlac, drflac_uint64 pcmFrameIndex); 1057 1058 1059 1060 #ifndef DR_FLAC_NO_STDIO 1061 /* 1062 Opens a FLAC decoder from the file at the given path. 1063 1064 1065 Parameters 1066 ---------- 1067 pFileName (in) 1068 The path of the file to open, either absolute or relative to the current directory. 1069 1070 pAllocationCallbacks (in, optional) 1071 A pointer to application defined callbacks for managing memory allocations. 1072 1073 1074 Return Value 1075 ------------ 1076 A pointer to an object representing the decoder. 1077 1078 1079 Remarks 1080 ------- 1081 Close the decoder with drflac_close(). 1082 1083 1084 Remarks 1085 ------- 1086 This will hold a handle to the file until the decoder is closed with drflac_close(). Some platforms will restrict the number of files a process can have open 1087 at any given time, so keep this mind if you have many decoders open at the same time. 1088 1089 1090 See Also 1091 -------- 1092 drflac_open_file_with_metadata() 1093 drflac_open() 1094 drflac_close() 1095 */ 1096 DRFLAC_API drflac* drflac_open_file(const char* pFileName, const drflac_allocation_callbacks* pAllocationCallbacks); 1097 DRFLAC_API drflac* drflac_open_file_w(const wchar_t* pFileName, const drflac_allocation_callbacks* pAllocationCallbacks); 1098 1099 /* 1100 Opens a FLAC decoder from the file at the given path and notifies the caller of the metadata chunks (album art, etc.) 1101 1102 1103 Parameters 1104 ---------- 1105 pFileName (in) 1106 The path of the file to open, either absolute or relative to the current directory. 1107 1108 pAllocationCallbacks (in, optional) 1109 A pointer to application defined callbacks for managing memory allocations. 1110 1111 onMeta (in) 1112 The callback to fire for each metadata block. 1113 1114 pUserData (in) 1115 A pointer to the user data to pass to the metadata callback. 1116 1117 pAllocationCallbacks (in) 1118 A pointer to application defined callbacks for managing memory allocations. 1119 1120 1121 Remarks 1122 ------- 1123 Look at the documentation for drflac_open_with_metadata() for more information on how metadata is handled. 1124 1125 1126 See Also 1127 -------- 1128 drflac_open_with_metadata() 1129 drflac_open() 1130 drflac_close() 1131 */ 1132 DRFLAC_API drflac* drflac_open_file_with_metadata(const char* pFileName, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks); 1133 DRFLAC_API drflac* drflac_open_file_with_metadata_w(const wchar_t* pFileName, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks); 1134 #endif 1135 1136 /* 1137 Opens a FLAC decoder from a pre-allocated block of memory 1138 1139 1140 Parameters 1141 ---------- 1142 pData (in) 1143 A pointer to the raw encoded FLAC data. 1144 1145 dataSize (in) 1146 The size in bytes of `data`. 1147 1148 pAllocationCallbacks (in) 1149 A pointer to application defined callbacks for managing memory allocations. 1150 1151 1152 Return Value 1153 ------------ 1154 A pointer to an object representing the decoder. 1155 1156 1157 Remarks 1158 ------- 1159 This does not create a copy of the data. It is up to the application to ensure the buffer remains valid for the lifetime of the decoder. 1160 1161 1162 See Also 1163 -------- 1164 drflac_open() 1165 drflac_close() 1166 */ 1167 DRFLAC_API drflac* drflac_open_memory(const void* pData, size_t dataSize, const drflac_allocation_callbacks* pAllocationCallbacks); 1168 1169 /* 1170 Opens a FLAC decoder from a pre-allocated block of memory and notifies the caller of the metadata chunks (album art, etc.) 1171 1172 1173 Parameters 1174 ---------- 1175 pData (in) 1176 A pointer to the raw encoded FLAC data. 1177 1178 dataSize (in) 1179 The size in bytes of `data`. 1180 1181 onMeta (in) 1182 The callback to fire for each metadata block. 1183 1184 pUserData (in) 1185 A pointer to the user data to pass to the metadata callback. 1186 1187 pAllocationCallbacks (in) 1188 A pointer to application defined callbacks for managing memory allocations. 1189 1190 1191 Remarks 1192 ------- 1193 Look at the documentation for drflac_open_with_metadata() for more information on how metadata is handled. 1194 1195 1196 See Also 1197 ------- 1198 drflac_open_with_metadata() 1199 drflac_open() 1200 drflac_close() 1201 */ 1202 DRFLAC_API drflac* drflac_open_memory_with_metadata(const void* pData, size_t dataSize, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks); 1203 1204 1205 1206 /* High Level APIs */ 1207 1208 /* 1209 Opens a FLAC stream from the given callbacks and fully decodes it in a single operation. The return value is a 1210 pointer to the sample data as interleaved signed 32-bit PCM. The returned data must be freed with drflac_free(). 1211 1212 You can pass in custom memory allocation callbacks via the pAllocationCallbacks parameter. This can be NULL in which 1213 case it will use DRFLAC_MALLOC, DRFLAC_REALLOC and DRFLAC_FREE. 1214 1215 Sometimes a FLAC file won't keep track of the total sample count. In this situation the function will continuously 1216 read samples into a dynamically sized buffer on the heap until no samples are left. 1217 1218 Do not call this function on a broadcast type of stream (like internet radio streams and whatnot). 1219 */ 1220 DRFLAC_API drflac_int32* drflac_open_and_read_pcm_frames_s32(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks); 1221 1222 /* Same as drflac_open_and_read_pcm_frames_s32(), except returns signed 16-bit integer samples. */ 1223 DRFLAC_API drflac_int16* drflac_open_and_read_pcm_frames_s16(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks); 1224 1225 /* Same as drflac_open_and_read_pcm_frames_s32(), except returns 32-bit floating-point samples. */ 1226 DRFLAC_API float* drflac_open_and_read_pcm_frames_f32(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks); 1227 1228 #ifndef DR_FLAC_NO_STDIO 1229 /* Same as drflac_open_and_read_pcm_frames_s32() except opens the decoder from a file. */ 1230 DRFLAC_API drflac_int32* drflac_open_file_and_read_pcm_frames_s32(const char* filename, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks); 1231 1232 /* Same as drflac_open_file_and_read_pcm_frames_s32(), except returns signed 16-bit integer samples. */ 1233 DRFLAC_API drflac_int16* drflac_open_file_and_read_pcm_frames_s16(const char* filename, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks); 1234 1235 /* Same as drflac_open_file_and_read_pcm_frames_s32(), except returns 32-bit floating-point samples. */ 1236 DRFLAC_API float* drflac_open_file_and_read_pcm_frames_f32(const char* filename, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks); 1237 #endif 1238 1239 /* Same as drflac_open_and_read_pcm_frames_s32() except opens the decoder from a block of memory. */ 1240 DRFLAC_API drflac_int32* drflac_open_memory_and_read_pcm_frames_s32(const void* data, size_t dataSize, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks); 1241 1242 /* Same as drflac_open_memory_and_read_pcm_frames_s32(), except returns signed 16-bit integer samples. */ 1243 DRFLAC_API drflac_int16* drflac_open_memory_and_read_pcm_frames_s16(const void* data, size_t dataSize, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks); 1244 1245 /* Same as drflac_open_memory_and_read_pcm_frames_s32(), except returns 32-bit floating-point samples. */ 1246 DRFLAC_API float* drflac_open_memory_and_read_pcm_frames_f32(const void* data, size_t dataSize, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks); 1247 1248 /* 1249 Frees memory that was allocated internally by dr_flac. 1250 1251 Set pAllocationCallbacks to the same object that was passed to drflac_open_*_and_read_pcm_frames_*(). If you originally passed in NULL, pass in NULL for this. 1252 */ 1253 DRFLAC_API void drflac_free(void* p, const drflac_allocation_callbacks* pAllocationCallbacks); 1254 1255 1256 /* Structure representing an iterator for vorbis comments in a VORBIS_COMMENT metadata block. */ 1257 typedef struct 1258 { 1259 drflac_uint32 countRemaining; 1260 const char* pRunningData; 1261 } drflac_vorbis_comment_iterator; 1262 1263 /* 1264 Initializes a vorbis comment iterator. This can be used for iterating over the vorbis comments in a VORBIS_COMMENT 1265 metadata block. 1266 */ 1267 DRFLAC_API void drflac_init_vorbis_comment_iterator(drflac_vorbis_comment_iterator* pIter, drflac_uint32 commentCount, const void* pComments); 1268 1269 /* 1270 Goes to the next vorbis comment in the given iterator. If null is returned it means there are no more comments. The 1271 returned string is NOT null terminated. 1272 */ 1273 DRFLAC_API const char* drflac_next_vorbis_comment(drflac_vorbis_comment_iterator* pIter, drflac_uint32* pCommentLengthOut); 1274 1275 1276 /* Structure representing an iterator for cuesheet tracks in a CUESHEET metadata block. */ 1277 typedef struct 1278 { 1279 drflac_uint32 countRemaining; 1280 const char* pRunningData; 1281 } drflac_cuesheet_track_iterator; 1282 1283 /* Packing is important on this structure because we map this directly to the raw data within the CUESHEET metadata block. */ 1284 #pragma pack(4) 1285 typedef struct 1286 { 1287 drflac_uint64 offset; 1288 drflac_uint8 index; 1289 drflac_uint8 reserved[3]; 1290 } drflac_cuesheet_track_index; 1291 #pragma pack() 1292 1293 typedef struct 1294 { 1295 drflac_uint64 offset; 1296 drflac_uint8 trackNumber; 1297 char ISRC[12]; 1298 drflac_bool8 isAudio; 1299 drflac_bool8 preEmphasis; 1300 drflac_uint8 indexCount; 1301 const drflac_cuesheet_track_index* pIndexPoints; 1302 } drflac_cuesheet_track; 1303 1304 /* 1305 Initializes a cuesheet track iterator. This can be used for iterating over the cuesheet tracks in a CUESHEET metadata 1306 block. 1307 */ 1308 DRFLAC_API void drflac_init_cuesheet_track_iterator(drflac_cuesheet_track_iterator* pIter, drflac_uint32 trackCount, const void* pTrackData); 1309 1310 /* Goes to the next cuesheet track in the given iterator. If DRFLAC_FALSE is returned it means there are no more comments. */ 1311 DRFLAC_API drflac_bool32 drflac_next_cuesheet_track(drflac_cuesheet_track_iterator* pIter, drflac_cuesheet_track* pCuesheetTrack); 1312 1313 1314 #ifdef __cplusplus 1315 } 1316 #endif 1317 #endif /* dr_flac_h */ 1318 1319 1320 /************************************************************************************************************************************************************ 1321 ************************************************************************************************************************************************************ 1322 1323 IMPLEMENTATION 1324 1325 ************************************************************************************************************************************************************ 1326 ************************************************************************************************************************************************************/ 1327 #if defined(DR_FLAC_IMPLEMENTATION) || defined(DRFLAC_IMPLEMENTATION) 1328 #ifndef dr_flac_c 1329 #define dr_flac_c 1330 1331 /* Disable some annoying warnings. */ 1332 #if defined(__clang__) || (defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6))) 1333 #pragma GCC diagnostic push 1334 #if __GNUC__ >= 7 1335 #pragma GCC diagnostic ignored "-Wimplicit-fallthrough" 1336 #endif 1337 #endif 1338 1339 #ifdef __linux__ 1340 #ifndef _BSD_SOURCE 1341 #define _BSD_SOURCE 1342 #endif 1343 #ifndef _DEFAULT_SOURCE 1344 #define _DEFAULT_SOURCE 1345 #endif 1346 #ifndef __USE_BSD 1347 #define __USE_BSD 1348 #endif 1349 #include <endian.h> 1350 #endif 1351 1352 #include <stdlib.h> 1353 #include <string.h> 1354 1355 #ifdef _MSC_VER 1356 #define DRFLAC_INLINE __forceinline 1357 #elif defined(__GNUC__) 1358 /* 1359 I've had a bug report where GCC is emitting warnings about functions possibly not being inlineable. This warning happens when 1360 the __attribute__((always_inline)) attribute is defined without an "inline" statement. I think therefore there must be some 1361 case where "__inline__" is not always defined, thus the compiler emitting these warnings. When using -std=c89 or -ansi on the 1362 command line, we cannot use the "inline" keyword and instead need to use "__inline__". In an attempt to work around this issue 1363 I am using "__inline__" only when we're compiling in strict ANSI mode. 1364 */ 1365 #if defined(__STRICT_ANSI__) 1366 #define DRFLAC_INLINE __inline__ __attribute__((always_inline)) 1367 #else 1368 #define DRFLAC_INLINE inline __attribute__((always_inline)) 1369 #endif 1370 #elif defined(__WATCOMC__) 1371 #define DRFLAC_INLINE __inline 1372 #else 1373 #define DRFLAC_INLINE 1374 #endif 1375 1376 /* CPU architecture. */ 1377 #if defined(__x86_64__) || defined(_M_X64) 1378 #define DRFLAC_X64 1379 #elif defined(__i386) || defined(_M_IX86) 1380 #define DRFLAC_X86 1381 #elif defined(__arm__) || defined(_M_ARM) || defined(_M_ARM64) 1382 #define DRFLAC_ARM 1383 #endif 1384 1385 /* 1386 Intrinsics Support 1387 1388 There's a bug in GCC 4.2.x which results in an incorrect compilation error when using _mm_slli_epi32() where it complains with 1389 1390 "error: shift must be an immediate" 1391 1392 Unfortuantely dr_flac depends on this for a few things so we're just going to disable SSE on GCC 4.2 and below. 1393 */ 1394 #if !defined(DR_FLAC_NO_SIMD) 1395 #if defined(DRFLAC_X64) || defined(DRFLAC_X86) 1396 #if defined(_MSC_VER) && !defined(__clang__) 1397 /* MSVC. */ 1398 #if _MSC_VER >= 1400 && !defined(DRFLAC_NO_SSE2) /* 2005 */ 1399 #define DRFLAC_SUPPORT_SSE2 1400 #endif 1401 #if _MSC_VER >= 1600 && !defined(DRFLAC_NO_SSE41) /* 2010 */ 1402 #define DRFLAC_SUPPORT_SSE41 1403 #endif 1404 #elif defined(__clang__) || (defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3))) 1405 /* Assume GNUC-style. */ 1406 #if defined(__SSE2__) && !defined(DRFLAC_NO_SSE2) 1407 #define DRFLAC_SUPPORT_SSE2 1408 #endif 1409 #if defined(__SSE4_1__) && !defined(DRFLAC_NO_SSE41) 1410 #define DRFLAC_SUPPORT_SSE41 1411 #endif 1412 #endif 1413 1414 /* If at this point we still haven't determined compiler support for the intrinsics just fall back to __has_include. */ 1415 #if !defined(__GNUC__) && !defined(__clang__) && defined(__has_include) 1416 #if !defined(DRFLAC_SUPPORT_SSE2) && !defined(DRFLAC_NO_SSE2) && __has_include(<emmintrin.h>) 1417 #define DRFLAC_SUPPORT_SSE2 1418 #endif 1419 #if !defined(DRFLAC_SUPPORT_SSE41) && !defined(DRFLAC_NO_SSE41) && __has_include(<smmintrin.h>) 1420 #define DRFLAC_SUPPORT_SSE41 1421 #endif 1422 #endif 1423 1424 #if defined(DRFLAC_SUPPORT_SSE41) 1425 #include <smmintrin.h> 1426 #elif defined(DRFLAC_SUPPORT_SSE2) 1427 #include <emmintrin.h> 1428 #endif 1429 #endif 1430 1431 #if defined(DRFLAC_ARM) 1432 #if !defined(DRFLAC_NO_NEON) && (defined(__ARM_NEON) || defined(__aarch64__) || defined(_M_ARM64)) 1433 #define DRFLAC_SUPPORT_NEON 1434 #endif 1435 1436 /* Fall back to looking for the #include file. */ 1437 #if !defined(__GNUC__) && !defined(__clang__) && defined(__has_include) 1438 #if !defined(DRFLAC_SUPPORT_NEON) && !defined(DRFLAC_NO_NEON) && __has_include(<arm_neon.h>) 1439 #define DRFLAC_SUPPORT_NEON 1440 #endif 1441 #endif 1442 1443 #if defined(DRFLAC_SUPPORT_NEON) 1444 #include <arm_neon.h> 1445 #endif 1446 #endif 1447 #endif 1448 1449 /* Compile-time CPU feature support. */ 1450 #if !defined(DR_FLAC_NO_SIMD) && (defined(DRFLAC_X86) || defined(DRFLAC_X64)) 1451 #if defined(_MSC_VER) && !defined(__clang__) 1452 #if _MSC_VER >= 1400 1453 #include <intrin.h> 1454 static void drflac__cpuid(int info[4], int fid) 1455 { 1456 __cpuid(info, fid); 1457 } 1458 #else 1459 #define DRFLAC_NO_CPUID 1460 #endif 1461 #else 1462 #if defined(__GNUC__) || defined(__clang__) 1463 static void drflac__cpuid(int info[4], int fid) 1464 { 1465 /* 1466 It looks like the -fPIC option uses the ebx register which GCC complains about. We can work around this by just using a different register, the 1467 specific register of which I'm letting the compiler decide on. The "k" prefix is used to specify a 32-bit register. The {...} syntax is for 1468 supporting different assembly dialects. 1469 1470 What's basically happening is that we're saving and restoring the ebx register manually. 1471 */ 1472 #if defined(DRFLAC_X86) && defined(__PIC__) 1473 __asm__ __volatile__ ( 1474 "xchg{l} {%%}ebx, %k1;" 1475 "cpuid;" 1476 "xchg{l} {%%}ebx, %k1;" 1477 : "=a"(info[0]), "=&r"(info[1]), "=c"(info[2]), "=d"(info[3]) : "a"(fid), "c"(0) 1478 ); 1479 #else 1480 __asm__ __volatile__ ( 1481 "cpuid" : "=a"(info[0]), "=b"(info[1]), "=c"(info[2]), "=d"(info[3]) : "a"(fid), "c"(0) 1482 ); 1483 #endif 1484 } 1485 #else 1486 #define DRFLAC_NO_CPUID 1487 #endif 1488 #endif 1489 #else 1490 #define DRFLAC_NO_CPUID 1491 #endif 1492 1493 static DRFLAC_INLINE drflac_bool32 drflac_has_sse2(void) 1494 { 1495 #if defined(DRFLAC_SUPPORT_SSE2) 1496 #if (defined(DRFLAC_X64) || defined(DRFLAC_X86)) && !defined(DRFLAC_NO_SSE2) 1497 #if defined(DRFLAC_X64) 1498 return DRFLAC_TRUE; /* 64-bit targets always support SSE2. */ 1499 #elif (defined(_M_IX86_FP) && _M_IX86_FP == 2) || defined(__SSE2__) 1500 return DRFLAC_TRUE; /* If the compiler is allowed to freely generate SSE2 code we can assume support. */ 1501 #else 1502 #if defined(DRFLAC_NO_CPUID) 1503 return DRFLAC_FALSE; 1504 #else 1505 int info[4]; 1506 drflac__cpuid(info, 1); 1507 return (info[3] & (1 << 26)) != 0; 1508 #endif 1509 #endif 1510 #else 1511 return DRFLAC_FALSE; /* SSE2 is only supported on x86 and x64 architectures. */ 1512 #endif 1513 #else 1514 return DRFLAC_FALSE; /* No compiler support. */ 1515 #endif 1516 } 1517 1518 static DRFLAC_INLINE drflac_bool32 drflac_has_sse41(void) 1519 { 1520 #if defined(DRFLAC_SUPPORT_SSE41) 1521 #if (defined(DRFLAC_X64) || defined(DRFLAC_X86)) && !defined(DRFLAC_NO_SSE41) 1522 #if defined(DRFLAC_X64) 1523 return DRFLAC_TRUE; /* 64-bit targets always support SSE4.1. */ 1524 #elif (defined(_M_IX86_FP) && _M_IX86_FP == 2) || defined(__SSE4_1__) 1525 return DRFLAC_TRUE; /* If the compiler is allowed to freely generate SSE41 code we can assume support. */ 1526 #else 1527 #if defined(DRFLAC_NO_CPUID) 1528 return DRFLAC_FALSE; 1529 #else 1530 int info[4]; 1531 drflac__cpuid(info, 1); 1532 return (info[2] & (1 << 19)) != 0; 1533 #endif 1534 #endif 1535 #else 1536 return DRFLAC_FALSE; /* SSE41 is only supported on x86 and x64 architectures. */ 1537 #endif 1538 #else 1539 return DRFLAC_FALSE; /* No compiler support. */ 1540 #endif 1541 } 1542 1543 1544 #if defined(_MSC_VER) && _MSC_VER >= 1500 && (defined(DRFLAC_X86) || defined(DRFLAC_X64)) && !defined(__clang__) 1545 #define DRFLAC_HAS_LZCNT_INTRINSIC 1546 #elif (defined(__GNUC__) && ((__GNUC__ > 4) || (__GNUC__ == 4 && __GNUC_MINOR__ >= 7))) 1547 #define DRFLAC_HAS_LZCNT_INTRINSIC 1548 #elif defined(__clang__) 1549 #if defined(__has_builtin) 1550 #if __has_builtin(__builtin_clzll) || __has_builtin(__builtin_clzl) 1551 #define DRFLAC_HAS_LZCNT_INTRINSIC 1552 #endif 1553 #endif 1554 #endif 1555 1556 #if defined(_MSC_VER) && _MSC_VER >= 1400 && !defined(__clang__) 1557 #define DRFLAC_HAS_BYTESWAP16_INTRINSIC 1558 #define DRFLAC_HAS_BYTESWAP32_INTRINSIC 1559 #define DRFLAC_HAS_BYTESWAP64_INTRINSIC 1560 #elif defined(__clang__) 1561 #if defined(__has_builtin) 1562 #if __has_builtin(__builtin_bswap16) 1563 #define DRFLAC_HAS_BYTESWAP16_INTRINSIC 1564 #endif 1565 #if __has_builtin(__builtin_bswap32) 1566 #define DRFLAC_HAS_BYTESWAP32_INTRINSIC 1567 #endif 1568 #if __has_builtin(__builtin_bswap64) 1569 #define DRFLAC_HAS_BYTESWAP64_INTRINSIC 1570 #endif 1571 #endif 1572 #elif defined(__GNUC__) 1573 #if ((__GNUC__ > 4) || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3)) 1574 #define DRFLAC_HAS_BYTESWAP32_INTRINSIC 1575 #define DRFLAC_HAS_BYTESWAP64_INTRINSIC 1576 #endif 1577 #if ((__GNUC__ > 4) || (__GNUC__ == 4 && __GNUC_MINOR__ >= 8)) 1578 #define DRFLAC_HAS_BYTESWAP16_INTRINSIC 1579 #endif 1580 #elif defined(__WATCOMC__) && defined(__386__) 1581 #define DRFLAC_HAS_BYTESWAP16_INTRINSIC 1582 #define DRFLAC_HAS_BYTESWAP32_INTRINSIC 1583 #define DRFLAC_HAS_BYTESWAP64_INTRINSIC 1584 extern __inline drflac_uint16 _watcom_bswap16(drflac_uint16); 1585 extern __inline drflac_uint32 _watcom_bswap32(drflac_uint32); 1586 extern __inline drflac_uint64 _watcom_bswap64(drflac_uint64); 1587 #pragma aux _watcom_bswap16 = \ 1588 "xchg al, ah" \ 1589 parm [ax] \ 1590 modify [ax]; 1591 #pragma aux _watcom_bswap32 = \ 1592 "bswap eax" \ 1593 parm [eax] \ 1594 modify [eax]; 1595 #pragma aux _watcom_bswap64 = \ 1596 "bswap eax" \ 1597 "bswap edx" \ 1598 "xchg eax,edx" \ 1599 parm [eax edx] \ 1600 modify [eax edx]; 1601 #endif 1602 1603 1604 /* Standard library stuff. */ 1605 #ifndef DRFLAC_ASSERT 1606 #include <assert.h> 1607 #define DRFLAC_ASSERT(expression) assert(expression) 1608 #endif 1609 #ifndef DRFLAC_MALLOC 1610 #define DRFLAC_MALLOC(sz) malloc((sz)) 1611 #endif 1612 #ifndef DRFLAC_REALLOC 1613 #define DRFLAC_REALLOC(p, sz) realloc((p), (sz)) 1614 #endif 1615 #ifndef DRFLAC_FREE 1616 #define DRFLAC_FREE(p) free((p)) 1617 #endif 1618 #ifndef DRFLAC_COPY_MEMORY 1619 #define DRFLAC_COPY_MEMORY(dst, src, sz) memcpy((dst), (src), (sz)) 1620 #endif 1621 #ifndef DRFLAC_ZERO_MEMORY 1622 #define DRFLAC_ZERO_MEMORY(p, sz) memset((p), 0, (sz)) 1623 #endif 1624 #ifndef DRFLAC_ZERO_OBJECT 1625 #define DRFLAC_ZERO_OBJECT(p) DRFLAC_ZERO_MEMORY((p), sizeof(*(p))) 1626 #endif 1627 1628 #define DRFLAC_MAX_SIMD_VECTOR_SIZE 64 /* 64 for AVX-512 in the future. */ 1629 1630 typedef drflac_int32 drflac_result; 1631 #define DRFLAC_SUCCESS 0 1632 #define DRFLAC_ERROR -1 /* A generic error. */ 1633 #define DRFLAC_INVALID_ARGS -2 1634 #define DRFLAC_INVALID_OPERATION -3 1635 #define DRFLAC_OUT_OF_MEMORY -4 1636 #define DRFLAC_OUT_OF_RANGE -5 1637 #define DRFLAC_ACCESS_DENIED -6 1638 #define DRFLAC_DOES_NOT_EXIST -7 1639 #define DRFLAC_ALREADY_EXISTS -8 1640 #define DRFLAC_TOO_MANY_OPEN_FILES -9 1641 #define DRFLAC_INVALID_FILE -10 1642 #define DRFLAC_TOO_BIG -11 1643 #define DRFLAC_PATH_TOO_LONG -12 1644 #define DRFLAC_NAME_TOO_LONG -13 1645 #define DRFLAC_NOT_DIRECTORY -14 1646 #define DRFLAC_IS_DIRECTORY -15 1647 #define DRFLAC_DIRECTORY_NOT_EMPTY -16 1648 #define DRFLAC_END_OF_FILE -17 1649 #define DRFLAC_NO_SPACE -18 1650 #define DRFLAC_BUSY -19 1651 #define DRFLAC_IO_ERROR -20 1652 #define DRFLAC_INTERRUPT -21 1653 #define DRFLAC_UNAVAILABLE -22 1654 #define DRFLAC_ALREADY_IN_USE -23 1655 #define DRFLAC_BAD_ADDRESS -24 1656 #define DRFLAC_BAD_SEEK -25 1657 #define DRFLAC_BAD_PIPE -26 1658 #define DRFLAC_DEADLOCK -27 1659 #define DRFLAC_TOO_MANY_LINKS -28 1660 #define DRFLAC_NOT_IMPLEMENTED -29 1661 #define DRFLAC_NO_MESSAGE -30 1662 #define DRFLAC_BAD_MESSAGE -31 1663 #define DRFLAC_NO_DATA_AVAILABLE -32 1664 #define DRFLAC_INVALID_DATA -33 1665 #define DRFLAC_TIMEOUT -34 1666 #define DRFLAC_NO_NETWORK -35 1667 #define DRFLAC_NOT_UNIQUE -36 1668 #define DRFLAC_NOT_SOCKET -37 1669 #define DRFLAC_NO_ADDRESS -38 1670 #define DRFLAC_BAD_PROTOCOL -39 1671 #define DRFLAC_PROTOCOL_UNAVAILABLE -40 1672 #define DRFLAC_PROTOCOL_NOT_SUPPORTED -41 1673 #define DRFLAC_PROTOCOL_FAMILY_NOT_SUPPORTED -42 1674 #define DRFLAC_ADDRESS_FAMILY_NOT_SUPPORTED -43 1675 #define DRFLAC_SOCKET_NOT_SUPPORTED -44 1676 #define DRFLAC_CONNECTION_RESET -45 1677 #define DRFLAC_ALREADY_CONNECTED -46 1678 #define DRFLAC_NOT_CONNECTED -47 1679 #define DRFLAC_CONNECTION_REFUSED -48 1680 #define DRFLAC_NO_HOST -49 1681 #define DRFLAC_IN_PROGRESS -50 1682 #define DRFLAC_CANCELLED -51 1683 #define DRFLAC_MEMORY_ALREADY_MAPPED -52 1684 #define DRFLAC_AT_END -53 1685 #define DRFLAC_CRC_MISMATCH -128 1686 1687 #define DRFLAC_SUBFRAME_CONSTANT 0 1688 #define DRFLAC_SUBFRAME_VERBATIM 1 1689 #define DRFLAC_SUBFRAME_FIXED 8 1690 #define DRFLAC_SUBFRAME_LPC 32 1691 #define DRFLAC_SUBFRAME_RESERVED 255 1692 1693 #define DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE 0 1694 #define DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE2 1 1695 1696 #define DRFLAC_CHANNEL_ASSIGNMENT_INDEPENDENT 0 1697 #define DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE 8 1698 #define DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE 9 1699 #define DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE 10 1700 1701 #define drflac_align(x, a) ((((x) + (a) - 1) / (a)) * (a)) 1702 1703 1704 DRFLAC_API void drflac_version(drflac_uint32* pMajor, drflac_uint32* pMinor, drflac_uint32* pRevision) 1705 { 1706 if (pMajor) { 1707 *pMajor = DRFLAC_VERSION_MAJOR; 1708 } 1709 1710 if (pMinor) { 1711 *pMinor = DRFLAC_VERSION_MINOR; 1712 } 1713 1714 if (pRevision) { 1715 *pRevision = DRFLAC_VERSION_REVISION; 1716 } 1717 } 1718 1719 DRFLAC_API const char* drflac_version_string(void) 1720 { 1721 return DRFLAC_VERSION_STRING; 1722 } 1723 1724 1725 /* CPU caps. */ 1726 #if defined(__has_feature) 1727 #if __has_feature(thread_sanitizer) 1728 #define DRFLAC_NO_THREAD_SANITIZE __attribute__((no_sanitize("thread"))) 1729 #else 1730 #define DRFLAC_NO_THREAD_SANITIZE 1731 #endif 1732 #else 1733 #define DRFLAC_NO_THREAD_SANITIZE 1734 #endif 1735 1736 #if defined(DRFLAC_HAS_LZCNT_INTRINSIC) 1737 static drflac_bool32 drflac__gIsLZCNTSupported = DRFLAC_FALSE; 1738 #endif 1739 1740 #ifndef DRFLAC_NO_CPUID 1741 static drflac_bool32 drflac__gIsSSE2Supported = DRFLAC_FALSE; 1742 static drflac_bool32 drflac__gIsSSE41Supported = DRFLAC_FALSE; 1743 1744 /* 1745 I've had a bug report that Clang's ThreadSanitizer presents a warning in this function. Having reviewed this, this does 1746 actually make sense. However, since CPU caps should never differ for a running process, I don't think the trade off of 1747 complicating internal API's by passing around CPU caps versus just disabling the warnings is worthwhile. I'm therefore 1748 just going to disable these warnings. This is disabled via the DRFLAC_NO_THREAD_SANITIZE attribute. 1749 */ 1750 DRFLAC_NO_THREAD_SANITIZE static void drflac__init_cpu_caps(void) 1751 { 1752 static drflac_bool32 isCPUCapsInitialized = DRFLAC_FALSE; 1753 1754 if (!isCPUCapsInitialized) { 1755 /* LZCNT */ 1756 #if defined(DRFLAC_HAS_LZCNT_INTRINSIC) 1757 int info[4] = {0}; 1758 drflac__cpuid(info, 0x80000001); 1759 drflac__gIsLZCNTSupported = (info[2] & (1 << 5)) != 0; 1760 #endif 1761 1762 /* SSE2 */ 1763 drflac__gIsSSE2Supported = drflac_has_sse2(); 1764 1765 /* SSE4.1 */ 1766 drflac__gIsSSE41Supported = drflac_has_sse41(); 1767 1768 /* Initialized. */ 1769 isCPUCapsInitialized = DRFLAC_TRUE; 1770 } 1771 } 1772 #else 1773 static drflac_bool32 drflac__gIsNEONSupported = DRFLAC_FALSE; 1774 1775 static DRFLAC_INLINE drflac_bool32 drflac__has_neon(void) 1776 { 1777 #if defined(DRFLAC_SUPPORT_NEON) 1778 #if defined(DRFLAC_ARM) && !defined(DRFLAC_NO_NEON) 1779 #if (defined(__ARM_NEON) || defined(__aarch64__) || defined(_M_ARM64)) 1780 return DRFLAC_TRUE; /* If the compiler is allowed to freely generate NEON code we can assume support. */ 1781 #else 1782 /* TODO: Runtime check. */ 1783 return DRFLAC_FALSE; 1784 #endif 1785 #else 1786 return DRFLAC_FALSE; /* NEON is only supported on ARM architectures. */ 1787 #endif 1788 #else 1789 return DRFLAC_FALSE; /* No compiler support. */ 1790 #endif 1791 } 1792 1793 DRFLAC_NO_THREAD_SANITIZE static void drflac__init_cpu_caps(void) 1794 { 1795 drflac__gIsNEONSupported = drflac__has_neon(); 1796 1797 #if defined(DRFLAC_HAS_LZCNT_INTRINSIC) && defined(DRFLAC_ARM) && (defined(__ARM_ARCH) && __ARM_ARCH >= 5) 1798 drflac__gIsLZCNTSupported = DRFLAC_TRUE; 1799 #endif 1800 } 1801 #endif 1802 1803 1804 /* Endian Management */ 1805 static DRFLAC_INLINE drflac_bool32 drflac__is_little_endian(void) 1806 { 1807 #if defined(DRFLAC_X86) || defined(DRFLAC_X64) 1808 return DRFLAC_TRUE; 1809 #elif defined(__BYTE_ORDER) && defined(__LITTLE_ENDIAN) && __BYTE_ORDER == __LITTLE_ENDIAN 1810 return DRFLAC_TRUE; 1811 #else 1812 int n = 1; 1813 return (*(char*)&n) == 1; 1814 #endif 1815 } 1816 1817 static DRFLAC_INLINE drflac_uint16 drflac__swap_endian_uint16(drflac_uint16 n) 1818 { 1819 #ifdef DRFLAC_HAS_BYTESWAP16_INTRINSIC 1820 #if defined(_MSC_VER) && !defined(__clang__) 1821 return _byteswap_ushort(n); 1822 #elif defined(__GNUC__) || defined(__clang__) 1823 return __builtin_bswap16(n); 1824 #elif defined(__WATCOMC__) && defined(__386__) 1825 return _watcom_bswap16(n); 1826 #else 1827 #error "This compiler does not support the byte swap intrinsic." 1828 #endif 1829 #else 1830 return ((n & 0xFF00) >> 8) | 1831 ((n & 0x00FF) << 8); 1832 #endif 1833 } 1834 1835 static DRFLAC_INLINE drflac_uint32 drflac__swap_endian_uint32(drflac_uint32 n) 1836 { 1837 #ifdef DRFLAC_HAS_BYTESWAP32_INTRINSIC 1838 #if defined(_MSC_VER) && !defined(__clang__) 1839 return _byteswap_ulong(n); 1840 #elif defined(__GNUC__) || defined(__clang__) 1841 #if defined(DRFLAC_ARM) && (defined(__ARM_ARCH) && __ARM_ARCH >= 6) && !defined(DRFLAC_64BIT) /* <-- 64-bit inline assembly has not been tested, so disabling for now. */ 1842 /* Inline assembly optimized implementation for ARM. In my testing, GCC does not generate optimized code with __builtin_bswap32(). */ 1843 drflac_uint32 r; 1844 __asm__ __volatile__ ( 1845 #if defined(DRFLAC_64BIT) 1846 "rev %w[out], %w[in]" : [out]"=r"(r) : [in]"r"(n) /* <-- This is untested. If someone in the community could test this, that would be appreciated! */ 1847 #else 1848 "rev %[out], %[in]" : [out]"=r"(r) : [in]"r"(n) 1849 #endif 1850 ); 1851 return r; 1852 #else 1853 return __builtin_bswap32(n); 1854 #endif 1855 #elif defined(__WATCOMC__) && defined(__386__) 1856 return _watcom_bswap32(n); 1857 #else 1858 #error "This compiler does not support the byte swap intrinsic." 1859 #endif 1860 #else 1861 return ((n & 0xFF000000) >> 24) | 1862 ((n & 0x00FF0000) >> 8) | 1863 ((n & 0x0000FF00) << 8) | 1864 ((n & 0x000000FF) << 24); 1865 #endif 1866 } 1867 1868 static DRFLAC_INLINE drflac_uint64 drflac__swap_endian_uint64(drflac_uint64 n) 1869 { 1870 #ifdef DRFLAC_HAS_BYTESWAP64_INTRINSIC 1871 #if defined(_MSC_VER) && !defined(__clang__) 1872 return _byteswap_uint64(n); 1873 #elif defined(__GNUC__) || defined(__clang__) 1874 return __builtin_bswap64(n); 1875 #elif defined(__WATCOMC__) && defined(__386__) 1876 return _watcom_bswap64(n); 1877 #else 1878 #error "This compiler does not support the byte swap intrinsic." 1879 #endif 1880 #else 1881 /* Weird "<< 32" bitshift is required for C89 because it doesn't support 64-bit constants. Should be optimized out by a good compiler. */ 1882 return ((n & ((drflac_uint64)0xFF000000 << 32)) >> 56) | 1883 ((n & ((drflac_uint64)0x00FF0000 << 32)) >> 40) | 1884 ((n & ((drflac_uint64)0x0000FF00 << 32)) >> 24) | 1885 ((n & ((drflac_uint64)0x000000FF << 32)) >> 8) | 1886 ((n & ((drflac_uint64)0xFF000000 )) << 8) | 1887 ((n & ((drflac_uint64)0x00FF0000 )) << 24) | 1888 ((n & ((drflac_uint64)0x0000FF00 )) << 40) | 1889 ((n & ((drflac_uint64)0x000000FF )) << 56); 1890 #endif 1891 } 1892 1893 1894 static DRFLAC_INLINE drflac_uint16 drflac__be2host_16(drflac_uint16 n) 1895 { 1896 if (drflac__is_little_endian()) { 1897 return drflac__swap_endian_uint16(n); 1898 } 1899 1900 return n; 1901 } 1902 1903 static DRFLAC_INLINE drflac_uint32 drflac__be2host_32(drflac_uint32 n) 1904 { 1905 if (drflac__is_little_endian()) { 1906 return drflac__swap_endian_uint32(n); 1907 } 1908 1909 return n; 1910 } 1911 1912 static DRFLAC_INLINE drflac_uint64 drflac__be2host_64(drflac_uint64 n) 1913 { 1914 if (drflac__is_little_endian()) { 1915 return drflac__swap_endian_uint64(n); 1916 } 1917 1918 return n; 1919 } 1920 1921 1922 static DRFLAC_INLINE drflac_uint32 drflac__le2host_32(drflac_uint32 n) 1923 { 1924 if (!drflac__is_little_endian()) { 1925 return drflac__swap_endian_uint32(n); 1926 } 1927 1928 return n; 1929 } 1930 1931 1932 static DRFLAC_INLINE drflac_uint32 drflac__unsynchsafe_32(drflac_uint32 n) 1933 { 1934 drflac_uint32 result = 0; 1935 result |= (n & 0x7F000000) >> 3; 1936 result |= (n & 0x007F0000) >> 2; 1937 result |= (n & 0x00007F00) >> 1; 1938 result |= (n & 0x0000007F) >> 0; 1939 1940 return result; 1941 } 1942 1943 1944 1945 /* The CRC code below is based on this document: http://zlib.net/crc_v3.txt */ 1946 static drflac_uint8 drflac__crc8_table[] = { 1947 0x00, 0x07, 0x0E, 0x09, 0x1C, 0x1B, 0x12, 0x15, 0x38, 0x3F, 0x36, 0x31, 0x24, 0x23, 0x2A, 0x2D, 1948 0x70, 0x77, 0x7E, 0x79, 0x6C, 0x6B, 0x62, 0x65, 0x48, 0x4F, 0x46, 0x41, 0x54, 0x53, 0x5A, 0x5D, 1949 0xE0, 0xE7, 0xEE, 0xE9, 0xFC, 0xFB, 0xF2, 0xF5, 0xD8, 0xDF, 0xD6, 0xD1, 0xC4, 0xC3, 0xCA, 0xCD, 1950 0x90, 0x97, 0x9E, 0x99, 0x8C, 0x8B, 0x82, 0x85, 0xA8, 0xAF, 0xA6, 0xA1, 0xB4, 0xB3, 0xBA, 0xBD, 1951 0xC7, 0xC0, 0xC9, 0xCE, 0xDB, 0xDC, 0xD5, 0xD2, 0xFF, 0xF8, 0xF1, 0xF6, 0xE3, 0xE4, 0xED, 0xEA, 1952 0xB7, 0xB0, 0xB9, 0xBE, 0xAB, 0xAC, 0xA5, 0xA2, 0x8F, 0x88, 0x81, 0x86, 0x93, 0x94, 0x9D, 0x9A, 1953 0x27, 0x20, 0x29, 0x2E, 0x3B, 0x3C, 0x35, 0x32, 0x1F, 0x18, 0x11, 0x16, 0x03, 0x04, 0x0D, 0x0A, 1954 0x57, 0x50, 0x59, 0x5E, 0x4B, 0x4C, 0x45, 0x42, 0x6F, 0x68, 0x61, 0x66, 0x73, 0x74, 0x7D, 0x7A, 1955 0x89, 0x8E, 0x87, 0x80, 0x95, 0x92, 0x9B, 0x9C, 0xB1, 0xB6, 0xBF, 0xB8, 0xAD, 0xAA, 0xA3, 0xA4, 1956 0xF9, 0xFE, 0xF7, 0xF0, 0xE5, 0xE2, 0xEB, 0xEC, 0xC1, 0xC6, 0xCF, 0xC8, 0xDD, 0xDA, 0xD3, 0xD4, 1957 0x69, 0x6E, 0x67, 0x60, 0x75, 0x72, 0x7B, 0x7C, 0x51, 0x56, 0x5F, 0x58, 0x4D, 0x4A, 0x43, 0x44, 1958 0x19, 0x1E, 0x17, 0x10, 0x05, 0x02, 0x0B, 0x0C, 0x21, 0x26, 0x2F, 0x28, 0x3D, 0x3A, 0x33, 0x34, 1959 0x4E, 0x49, 0x40, 0x47, 0x52, 0x55, 0x5C, 0x5B, 0x76, 0x71, 0x78, 0x7F, 0x6A, 0x6D, 0x64, 0x63, 1960 0x3E, 0x39, 0x30, 0x37, 0x22, 0x25, 0x2C, 0x2B, 0x06, 0x01, 0x08, 0x0F, 0x1A, 0x1D, 0x14, 0x13, 1961 0xAE, 0xA9, 0xA0, 0xA7, 0xB2, 0xB5, 0xBC, 0xBB, 0x96, 0x91, 0x98, 0x9F, 0x8A, 0x8D, 0x84, 0x83, 1962 0xDE, 0xD9, 0xD0, 0xD7, 0xC2, 0xC5, 0xCC, 0xCB, 0xE6, 0xE1, 0xE8, 0xEF, 0xFA, 0xFD, 0xF4, 0xF3 1963 }; 1964 1965 static drflac_uint16 drflac__crc16_table[] = { 1966 0x0000, 0x8005, 0x800F, 0x000A, 0x801B, 0x001E, 0x0014, 0x8011, 1967 0x8033, 0x0036, 0x003C, 0x8039, 0x0028, 0x802D, 0x8027, 0x0022, 1968 0x8063, 0x0066, 0x006C, 0x8069, 0x0078, 0x807D, 0x8077, 0x0072, 1969 0x0050, 0x8055, 0x805F, 0x005A, 0x804B, 0x004E, 0x0044, 0x8041, 1970 0x80C3, 0x00C6, 0x00CC, 0x80C9, 0x00D8, 0x80DD, 0x80D7, 0x00D2, 1971 0x00F0, 0x80F5, 0x80FF, 0x00FA, 0x80EB, 0x00EE, 0x00E4, 0x80E1, 1972 0x00A0, 0x80A5, 0x80AF, 0x00AA, 0x80BB, 0x00BE, 0x00B4, 0x80B1, 1973 0x8093, 0x0096, 0x009C, 0x8099, 0x0088, 0x808D, 0x8087, 0x0082, 1974 0x8183, 0x0186, 0x018C, 0x8189, 0x0198, 0x819D, 0x8197, 0x0192, 1975 0x01B0, 0x81B5, 0x81BF, 0x01BA, 0x81AB, 0x01AE, 0x01A4, 0x81A1, 1976 0x01E0, 0x81E5, 0x81EF, 0x01EA, 0x81FB, 0x01FE, 0x01F4, 0x81F1, 1977 0x81D3, 0x01D6, 0x01DC, 0x81D9, 0x01C8, 0x81CD, 0x81C7, 0x01C2, 1978 0x0140, 0x8145, 0x814F, 0x014A, 0x815B, 0x015E, 0x0154, 0x8151, 1979 0x8173, 0x0176, 0x017C, 0x8179, 0x0168, 0x816D, 0x8167, 0x0162, 1980 0x8123, 0x0126, 0x012C, 0x8129, 0x0138, 0x813D, 0x8137, 0x0132, 1981 0x0110, 0x8115, 0x811F, 0x011A, 0x810B, 0x010E, 0x0104, 0x8101, 1982 0x8303, 0x0306, 0x030C, 0x8309, 0x0318, 0x831D, 0x8317, 0x0312, 1983 0x0330, 0x8335, 0x833F, 0x033A, 0x832B, 0x032E, 0x0324, 0x8321, 1984 0x0360, 0x8365, 0x836F, 0x036A, 0x837B, 0x037E, 0x0374, 0x8371, 1985 0x8353, 0x0356, 0x035C, 0x8359, 0x0348, 0x834D, 0x8347, 0x0342, 1986 0x03C0, 0x83C5, 0x83CF, 0x03CA, 0x83DB, 0x03DE, 0x03D4, 0x83D1, 1987 0x83F3, 0x03F6, 0x03FC, 0x83F9, 0x03E8, 0x83ED, 0x83E7, 0x03E2, 1988 0x83A3, 0x03A6, 0x03AC, 0x83A9, 0x03B8, 0x83BD, 0x83B7, 0x03B2, 1989 0x0390, 0x8395, 0x839F, 0x039A, 0x838B, 0x038E, 0x0384, 0x8381, 1990 0x0280, 0x8285, 0x828F, 0x028A, 0x829B, 0x029E, 0x0294, 0x8291, 1991 0x82B3, 0x02B6, 0x02BC, 0x82B9, 0x02A8, 0x82AD, 0x82A7, 0x02A2, 1992 0x82E3, 0x02E6, 0x02EC, 0x82E9, 0x02F8, 0x82FD, 0x82F7, 0x02F2, 1993 0x02D0, 0x82D5, 0x82DF, 0x02DA, 0x82CB, 0x02CE, 0x02C4, 0x82C1, 1994 0x8243, 0x0246, 0x024C, 0x8249, 0x0258, 0x825D, 0x8257, 0x0252, 1995 0x0270, 0x8275, 0x827F, 0x027A, 0x826B, 0x026E, 0x0264, 0x8261, 1996 0x0220, 0x8225, 0x822F, 0x022A, 0x823B, 0x023E, 0x0234, 0x8231, 1997 0x8213, 0x0216, 0x021C, 0x8219, 0x0208, 0x820D, 0x8207, 0x0202 1998 }; 1999 2000 static DRFLAC_INLINE drflac_uint8 drflac_crc8_byte(drflac_uint8 crc, drflac_uint8 data) 2001 { 2002 return drflac__crc8_table[crc ^ data]; 2003 } 2004 2005 static DRFLAC_INLINE drflac_uint8 drflac_crc8(drflac_uint8 crc, drflac_uint32 data, drflac_uint32 count) 2006 { 2007 #ifdef DR_FLAC_NO_CRC 2008 (void)crc; 2009 (void)data; 2010 (void)count; 2011 return 0; 2012 #else 2013 #if 0 2014 /* REFERENCE (use of this implementation requires an explicit flush by doing "drflac_crc8(crc, 0, 8);") */ 2015 drflac_uint8 p = 0x07; 2016 for (int i = count-1; i >= 0; --i) { 2017 drflac_uint8 bit = (data & (1 << i)) >> i; 2018 if (crc & 0x80) { 2019 crc = ((crc << 1) | bit) ^ p; 2020 } else { 2021 crc = ((crc << 1) | bit); 2022 } 2023 } 2024 return crc; 2025 #else 2026 drflac_uint32 wholeBytes; 2027 drflac_uint32 leftoverBits; 2028 drflac_uint64 leftoverDataMask; 2029 2030 static drflac_uint64 leftoverDataMaskTable[8] = { 2031 0x00, 0x01, 0x03, 0x07, 0x0F, 0x1F, 0x3F, 0x7F 2032 }; 2033 2034 DRFLAC_ASSERT(count <= 32); 2035 2036 wholeBytes = count >> 3; 2037 leftoverBits = count - (wholeBytes*8); 2038 leftoverDataMask = leftoverDataMaskTable[leftoverBits]; 2039 2040 switch (wholeBytes) { 2041 case 4: crc = drflac_crc8_byte(crc, (drflac_uint8)((data & (0xFF000000UL << leftoverBits)) >> (24 + leftoverBits))); 2042 case 3: crc = drflac_crc8_byte(crc, (drflac_uint8)((data & (0x00FF0000UL << leftoverBits)) >> (16 + leftoverBits))); 2043 case 2: crc = drflac_crc8_byte(crc, (drflac_uint8)((data & (0x0000FF00UL << leftoverBits)) >> ( 8 + leftoverBits))); 2044 case 1: crc = drflac_crc8_byte(crc, (drflac_uint8)((data & (0x000000FFUL << leftoverBits)) >> ( 0 + leftoverBits))); 2045 case 0: if (leftoverBits > 0) crc = (drflac_uint8)((crc << leftoverBits) ^ drflac__crc8_table[(crc >> (8 - leftoverBits)) ^ (data & leftoverDataMask)]); 2046 } 2047 return crc; 2048 #endif 2049 #endif 2050 } 2051 2052 static DRFLAC_INLINE drflac_uint16 drflac_crc16_byte(drflac_uint16 crc, drflac_uint8 data) 2053 { 2054 return (crc << 8) ^ drflac__crc16_table[(drflac_uint8)(crc >> 8) ^ data]; 2055 } 2056 2057 static DRFLAC_INLINE drflac_uint16 drflac_crc16_cache(drflac_uint16 crc, drflac_cache_t data) 2058 { 2059 #ifdef DRFLAC_64BIT 2060 crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 56) & 0xFF)); 2061 crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 48) & 0xFF)); 2062 crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 40) & 0xFF)); 2063 crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 32) & 0xFF)); 2064 #endif 2065 crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 24) & 0xFF)); 2066 crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 16) & 0xFF)); 2067 crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 8) & 0xFF)); 2068 crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 0) & 0xFF)); 2069 2070 return crc; 2071 } 2072 2073 static DRFLAC_INLINE drflac_uint16 drflac_crc16_bytes(drflac_uint16 crc, drflac_cache_t data, drflac_uint32 byteCount) 2074 { 2075 switch (byteCount) 2076 { 2077 #ifdef DRFLAC_64BIT 2078 case 8: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 56) & 0xFF)); 2079 case 7: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 48) & 0xFF)); 2080 case 6: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 40) & 0xFF)); 2081 case 5: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 32) & 0xFF)); 2082 #endif 2083 case 4: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 24) & 0xFF)); 2084 case 3: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 16) & 0xFF)); 2085 case 2: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 8) & 0xFF)); 2086 case 1: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 0) & 0xFF)); 2087 } 2088 2089 return crc; 2090 } 2091 2092 #if 0 2093 static DRFLAC_INLINE drflac_uint16 drflac_crc16__32bit(drflac_uint16 crc, drflac_uint32 data, drflac_uint32 count) 2094 { 2095 #ifdef DR_FLAC_NO_CRC 2096 (void)crc; 2097 (void)data; 2098 (void)count; 2099 return 0; 2100 #else 2101 #if 0 2102 /* REFERENCE (use of this implementation requires an explicit flush by doing "drflac_crc16(crc, 0, 16);") */ 2103 drflac_uint16 p = 0x8005; 2104 for (int i = count-1; i >= 0; --i) { 2105 drflac_uint16 bit = (data & (1ULL << i)) >> i; 2106 if (r & 0x8000) { 2107 r = ((r << 1) | bit) ^ p; 2108 } else { 2109 r = ((r << 1) | bit); 2110 } 2111 } 2112 2113 return crc; 2114 #else 2115 drflac_uint32 wholeBytes; 2116 drflac_uint32 leftoverBits; 2117 drflac_uint64 leftoverDataMask; 2118 2119 static drflac_uint64 leftoverDataMaskTable[8] = { 2120 0x00, 0x01, 0x03, 0x07, 0x0F, 0x1F, 0x3F, 0x7F 2121 }; 2122 2123 DRFLAC_ASSERT(count <= 64); 2124 2125 wholeBytes = count >> 3; 2126 leftoverBits = count & 7; 2127 leftoverDataMask = leftoverDataMaskTable[leftoverBits]; 2128 2129 switch (wholeBytes) { 2130 default: 2131 case 4: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (0xFF000000UL << leftoverBits)) >> (24 + leftoverBits))); 2132 case 3: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (0x00FF0000UL << leftoverBits)) >> (16 + leftoverBits))); 2133 case 2: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (0x0000FF00UL << leftoverBits)) >> ( 8 + leftoverBits))); 2134 case 1: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (0x000000FFUL << leftoverBits)) >> ( 0 + leftoverBits))); 2135 case 0: if (leftoverBits > 0) crc = (crc << leftoverBits) ^ drflac__crc16_table[(crc >> (16 - leftoverBits)) ^ (data & leftoverDataMask)]; 2136 } 2137 return crc; 2138 #endif 2139 #endif 2140 } 2141 2142 static DRFLAC_INLINE drflac_uint16 drflac_crc16__64bit(drflac_uint16 crc, drflac_uint64 data, drflac_uint32 count) 2143 { 2144 #ifdef DR_FLAC_NO_CRC 2145 (void)crc; 2146 (void)data; 2147 (void)count; 2148 return 0; 2149 #else 2150 drflac_uint32 wholeBytes; 2151 drflac_uint32 leftoverBits; 2152 drflac_uint64 leftoverDataMask; 2153 2154 static drflac_uint64 leftoverDataMaskTable[8] = { 2155 0x00, 0x01, 0x03, 0x07, 0x0F, 0x1F, 0x3F, 0x7F 2156 }; 2157 2158 DRFLAC_ASSERT(count <= 64); 2159 2160 wholeBytes = count >> 3; 2161 leftoverBits = count & 7; 2162 leftoverDataMask = leftoverDataMaskTable[leftoverBits]; 2163 2164 switch (wholeBytes) { 2165 default: 2166 case 8: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0xFF000000 << 32) << leftoverBits)) >> (56 + leftoverBits))); /* Weird "<< 32" bitshift is required for C89 because it doesn't support 64-bit constants. Should be optimized out by a good compiler. */ 2167 case 7: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0x00FF0000 << 32) << leftoverBits)) >> (48 + leftoverBits))); 2168 case 6: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0x0000FF00 << 32) << leftoverBits)) >> (40 + leftoverBits))); 2169 case 5: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0x000000FF << 32) << leftoverBits)) >> (32 + leftoverBits))); 2170 case 4: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0xFF000000 ) << leftoverBits)) >> (24 + leftoverBits))); 2171 case 3: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0x00FF0000 ) << leftoverBits)) >> (16 + leftoverBits))); 2172 case 2: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0x0000FF00 ) << leftoverBits)) >> ( 8 + leftoverBits))); 2173 case 1: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0x000000FF ) << leftoverBits)) >> ( 0 + leftoverBits))); 2174 case 0: if (leftoverBits > 0) crc = (crc << leftoverBits) ^ drflac__crc16_table[(crc >> (16 - leftoverBits)) ^ (data & leftoverDataMask)]; 2175 } 2176 return crc; 2177 #endif 2178 } 2179 2180 2181 static DRFLAC_INLINE drflac_uint16 drflac_crc16(drflac_uint16 crc, drflac_cache_t data, drflac_uint32 count) 2182 { 2183 #ifdef DRFLAC_64BIT 2184 return drflac_crc16__64bit(crc, data, count); 2185 #else 2186 return drflac_crc16__32bit(crc, data, count); 2187 #endif 2188 } 2189 #endif 2190 2191 2192 #ifdef DRFLAC_64BIT 2193 #define drflac__be2host__cache_line drflac__be2host_64 2194 #else 2195 #define drflac__be2host__cache_line drflac__be2host_32 2196 #endif 2197 2198 /* 2199 BIT READING ATTEMPT #2 2200 2201 This uses a 32- or 64-bit bit-shifted cache - as bits are read, the cache is shifted such that the first valid bit is sitting 2202 on the most significant bit. It uses the notion of an L1 and L2 cache (borrowed from CPU architecture), where the L1 cache 2203 is a 32- or 64-bit unsigned integer (depending on whether or not a 32- or 64-bit build is being compiled) and the L2 is an 2204 array of "cache lines", with each cache line being the same size as the L1. The L2 is a buffer of about 4KB and is where data 2205 from onRead() is read into. 2206 */ 2207 #define DRFLAC_CACHE_L1_SIZE_BYTES(bs) (sizeof((bs)->cache)) 2208 #define DRFLAC_CACHE_L1_SIZE_BITS(bs) (sizeof((bs)->cache)*8) 2209 #define DRFLAC_CACHE_L1_BITS_REMAINING(bs) (DRFLAC_CACHE_L1_SIZE_BITS(bs) - (bs)->consumedBits) 2210 #define DRFLAC_CACHE_L1_SELECTION_MASK(_bitCount) (~((~(drflac_cache_t)0) >> (_bitCount))) 2211 #define DRFLAC_CACHE_L1_SELECTION_SHIFT(bs, _bitCount) (DRFLAC_CACHE_L1_SIZE_BITS(bs) - (_bitCount)) 2212 #define DRFLAC_CACHE_L1_SELECT(bs, _bitCount) (((bs)->cache) & DRFLAC_CACHE_L1_SELECTION_MASK(_bitCount)) 2213 #define DRFLAC_CACHE_L1_SELECT_AND_SHIFT(bs, _bitCount) (DRFLAC_CACHE_L1_SELECT((bs), (_bitCount)) >> DRFLAC_CACHE_L1_SELECTION_SHIFT((bs), (_bitCount))) 2214 #define DRFLAC_CACHE_L1_SELECT_AND_SHIFT_SAFE(bs, _bitCount)(DRFLAC_CACHE_L1_SELECT((bs), (_bitCount)) >> (DRFLAC_CACHE_L1_SELECTION_SHIFT((bs), (_bitCount)) & (DRFLAC_CACHE_L1_SIZE_BITS(bs)-1))) 2215 #define DRFLAC_CACHE_L2_SIZE_BYTES(bs) (sizeof((bs)->cacheL2)) 2216 #define DRFLAC_CACHE_L2_LINE_COUNT(bs) (DRFLAC_CACHE_L2_SIZE_BYTES(bs) / sizeof((bs)->cacheL2[0])) 2217 #define DRFLAC_CACHE_L2_LINES_REMAINING(bs) (DRFLAC_CACHE_L2_LINE_COUNT(bs) - (bs)->nextL2Line) 2218 2219 2220 #ifndef DR_FLAC_NO_CRC 2221 static DRFLAC_INLINE void drflac__reset_crc16(drflac_bs* bs) 2222 { 2223 bs->crc16 = 0; 2224 bs->crc16CacheIgnoredBytes = bs->consumedBits >> 3; 2225 } 2226 2227 static DRFLAC_INLINE void drflac__update_crc16(drflac_bs* bs) 2228 { 2229 if (bs->crc16CacheIgnoredBytes == 0) { 2230 bs->crc16 = drflac_crc16_cache(bs->crc16, bs->crc16Cache); 2231 } else { 2232 bs->crc16 = drflac_crc16_bytes(bs->crc16, bs->crc16Cache, DRFLAC_CACHE_L1_SIZE_BYTES(bs) - bs->crc16CacheIgnoredBytes); 2233 bs->crc16CacheIgnoredBytes = 0; 2234 } 2235 } 2236 2237 static DRFLAC_INLINE drflac_uint16 drflac__flush_crc16(drflac_bs* bs) 2238 { 2239 /* We should never be flushing in a situation where we are not aligned on a byte boundary. */ 2240 DRFLAC_ASSERT((DRFLAC_CACHE_L1_BITS_REMAINING(bs) & 7) == 0); 2241 2242 /* 2243 The bits that were read from the L1 cache need to be accumulated. The number of bytes needing to be accumulated is determined 2244 by the number of bits that have been consumed. 2245 */ 2246 if (DRFLAC_CACHE_L1_BITS_REMAINING(bs) == 0) { 2247 drflac__update_crc16(bs); 2248 } else { 2249 /* We only accumulate the consumed bits. */ 2250 bs->crc16 = drflac_crc16_bytes(bs->crc16, bs->crc16Cache >> DRFLAC_CACHE_L1_BITS_REMAINING(bs), (bs->consumedBits >> 3) - bs->crc16CacheIgnoredBytes); 2251 2252 /* 2253 The bits that we just accumulated should never be accumulated again. We need to keep track of how many bytes were accumulated 2254 so we can handle that later. 2255 */ 2256 bs->crc16CacheIgnoredBytes = bs->consumedBits >> 3; 2257 } 2258 2259 return bs->crc16; 2260 } 2261 #endif 2262 2263 static DRFLAC_INLINE drflac_bool32 drflac__reload_l1_cache_from_l2(drflac_bs* bs) 2264 { 2265 size_t bytesRead; 2266 size_t alignedL1LineCount; 2267 2268 /* Fast path. Try loading straight from L2. */ 2269 if (bs->nextL2Line < DRFLAC_CACHE_L2_LINE_COUNT(bs)) { 2270 bs->cache = bs->cacheL2[bs->nextL2Line++]; 2271 return DRFLAC_TRUE; 2272 } 2273 2274 /* 2275 If we get here it means we've run out of data in the L2 cache. We'll need to fetch more from the client, if there's 2276 any left. 2277 */ 2278 if (bs->unalignedByteCount > 0) { 2279 return DRFLAC_FALSE; /* If we have any unaligned bytes it means there's no more aligned bytes left in the client. */ 2280 } 2281 2282 bytesRead = bs->onRead(bs->pUserData, bs->cacheL2, DRFLAC_CACHE_L2_SIZE_BYTES(bs)); 2283 2284 bs->nextL2Line = 0; 2285 if (bytesRead == DRFLAC_CACHE_L2_SIZE_BYTES(bs)) { 2286 bs->cache = bs->cacheL2[bs->nextL2Line++]; 2287 return DRFLAC_TRUE; 2288 } 2289 2290 2291 /* 2292 If we get here it means we were unable to retrieve enough data to fill the entire L2 cache. It probably 2293 means we've just reached the end of the file. We need to move the valid data down to the end of the buffer 2294 and adjust the index of the next line accordingly. Also keep in mind that the L2 cache must be aligned to 2295 the size of the L1 so we'll need to seek backwards by any misaligned bytes. 2296 */ 2297 alignedL1LineCount = bytesRead / DRFLAC_CACHE_L1_SIZE_BYTES(bs); 2298 2299 /* We need to keep track of any unaligned bytes for later use. */ 2300 bs->unalignedByteCount = bytesRead - (alignedL1LineCount * DRFLAC_CACHE_L1_SIZE_BYTES(bs)); 2301 if (bs->unalignedByteCount > 0) { 2302 bs->unalignedCache = bs->cacheL2[alignedL1LineCount]; 2303 } 2304 2305 if (alignedL1LineCount > 0) { 2306 size_t offset = DRFLAC_CACHE_L2_LINE_COUNT(bs) - alignedL1LineCount; 2307 size_t i; 2308 for (i = alignedL1LineCount; i > 0; --i) { 2309 bs->cacheL2[i-1 + offset] = bs->cacheL2[i-1]; 2310 } 2311 2312 bs->nextL2Line = (drflac_uint32)offset; 2313 bs->cache = bs->cacheL2[bs->nextL2Line++]; 2314 return DRFLAC_TRUE; 2315 } else { 2316 /* If we get into this branch it means we weren't able to load any L1-aligned data. */ 2317 bs->nextL2Line = DRFLAC_CACHE_L2_LINE_COUNT(bs); 2318 return DRFLAC_FALSE; 2319 } 2320 } 2321 2322 static drflac_bool32 drflac__reload_cache(drflac_bs* bs) 2323 { 2324 size_t bytesRead; 2325 2326 #ifndef DR_FLAC_NO_CRC 2327 drflac__update_crc16(bs); 2328 #endif 2329 2330 /* Fast path. Try just moving the next value in the L2 cache to the L1 cache. */ 2331 if (drflac__reload_l1_cache_from_l2(bs)) { 2332 bs->cache = drflac__be2host__cache_line(bs->cache); 2333 bs->consumedBits = 0; 2334 #ifndef DR_FLAC_NO_CRC 2335 bs->crc16Cache = bs->cache; 2336 #endif 2337 return DRFLAC_TRUE; 2338 } 2339 2340 /* Slow path. */ 2341 2342 /* 2343 If we get here it means we have failed to load the L1 cache from the L2. Likely we've just reached the end of the stream and the last 2344 few bytes did not meet the alignment requirements for the L2 cache. In this case we need to fall back to a slower path and read the 2345 data from the unaligned cache. 2346 */ 2347 bytesRead = bs->unalignedByteCount; 2348 if (bytesRead == 0) { 2349 bs->consumedBits = DRFLAC_CACHE_L1_SIZE_BITS(bs); /* <-- The stream has been exhausted, so marked the bits as consumed. */ 2350 return DRFLAC_FALSE; 2351 } 2352 2353 DRFLAC_ASSERT(bytesRead < DRFLAC_CACHE_L1_SIZE_BYTES(bs)); 2354 bs->consumedBits = (drflac_uint32)(DRFLAC_CACHE_L1_SIZE_BYTES(bs) - bytesRead) * 8; 2355 2356 bs->cache = drflac__be2host__cache_line(bs->unalignedCache); 2357 bs->cache &= DRFLAC_CACHE_L1_SELECTION_MASK(DRFLAC_CACHE_L1_BITS_REMAINING(bs)); /* <-- Make sure the consumed bits are always set to zero. Other parts of the library depend on this property. */ 2358 bs->unalignedByteCount = 0; /* <-- At this point the unaligned bytes have been moved into the cache and we thus have no more unaligned bytes. */ 2359 2360 #ifndef DR_FLAC_NO_CRC 2361 bs->crc16Cache = bs->cache >> bs->consumedBits; 2362 bs->crc16CacheIgnoredBytes = bs->consumedBits >> 3; 2363 #endif 2364 return DRFLAC_TRUE; 2365 } 2366 2367 static void drflac__reset_cache(drflac_bs* bs) 2368 { 2369 bs->nextL2Line = DRFLAC_CACHE_L2_LINE_COUNT(bs); /* <-- This clears the L2 cache. */ 2370 bs->consumedBits = DRFLAC_CACHE_L1_SIZE_BITS(bs); /* <-- This clears the L1 cache. */ 2371 bs->cache = 0; 2372 bs->unalignedByteCount = 0; /* <-- This clears the trailing unaligned bytes. */ 2373 bs->unalignedCache = 0; 2374 2375 #ifndef DR_FLAC_NO_CRC 2376 bs->crc16Cache = 0; 2377 bs->crc16CacheIgnoredBytes = 0; 2378 #endif 2379 } 2380 2381 2382 static DRFLAC_INLINE drflac_bool32 drflac__read_uint32(drflac_bs* bs, unsigned int bitCount, drflac_uint32* pResultOut) 2383 { 2384 DRFLAC_ASSERT(bs != NULL); 2385 DRFLAC_ASSERT(pResultOut != NULL); 2386 DRFLAC_ASSERT(bitCount > 0); 2387 DRFLAC_ASSERT(bitCount <= 32); 2388 2389 if (bs->consumedBits == DRFLAC_CACHE_L1_SIZE_BITS(bs)) { 2390 if (!drflac__reload_cache(bs)) { 2391 return DRFLAC_FALSE; 2392 } 2393 } 2394 2395 if (bitCount <= DRFLAC_CACHE_L1_BITS_REMAINING(bs)) { 2396 /* 2397 If we want to load all 32-bits from a 32-bit cache we need to do it slightly differently because we can't do 2398 a 32-bit shift on a 32-bit integer. This will never be the case on 64-bit caches, so we can have a slightly 2399 more optimal solution for this. 2400 */ 2401 #ifdef DRFLAC_64BIT 2402 *pResultOut = (drflac_uint32)DRFLAC_CACHE_L1_SELECT_AND_SHIFT(bs, bitCount); 2403 bs->consumedBits += bitCount; 2404 bs->cache <<= bitCount; 2405 #else 2406 if (bitCount < DRFLAC_CACHE_L1_SIZE_BITS(bs)) { 2407 *pResultOut = (drflac_uint32)DRFLAC_CACHE_L1_SELECT_AND_SHIFT(bs, bitCount); 2408 bs->consumedBits += bitCount; 2409 bs->cache <<= bitCount; 2410 } else { 2411 /* Cannot shift by 32-bits, so need to do it differently. */ 2412 *pResultOut = (drflac_uint32)bs->cache; 2413 bs->consumedBits = DRFLAC_CACHE_L1_SIZE_BITS(bs); 2414 bs->cache = 0; 2415 } 2416 #endif 2417 2418 return DRFLAC_TRUE; 2419 } else { 2420 /* It straddles the cached data. It will never cover more than the next chunk. We just read the number in two parts and combine them. */ 2421 drflac_uint32 bitCountHi = DRFLAC_CACHE_L1_BITS_REMAINING(bs); 2422 drflac_uint32 bitCountLo = bitCount - bitCountHi; 2423 drflac_uint32 resultHi; 2424 2425 DRFLAC_ASSERT(bitCountHi > 0); 2426 DRFLAC_ASSERT(bitCountHi < 32); 2427 resultHi = (drflac_uint32)DRFLAC_CACHE_L1_SELECT_AND_SHIFT(bs, bitCountHi); 2428 2429 if (!drflac__reload_cache(bs)) { 2430 return DRFLAC_FALSE; 2431 } 2432 2433 *pResultOut = (resultHi << bitCountLo) | (drflac_uint32)DRFLAC_CACHE_L1_SELECT_AND_SHIFT(bs, bitCountLo); 2434 bs->consumedBits += bitCountLo; 2435 bs->cache <<= bitCountLo; 2436 return DRFLAC_TRUE; 2437 } 2438 } 2439 2440 static drflac_bool32 drflac__read_int32(drflac_bs* bs, unsigned int bitCount, drflac_int32* pResult) 2441 { 2442 drflac_uint32 result; 2443 2444 DRFLAC_ASSERT(bs != NULL); 2445 DRFLAC_ASSERT(pResult != NULL); 2446 DRFLAC_ASSERT(bitCount > 0); 2447 DRFLAC_ASSERT(bitCount <= 32); 2448 2449 if (!drflac__read_uint32(bs, bitCount, &result)) { 2450 return DRFLAC_FALSE; 2451 } 2452 2453 /* Do not attempt to shift by 32 as it's undefined. */ 2454 if (bitCount < 32) { 2455 drflac_uint32 signbit; 2456 signbit = ((result >> (bitCount-1)) & 0x01); 2457 result |= (~signbit + 1) << bitCount; 2458 } 2459 2460 *pResult = (drflac_int32)result; 2461 return DRFLAC_TRUE; 2462 } 2463 2464 #ifdef DRFLAC_64BIT 2465 static drflac_bool32 drflac__read_uint64(drflac_bs* bs, unsigned int bitCount, drflac_uint64* pResultOut) 2466 { 2467 drflac_uint32 resultHi; 2468 drflac_uint32 resultLo; 2469 2470 DRFLAC_ASSERT(bitCount <= 64); 2471 DRFLAC_ASSERT(bitCount > 32); 2472 2473 if (!drflac__read_uint32(bs, bitCount - 32, &resultHi)) { 2474 return DRFLAC_FALSE; 2475 } 2476 2477 if (!drflac__read_uint32(bs, 32, &resultLo)) { 2478 return DRFLAC_FALSE; 2479 } 2480 2481 *pResultOut = (((drflac_uint64)resultHi) << 32) | ((drflac_uint64)resultLo); 2482 return DRFLAC_TRUE; 2483 } 2484 #endif 2485 2486 /* Function below is unused, but leaving it here in case I need to quickly add it again. */ 2487 #if 0 2488 static drflac_bool32 drflac__read_int64(drflac_bs* bs, unsigned int bitCount, drflac_int64* pResultOut) 2489 { 2490 drflac_uint64 result; 2491 drflac_uint64 signbit; 2492 2493 DRFLAC_ASSERT(bitCount <= 64); 2494 2495 if (!drflac__read_uint64(bs, bitCount, &result)) { 2496 return DRFLAC_FALSE; 2497 } 2498 2499 signbit = ((result >> (bitCount-1)) & 0x01); 2500 result |= (~signbit + 1) << bitCount; 2501 2502 *pResultOut = (drflac_int64)result; 2503 return DRFLAC_TRUE; 2504 } 2505 #endif 2506 2507 static drflac_bool32 drflac__read_uint16(drflac_bs* bs, unsigned int bitCount, drflac_uint16* pResult) 2508 { 2509 drflac_uint32 result; 2510 2511 DRFLAC_ASSERT(bs != NULL); 2512 DRFLAC_ASSERT(pResult != NULL); 2513 DRFLAC_ASSERT(bitCount > 0); 2514 DRFLAC_ASSERT(bitCount <= 16); 2515 2516 if (!drflac__read_uint32(bs, bitCount, &result)) { 2517 return DRFLAC_FALSE; 2518 } 2519 2520 *pResult = (drflac_uint16)result; 2521 return DRFLAC_TRUE; 2522 } 2523 2524 #if 0 2525 static drflac_bool32 drflac__read_int16(drflac_bs* bs, unsigned int bitCount, drflac_int16* pResult) 2526 { 2527 drflac_int32 result; 2528 2529 DRFLAC_ASSERT(bs != NULL); 2530 DRFLAC_ASSERT(pResult != NULL); 2531 DRFLAC_ASSERT(bitCount > 0); 2532 DRFLAC_ASSERT(bitCount <= 16); 2533 2534 if (!drflac__read_int32(bs, bitCount, &result)) { 2535 return DRFLAC_FALSE; 2536 } 2537 2538 *pResult = (drflac_int16)result; 2539 return DRFLAC_TRUE; 2540 } 2541 #endif 2542 2543 static drflac_bool32 drflac__read_uint8(drflac_bs* bs, unsigned int bitCount, drflac_uint8* pResult) 2544 { 2545 drflac_uint32 result; 2546 2547 DRFLAC_ASSERT(bs != NULL); 2548 DRFLAC_ASSERT(pResult != NULL); 2549 DRFLAC_ASSERT(bitCount > 0); 2550 DRFLAC_ASSERT(bitCount <= 8); 2551 2552 if (!drflac__read_uint32(bs, bitCount, &result)) { 2553 return DRFLAC_FALSE; 2554 } 2555 2556 *pResult = (drflac_uint8)result; 2557 return DRFLAC_TRUE; 2558 } 2559 2560 static drflac_bool32 drflac__read_int8(drflac_bs* bs, unsigned int bitCount, drflac_int8* pResult) 2561 { 2562 drflac_int32 result; 2563 2564 DRFLAC_ASSERT(bs != NULL); 2565 DRFLAC_ASSERT(pResult != NULL); 2566 DRFLAC_ASSERT(bitCount > 0); 2567 DRFLAC_ASSERT(bitCount <= 8); 2568 2569 if (!drflac__read_int32(bs, bitCount, &result)) { 2570 return DRFLAC_FALSE; 2571 } 2572 2573 *pResult = (drflac_int8)result; 2574 return DRFLAC_TRUE; 2575 } 2576 2577 2578 static drflac_bool32 drflac__seek_bits(drflac_bs* bs, size_t bitsToSeek) 2579 { 2580 if (bitsToSeek <= DRFLAC_CACHE_L1_BITS_REMAINING(bs)) { 2581 bs->consumedBits += (drflac_uint32)bitsToSeek; 2582 bs->cache <<= bitsToSeek; 2583 return DRFLAC_TRUE; 2584 } else { 2585 /* It straddles the cached data. This function isn't called too frequently so I'm favouring simplicity here. */ 2586 bitsToSeek -= DRFLAC_CACHE_L1_BITS_REMAINING(bs); 2587 bs->consumedBits += DRFLAC_CACHE_L1_BITS_REMAINING(bs); 2588 bs->cache = 0; 2589 2590 /* Simple case. Seek in groups of the same number as bits that fit within a cache line. */ 2591 #ifdef DRFLAC_64BIT 2592 while (bitsToSeek >= DRFLAC_CACHE_L1_SIZE_BITS(bs)) { 2593 drflac_uint64 bin; 2594 if (!drflac__read_uint64(bs, DRFLAC_CACHE_L1_SIZE_BITS(bs), &bin)) { 2595 return DRFLAC_FALSE; 2596 } 2597 bitsToSeek -= DRFLAC_CACHE_L1_SIZE_BITS(bs); 2598 } 2599 #else 2600 while (bitsToSeek >= DRFLAC_CACHE_L1_SIZE_BITS(bs)) { 2601 drflac_uint32 bin; 2602 if (!drflac__read_uint32(bs, DRFLAC_CACHE_L1_SIZE_BITS(bs), &bin)) { 2603 return DRFLAC_FALSE; 2604 } 2605 bitsToSeek -= DRFLAC_CACHE_L1_SIZE_BITS(bs); 2606 } 2607 #endif 2608 2609 /* Whole leftover bytes. */ 2610 while (bitsToSeek >= 8) { 2611 drflac_uint8 bin; 2612 if (!drflac__read_uint8(bs, 8, &bin)) { 2613 return DRFLAC_FALSE; 2614 } 2615 bitsToSeek -= 8; 2616 } 2617 2618 /* Leftover bits. */ 2619 if (bitsToSeek > 0) { 2620 drflac_uint8 bin; 2621 if (!drflac__read_uint8(bs, (drflac_uint32)bitsToSeek, &bin)) { 2622 return DRFLAC_FALSE; 2623 } 2624 bitsToSeek = 0; /* <-- Necessary for the assert below. */ 2625 } 2626 2627 DRFLAC_ASSERT(bitsToSeek == 0); 2628 return DRFLAC_TRUE; 2629 } 2630 } 2631 2632 2633 /* This function moves the bit streamer to the first bit after the sync code (bit 15 of the of the frame header). It will also update the CRC-16. */ 2634 static drflac_bool32 drflac__find_and_seek_to_next_sync_code(drflac_bs* bs) 2635 { 2636 DRFLAC_ASSERT(bs != NULL); 2637 2638 /* 2639 The sync code is always aligned to 8 bits. This is convenient for us because it means we can do byte-aligned movements. The first 2640 thing to do is align to the next byte. 2641 */ 2642 if (!drflac__seek_bits(bs, DRFLAC_CACHE_L1_BITS_REMAINING(bs) & 7)) { 2643 return DRFLAC_FALSE; 2644 } 2645 2646 for (;;) { 2647 drflac_uint8 hi; 2648 2649 #ifndef DR_FLAC_NO_CRC 2650 drflac__reset_crc16(bs); 2651 #endif 2652 2653 if (!drflac__read_uint8(bs, 8, &hi)) { 2654 return DRFLAC_FALSE; 2655 } 2656 2657 if (hi == 0xFF) { 2658 drflac_uint8 lo; 2659 if (!drflac__read_uint8(bs, 6, &lo)) { 2660 return DRFLAC_FALSE; 2661 } 2662 2663 if (lo == 0x3E) { 2664 return DRFLAC_TRUE; 2665 } else { 2666 if (!drflac__seek_bits(bs, DRFLAC_CACHE_L1_BITS_REMAINING(bs) & 7)) { 2667 return DRFLAC_FALSE; 2668 } 2669 } 2670 } 2671 } 2672 2673 /* Should never get here. */ 2674 /*return DRFLAC_FALSE;*/ 2675 } 2676 2677 2678 #if defined(DRFLAC_HAS_LZCNT_INTRINSIC) 2679 #define DRFLAC_IMPLEMENT_CLZ_LZCNT 2680 #endif 2681 #if defined(_MSC_VER) && _MSC_VER >= 1400 && (defined(DRFLAC_X64) || defined(DRFLAC_X86)) && !defined(__clang__) 2682 #define DRFLAC_IMPLEMENT_CLZ_MSVC 2683 #endif 2684 #if defined(__WATCOMC__) && defined(__386__) 2685 #define DRFLAC_IMPLEMENT_CLZ_WATCOM 2686 #endif 2687 2688 static DRFLAC_INLINE drflac_uint32 drflac__clz_software(drflac_cache_t x) 2689 { 2690 drflac_uint32 n; 2691 static drflac_uint32 clz_table_4[] = { 2692 0, 2693 4, 2694 3, 3, 2695 2, 2, 2, 2, 2696 1, 1, 1, 1, 1, 1, 1, 1 2697 }; 2698 2699 if (x == 0) { 2700 return sizeof(x)*8; 2701 } 2702 2703 n = clz_table_4[x >> (sizeof(x)*8 - 4)]; 2704 if (n == 0) { 2705 #ifdef DRFLAC_64BIT 2706 if ((x & ((drflac_uint64)0xFFFFFFFF << 32)) == 0) { n = 32; x <<= 32; } 2707 if ((x & ((drflac_uint64)0xFFFF0000 << 32)) == 0) { n += 16; x <<= 16; } 2708 if ((x & ((drflac_uint64)0xFF000000 << 32)) == 0) { n += 8; x <<= 8; } 2709 if ((x & ((drflac_uint64)0xF0000000 << 32)) == 0) { n += 4; x <<= 4; } 2710 #else 2711 if ((x & 0xFFFF0000) == 0) { n = 16; x <<= 16; } 2712 if ((x & 0xFF000000) == 0) { n += 8; x <<= 8; } 2713 if ((x & 0xF0000000) == 0) { n += 4; x <<= 4; } 2714 #endif 2715 n += clz_table_4[x >> (sizeof(x)*8 - 4)]; 2716 } 2717 2718 return n - 1; 2719 } 2720 2721 #ifdef DRFLAC_IMPLEMENT_CLZ_LZCNT 2722 static DRFLAC_INLINE drflac_bool32 drflac__is_lzcnt_supported(void) 2723 { 2724 /* Fast compile time check for ARM. */ 2725 #if defined(DRFLAC_HAS_LZCNT_INTRINSIC) && defined(DRFLAC_ARM) && (defined(__ARM_ARCH) && __ARM_ARCH >= 5) 2726 return DRFLAC_TRUE; 2727 #else 2728 /* If the compiler itself does not support the intrinsic then we'll need to return false. */ 2729 #ifdef DRFLAC_HAS_LZCNT_INTRINSIC 2730 return drflac__gIsLZCNTSupported; 2731 #else 2732 return DRFLAC_FALSE; 2733 #endif 2734 #endif 2735 } 2736 2737 static DRFLAC_INLINE drflac_uint32 drflac__clz_lzcnt(drflac_cache_t x) 2738 { 2739 /* 2740 It's critical for competitive decoding performance that this function be highly optimal. With MSVC we can use the __lzcnt64() and __lzcnt() intrinsics 2741 to achieve good performance, however on GCC and Clang it's a little bit more annoying. The __builtin_clzl() and __builtin_clzll() intrinsics leave 2742 it undefined as to the return value when `x` is 0. We need this to be well defined as returning 32 or 64, depending on whether or not it's a 32- or 2743 64-bit build. To work around this we would need to add a conditional to check for the x = 0 case, but this creates unnecessary inefficiency. To work 2744 around this problem I have written some inline assembly to emit the LZCNT (x86) or CLZ (ARM) instruction directly which removes the need to include 2745 the conditional. This has worked well in the past, but for some reason Clang's MSVC compatible driver, clang-cl, does not seem to be handling this 2746 in the same way as the normal Clang driver. It seems that `clang-cl` is just outputting the wrong results sometimes, maybe due to some register 2747 getting clobbered? 2748 2749 I'm not sure if this is a bug with dr_flac's inlined assembly (most likely), a bug in `clang-cl` or just a misunderstanding on my part with inline 2750 assembly rules for `clang-cl`. If somebody can identify an error in dr_flac's inlined assembly I'm happy to get that fixed. 2751 2752 Fortunately there is an easy workaround for this. Clang implements MSVC-specific intrinsics for compatibility. It also defines _MSC_VER for extra 2753 compatibility. We can therefore just check for _MSC_VER and use the MSVC intrinsic which, fortunately for us, Clang supports. It would still be nice 2754 to know how to fix the inlined assembly for correctness sake, however. 2755 */ 2756 2757 #if defined(_MSC_VER) /*&& !defined(__clang__)*/ /* <-- Intentionally wanting Clang to use the MSVC __lzcnt64/__lzcnt intrinsics due to above ^. */ 2758 #ifdef DRFLAC_64BIT 2759 return (drflac_uint32)__lzcnt64(x); 2760 #else 2761 return (drflac_uint32)__lzcnt(x); 2762 #endif 2763 #else 2764 #if defined(__GNUC__) || defined(__clang__) 2765 #if defined(DRFLAC_X64) 2766 { 2767 drflac_uint64 r; 2768 __asm__ __volatile__ ( 2769 "lzcnt{ %1, %0| %0, %1}" : "=r"(r) : "r"(x) : "cc" 2770 ); 2771 2772 return (drflac_uint32)r; 2773 } 2774 #elif defined(DRFLAC_X86) 2775 { 2776 drflac_uint32 r; 2777 __asm__ __volatile__ ( 2778 "lzcnt{l %1, %0| %0, %1}" : "=r"(r) : "r"(x) : "cc" 2779 ); 2780 2781 return r; 2782 } 2783 #elif defined(DRFLAC_ARM) && (defined(__ARM_ARCH) && __ARM_ARCH >= 5) && !defined(DRFLAC_64BIT) /* <-- I haven't tested 64-bit inline assembly, so only enabling this for the 32-bit build for now. */ 2784 { 2785 unsigned int r; 2786 __asm__ __volatile__ ( 2787 #if defined(DRFLAC_64BIT) 2788 "clz %w[out], %w[in]" : [out]"=r"(r) : [in]"r"(x) /* <-- This is untested. If someone in the community could test this, that would be appreciated! */ 2789 #else 2790 "clz %[out], %[in]" : [out]"=r"(r) : [in]"r"(x) 2791 #endif 2792 ); 2793 2794 return r; 2795 } 2796 #else 2797 if (x == 0) { 2798 return sizeof(x)*8; 2799 } 2800 #ifdef DRFLAC_64BIT 2801 return (drflac_uint32)__builtin_clzll((drflac_uint64)x); 2802 #else 2803 return (drflac_uint32)__builtin_clzl((drflac_uint32)x); 2804 #endif 2805 #endif 2806 #else 2807 /* Unsupported compiler. */ 2808 #error "This compiler does not support the lzcnt intrinsic." 2809 #endif 2810 #endif 2811 } 2812 #endif 2813 2814 #ifdef DRFLAC_IMPLEMENT_CLZ_MSVC 2815 #include <intrin.h> /* For BitScanReverse(). */ 2816 2817 static DRFLAC_INLINE drflac_uint32 drflac__clz_msvc(drflac_cache_t x) 2818 { 2819 drflac_uint32 n; 2820 2821 if (x == 0) { 2822 return sizeof(x)*8; 2823 } 2824 2825 #ifdef DRFLAC_64BIT 2826 _BitScanReverse64((unsigned long*)&n, x); 2827 #else 2828 _BitScanReverse((unsigned long*)&n, x); 2829 #endif 2830 return sizeof(x)*8 - n - 1; 2831 } 2832 #endif 2833 2834 #ifdef DRFLAC_IMPLEMENT_CLZ_WATCOM 2835 static __inline drflac_uint32 drflac__clz_watcom (drflac_uint32); 2836 #pragma aux drflac__clz_watcom = \ 2837 "bsr eax, eax" \ 2838 "xor eax, 31" \ 2839 parm [eax] nomemory \ 2840 value [eax] \ 2841 modify exact [eax] nomemory; 2842 #endif 2843 2844 static DRFLAC_INLINE drflac_uint32 drflac__clz(drflac_cache_t x) 2845 { 2846 #ifdef DRFLAC_IMPLEMENT_CLZ_LZCNT 2847 if (drflac__is_lzcnt_supported()) { 2848 return drflac__clz_lzcnt(x); 2849 } else 2850 #endif 2851 { 2852 #ifdef DRFLAC_IMPLEMENT_CLZ_MSVC 2853 return drflac__clz_msvc(x); 2854 #elif defined(DRFLAC_IMPLEMENT_CLZ_WATCOM) 2855 return (x == 0) ? sizeof(x)*8 : drflac__clz_watcom(x); 2856 #else 2857 return drflac__clz_software(x); 2858 #endif 2859 } 2860 } 2861 2862 2863 static DRFLAC_INLINE drflac_bool32 drflac__seek_past_next_set_bit(drflac_bs* bs, unsigned int* pOffsetOut) 2864 { 2865 drflac_uint32 zeroCounter = 0; 2866 drflac_uint32 setBitOffsetPlus1; 2867 2868 while (bs->cache == 0) { 2869 zeroCounter += (drflac_uint32)DRFLAC_CACHE_L1_BITS_REMAINING(bs); 2870 if (!drflac__reload_cache(bs)) { 2871 return DRFLAC_FALSE; 2872 } 2873 } 2874 2875 setBitOffsetPlus1 = drflac__clz(bs->cache); 2876 setBitOffsetPlus1 += 1; 2877 2878 bs->consumedBits += setBitOffsetPlus1; 2879 bs->cache <<= setBitOffsetPlus1; 2880 2881 *pOffsetOut = zeroCounter + setBitOffsetPlus1 - 1; 2882 return DRFLAC_TRUE; 2883 } 2884 2885 2886 2887 static drflac_bool32 drflac__seek_to_byte(drflac_bs* bs, drflac_uint64 offsetFromStart) 2888 { 2889 DRFLAC_ASSERT(bs != NULL); 2890 DRFLAC_ASSERT(offsetFromStart > 0); 2891 2892 /* 2893 Seeking from the start is not quite as trivial as it sounds because the onSeek callback takes a signed 32-bit integer (which 2894 is intentional because it simplifies the implementation of the onSeek callbacks), however offsetFromStart is unsigned 64-bit. 2895 To resolve we just need to do an initial seek from the start, and then a series of offset seeks to make up the remainder. 2896 */ 2897 if (offsetFromStart > 0x7FFFFFFF) { 2898 drflac_uint64 bytesRemaining = offsetFromStart; 2899 if (!bs->onSeek(bs->pUserData, 0x7FFFFFFF, drflac_seek_origin_start)) { 2900 return DRFLAC_FALSE; 2901 } 2902 bytesRemaining -= 0x7FFFFFFF; 2903 2904 while (bytesRemaining > 0x7FFFFFFF) { 2905 if (!bs->onSeek(bs->pUserData, 0x7FFFFFFF, drflac_seek_origin_current)) { 2906 return DRFLAC_FALSE; 2907 } 2908 bytesRemaining -= 0x7FFFFFFF; 2909 } 2910 2911 if (bytesRemaining > 0) { 2912 if (!bs->onSeek(bs->pUserData, (int)bytesRemaining, drflac_seek_origin_current)) { 2913 return DRFLAC_FALSE; 2914 } 2915 } 2916 } else { 2917 if (!bs->onSeek(bs->pUserData, (int)offsetFromStart, drflac_seek_origin_start)) { 2918 return DRFLAC_FALSE; 2919 } 2920 } 2921 2922 /* The cache should be reset to force a reload of fresh data from the client. */ 2923 drflac__reset_cache(bs); 2924 return DRFLAC_TRUE; 2925 } 2926 2927 2928 static drflac_result drflac__read_utf8_coded_number(drflac_bs* bs, drflac_uint64* pNumberOut, drflac_uint8* pCRCOut) 2929 { 2930 drflac_uint8 crc; 2931 drflac_uint64 result; 2932 drflac_uint8 utf8[7] = {0}; 2933 int byteCount; 2934 int i; 2935 2936 DRFLAC_ASSERT(bs != NULL); 2937 DRFLAC_ASSERT(pNumberOut != NULL); 2938 DRFLAC_ASSERT(pCRCOut != NULL); 2939 2940 crc = *pCRCOut; 2941 2942 if (!drflac__read_uint8(bs, 8, utf8)) { 2943 *pNumberOut = 0; 2944 return DRFLAC_AT_END; 2945 } 2946 crc = drflac_crc8(crc, utf8[0], 8); 2947 2948 if ((utf8[0] & 0x80) == 0) { 2949 *pNumberOut = utf8[0]; 2950 *pCRCOut = crc; 2951 return DRFLAC_SUCCESS; 2952 } 2953 2954 /*byteCount = 1;*/ 2955 if ((utf8[0] & 0xE0) == 0xC0) { 2956 byteCount = 2; 2957 } else if ((utf8[0] & 0xF0) == 0xE0) { 2958 byteCount = 3; 2959 } else if ((utf8[0] & 0xF8) == 0xF0) { 2960 byteCount = 4; 2961 } else if ((utf8[0] & 0xFC) == 0xF8) { 2962 byteCount = 5; 2963 } else if ((utf8[0] & 0xFE) == 0xFC) { 2964 byteCount = 6; 2965 } else if ((utf8[0] & 0xFF) == 0xFE) { 2966 byteCount = 7; 2967 } else { 2968 *pNumberOut = 0; 2969 return DRFLAC_CRC_MISMATCH; /* Bad UTF-8 encoding. */ 2970 } 2971 2972 /* Read extra bytes. */ 2973 DRFLAC_ASSERT(byteCount > 1); 2974 2975 result = (drflac_uint64)(utf8[0] & (0xFF >> (byteCount + 1))); 2976 for (i = 1; i < byteCount; ++i) { 2977 if (!drflac__read_uint8(bs, 8, utf8 + i)) { 2978 *pNumberOut = 0; 2979 return DRFLAC_AT_END; 2980 } 2981 crc = drflac_crc8(crc, utf8[i], 8); 2982 2983 result = (result << 6) | (utf8[i] & 0x3F); 2984 } 2985 2986 *pNumberOut = result; 2987 *pCRCOut = crc; 2988 return DRFLAC_SUCCESS; 2989 } 2990 2991 2992 2993 /* 2994 The next two functions are responsible for calculating the prediction. 2995 2996 When the bits per sample is >16 we need to use 64-bit integer arithmetic because otherwise we'll run out of precision. It's 2997 safe to assume this will be slower on 32-bit platforms so we use a more optimal solution when the bits per sample is <=16. 2998 */ 2999 static DRFLAC_INLINE drflac_int32 drflac__calculate_prediction_32(drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pDecodedSamples) 3000 { 3001 drflac_int32 prediction = 0; 3002 3003 DRFLAC_ASSERT(order <= 32); 3004 3005 /* 32-bit version. */ 3006 3007 /* VC++ optimizes this to a single jmp. I've not yet verified this for other compilers. */ 3008 switch (order) 3009 { 3010 case 32: prediction += coefficients[31] * pDecodedSamples[-32]; 3011 case 31: prediction += coefficients[30] * pDecodedSamples[-31]; 3012 case 30: prediction += coefficients[29] * pDecodedSamples[-30]; 3013 case 29: prediction += coefficients[28] * pDecodedSamples[-29]; 3014 case 28: prediction += coefficients[27] * pDecodedSamples[-28]; 3015 case 27: prediction += coefficients[26] * pDecodedSamples[-27]; 3016 case 26: prediction += coefficients[25] * pDecodedSamples[-26]; 3017 case 25: prediction += coefficients[24] * pDecodedSamples[-25]; 3018 case 24: prediction += coefficients[23] * pDecodedSamples[-24]; 3019 case 23: prediction += coefficients[22] * pDecodedSamples[-23]; 3020 case 22: prediction += coefficients[21] * pDecodedSamples[-22]; 3021 case 21: prediction += coefficients[20] * pDecodedSamples[-21]; 3022 case 20: prediction += coefficients[19] * pDecodedSamples[-20]; 3023 case 19: prediction += coefficients[18] * pDecodedSamples[-19]; 3024 case 18: prediction += coefficients[17] * pDecodedSamples[-18]; 3025 case 17: prediction += coefficients[16] * pDecodedSamples[-17]; 3026 case 16: prediction += coefficients[15] * pDecodedSamples[-16]; 3027 case 15: prediction += coefficients[14] * pDecodedSamples[-15]; 3028 case 14: prediction += coefficients[13] * pDecodedSamples[-14]; 3029 case 13: prediction += coefficients[12] * pDecodedSamples[-13]; 3030 case 12: prediction += coefficients[11] * pDecodedSamples[-12]; 3031 case 11: prediction += coefficients[10] * pDecodedSamples[-11]; 3032 case 10: prediction += coefficients[ 9] * pDecodedSamples[-10]; 3033 case 9: prediction += coefficients[ 8] * pDecodedSamples[- 9]; 3034 case 8: prediction += coefficients[ 7] * pDecodedSamples[- 8]; 3035 case 7: prediction += coefficients[ 6] * pDecodedSamples[- 7]; 3036 case 6: prediction += coefficients[ 5] * pDecodedSamples[- 6]; 3037 case 5: prediction += coefficients[ 4] * pDecodedSamples[- 5]; 3038 case 4: prediction += coefficients[ 3] * pDecodedSamples[- 4]; 3039 case 3: prediction += coefficients[ 2] * pDecodedSamples[- 3]; 3040 case 2: prediction += coefficients[ 1] * pDecodedSamples[- 2]; 3041 case 1: prediction += coefficients[ 0] * pDecodedSamples[- 1]; 3042 } 3043 3044 return (drflac_int32)(prediction >> shift); 3045 } 3046 3047 static DRFLAC_INLINE drflac_int32 drflac__calculate_prediction_64(drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pDecodedSamples) 3048 { 3049 drflac_int64 prediction; 3050 3051 DRFLAC_ASSERT(order <= 32); 3052 3053 /* 64-bit version. */ 3054 3055 /* This method is faster on the 32-bit build when compiling with VC++. See note below. */ 3056 #ifndef DRFLAC_64BIT 3057 if (order == 8) 3058 { 3059 prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1]; 3060 prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2]; 3061 prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3]; 3062 prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4]; 3063 prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5]; 3064 prediction += coefficients[5] * (drflac_int64)pDecodedSamples[-6]; 3065 prediction += coefficients[6] * (drflac_int64)pDecodedSamples[-7]; 3066 prediction += coefficients[7] * (drflac_int64)pDecodedSamples[-8]; 3067 } 3068 else if (order == 7) 3069 { 3070 prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1]; 3071 prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2]; 3072 prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3]; 3073 prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4]; 3074 prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5]; 3075 prediction += coefficients[5] * (drflac_int64)pDecodedSamples[-6]; 3076 prediction += coefficients[6] * (drflac_int64)pDecodedSamples[-7]; 3077 } 3078 else if (order == 3) 3079 { 3080 prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1]; 3081 prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2]; 3082 prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3]; 3083 } 3084 else if (order == 6) 3085 { 3086 prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1]; 3087 prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2]; 3088 prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3]; 3089 prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4]; 3090 prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5]; 3091 prediction += coefficients[5] * (drflac_int64)pDecodedSamples[-6]; 3092 } 3093 else if (order == 5) 3094 { 3095 prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1]; 3096 prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2]; 3097 prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3]; 3098 prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4]; 3099 prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5]; 3100 } 3101 else if (order == 4) 3102 { 3103 prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1]; 3104 prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2]; 3105 prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3]; 3106 prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4]; 3107 } 3108 else if (order == 12) 3109 { 3110 prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1]; 3111 prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2]; 3112 prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3]; 3113 prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4]; 3114 prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5]; 3115 prediction += coefficients[5] * (drflac_int64)pDecodedSamples[-6]; 3116 prediction += coefficients[6] * (drflac_int64)pDecodedSamples[-7]; 3117 prediction += coefficients[7] * (drflac_int64)pDecodedSamples[-8]; 3118 prediction += coefficients[8] * (drflac_int64)pDecodedSamples[-9]; 3119 prediction += coefficients[9] * (drflac_int64)pDecodedSamples[-10]; 3120 prediction += coefficients[10] * (drflac_int64)pDecodedSamples[-11]; 3121 prediction += coefficients[11] * (drflac_int64)pDecodedSamples[-12]; 3122 } 3123 else if (order == 2) 3124 { 3125 prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1]; 3126 prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2]; 3127 } 3128 else if (order == 1) 3129 { 3130 prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1]; 3131 } 3132 else if (order == 10) 3133 { 3134 prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1]; 3135 prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2]; 3136 prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3]; 3137 prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4]; 3138 prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5]; 3139 prediction += coefficients[5] * (drflac_int64)pDecodedSamples[-6]; 3140 prediction += coefficients[6] * (drflac_int64)pDecodedSamples[-7]; 3141 prediction += coefficients[7] * (drflac_int64)pDecodedSamples[-8]; 3142 prediction += coefficients[8] * (drflac_int64)pDecodedSamples[-9]; 3143 prediction += coefficients[9] * (drflac_int64)pDecodedSamples[-10]; 3144 } 3145 else if (order == 9) 3146 { 3147 prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1]; 3148 prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2]; 3149 prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3]; 3150 prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4]; 3151 prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5]; 3152 prediction += coefficients[5] * (drflac_int64)pDecodedSamples[-6]; 3153 prediction += coefficients[6] * (drflac_int64)pDecodedSamples[-7]; 3154 prediction += coefficients[7] * (drflac_int64)pDecodedSamples[-8]; 3155 prediction += coefficients[8] * (drflac_int64)pDecodedSamples[-9]; 3156 } 3157 else if (order == 11) 3158 { 3159 prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1]; 3160 prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2]; 3161 prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3]; 3162 prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4]; 3163 prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5]; 3164 prediction += coefficients[5] * (drflac_int64)pDecodedSamples[-6]; 3165 prediction += coefficients[6] * (drflac_int64)pDecodedSamples[-7]; 3166 prediction += coefficients[7] * (drflac_int64)pDecodedSamples[-8]; 3167 prediction += coefficients[8] * (drflac_int64)pDecodedSamples[-9]; 3168 prediction += coefficients[9] * (drflac_int64)pDecodedSamples[-10]; 3169 prediction += coefficients[10] * (drflac_int64)pDecodedSamples[-11]; 3170 } 3171 else 3172 { 3173 int j; 3174 3175 prediction = 0; 3176 for (j = 0; j < (int)order; ++j) { 3177 prediction += coefficients[j] * (drflac_int64)pDecodedSamples[-j-1]; 3178 } 3179 } 3180 #endif 3181 3182 /* 3183 VC++ optimizes this to a single jmp instruction, but only the 64-bit build. The 32-bit build generates less efficient code for some 3184 reason. The ugly version above is faster so we'll just switch between the two depending on the target platform. 3185 */ 3186 #ifdef DRFLAC_64BIT 3187 prediction = 0; 3188 switch (order) 3189 { 3190 case 32: prediction += coefficients[31] * (drflac_int64)pDecodedSamples[-32]; 3191 case 31: prediction += coefficients[30] * (drflac_int64)pDecodedSamples[-31]; 3192 case 30: prediction += coefficients[29] * (drflac_int64)pDecodedSamples[-30]; 3193 case 29: prediction += coefficients[28] * (drflac_int64)pDecodedSamples[-29]; 3194 case 28: prediction += coefficients[27] * (drflac_int64)pDecodedSamples[-28]; 3195 case 27: prediction += coefficients[26] * (drflac_int64)pDecodedSamples[-27]; 3196 case 26: prediction += coefficients[25] * (drflac_int64)pDecodedSamples[-26]; 3197 case 25: prediction += coefficients[24] * (drflac_int64)pDecodedSamples[-25]; 3198 case 24: prediction += coefficients[23] * (drflac_int64)pDecodedSamples[-24]; 3199 case 23: prediction += coefficients[22] * (drflac_int64)pDecodedSamples[-23]; 3200 case 22: prediction += coefficients[21] * (drflac_int64)pDecodedSamples[-22]; 3201 case 21: prediction += coefficients[20] * (drflac_int64)pDecodedSamples[-21]; 3202 case 20: prediction += coefficients[19] * (drflac_int64)pDecodedSamples[-20]; 3203 case 19: prediction += coefficients[18] * (drflac_int64)pDecodedSamples[-19]; 3204 case 18: prediction += coefficients[17] * (drflac_int64)pDecodedSamples[-18]; 3205 case 17: prediction += coefficients[16] * (drflac_int64)pDecodedSamples[-17]; 3206 case 16: prediction += coefficients[15] * (drflac_int64)pDecodedSamples[-16]; 3207 case 15: prediction += coefficients[14] * (drflac_int64)pDecodedSamples[-15]; 3208 case 14: prediction += coefficients[13] * (drflac_int64)pDecodedSamples[-14]; 3209 case 13: prediction += coefficients[12] * (drflac_int64)pDecodedSamples[-13]; 3210 case 12: prediction += coefficients[11] * (drflac_int64)pDecodedSamples[-12]; 3211 case 11: prediction += coefficients[10] * (drflac_int64)pDecodedSamples[-11]; 3212 case 10: prediction += coefficients[ 9] * (drflac_int64)pDecodedSamples[-10]; 3213 case 9: prediction += coefficients[ 8] * (drflac_int64)pDecodedSamples[- 9]; 3214 case 8: prediction += coefficients[ 7] * (drflac_int64)pDecodedSamples[- 8]; 3215 case 7: prediction += coefficients[ 6] * (drflac_int64)pDecodedSamples[- 7]; 3216 case 6: prediction += coefficients[ 5] * (drflac_int64)pDecodedSamples[- 6]; 3217 case 5: prediction += coefficients[ 4] * (drflac_int64)pDecodedSamples[- 5]; 3218 case 4: prediction += coefficients[ 3] * (drflac_int64)pDecodedSamples[- 4]; 3219 case 3: prediction += coefficients[ 2] * (drflac_int64)pDecodedSamples[- 3]; 3220 case 2: prediction += coefficients[ 1] * (drflac_int64)pDecodedSamples[- 2]; 3221 case 1: prediction += coefficients[ 0] * (drflac_int64)pDecodedSamples[- 1]; 3222 } 3223 #endif 3224 3225 return (drflac_int32)(prediction >> shift); 3226 } 3227 3228 3229 #if 0 3230 /* 3231 Reference implementation for reading and decoding samples with residual. This is intentionally left unoptimized for the 3232 sake of readability and should only be used as a reference. 3233 */ 3234 static drflac_bool32 drflac__decode_samples_with_residual__rice__reference(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut) 3235 { 3236 drflac_uint32 i; 3237 3238 DRFLAC_ASSERT(bs != NULL); 3239 DRFLAC_ASSERT(pSamplesOut != NULL); 3240 3241 for (i = 0; i < count; ++i) { 3242 drflac_uint32 zeroCounter = 0; 3243 for (;;) { 3244 drflac_uint8 bit; 3245 if (!drflac__read_uint8(bs, 1, &bit)) { 3246 return DRFLAC_FALSE; 3247 } 3248 3249 if (bit == 0) { 3250 zeroCounter += 1; 3251 } else { 3252 break; 3253 } 3254 } 3255 3256 drflac_uint32 decodedRice; 3257 if (riceParam > 0) { 3258 if (!drflac__read_uint32(bs, riceParam, &decodedRice)) { 3259 return DRFLAC_FALSE; 3260 } 3261 } else { 3262 decodedRice = 0; 3263 } 3264 3265 decodedRice |= (zeroCounter << riceParam); 3266 if ((decodedRice & 0x01)) { 3267 decodedRice = ~(decodedRice >> 1); 3268 } else { 3269 decodedRice = (decodedRice >> 1); 3270 } 3271 3272 3273 if (bitsPerSample+shift >= 32) { 3274 pSamplesOut[i] = decodedRice + drflac__calculate_prediction_64(order, shift, coefficients, pSamplesOut + i); 3275 } else { 3276 pSamplesOut[i] = decodedRice + drflac__calculate_prediction_32(order, shift, coefficients, pSamplesOut + i); 3277 } 3278 } 3279 3280 return DRFLAC_TRUE; 3281 } 3282 #endif 3283 3284 #if 0 3285 static drflac_bool32 drflac__read_rice_parts__reference(drflac_bs* bs, drflac_uint8 riceParam, drflac_uint32* pZeroCounterOut, drflac_uint32* pRiceParamPartOut) 3286 { 3287 drflac_uint32 zeroCounter = 0; 3288 drflac_uint32 decodedRice; 3289 3290 for (;;) { 3291 drflac_uint8 bit; 3292 if (!drflac__read_uint8(bs, 1, &bit)) { 3293 return DRFLAC_FALSE; 3294 } 3295 3296 if (bit == 0) { 3297 zeroCounter += 1; 3298 } else { 3299 break; 3300 } 3301 } 3302 3303 if (riceParam > 0) { 3304 if (!drflac__read_uint32(bs, riceParam, &decodedRice)) { 3305 return DRFLAC_FALSE; 3306 } 3307 } else { 3308 decodedRice = 0; 3309 } 3310 3311 *pZeroCounterOut = zeroCounter; 3312 *pRiceParamPartOut = decodedRice; 3313 return DRFLAC_TRUE; 3314 } 3315 #endif 3316 3317 #if 0 3318 static DRFLAC_INLINE drflac_bool32 drflac__read_rice_parts(drflac_bs* bs, drflac_uint8 riceParam, drflac_uint32* pZeroCounterOut, drflac_uint32* pRiceParamPartOut) 3319 { 3320 drflac_cache_t riceParamMask; 3321 drflac_uint32 zeroCounter; 3322 drflac_uint32 setBitOffsetPlus1; 3323 drflac_uint32 riceParamPart; 3324 drflac_uint32 riceLength; 3325 3326 DRFLAC_ASSERT(riceParam > 0); /* <-- riceParam should never be 0. drflac__read_rice_parts__param_equals_zero() should be used instead for this case. */ 3327 3328 riceParamMask = DRFLAC_CACHE_L1_SELECTION_MASK(riceParam); 3329 3330 zeroCounter = 0; 3331 while (bs->cache == 0) { 3332 zeroCounter += (drflac_uint32)DRFLAC_CACHE_L1_BITS_REMAINING(bs); 3333 if (!drflac__reload_cache(bs)) { 3334 return DRFLAC_FALSE; 3335 } 3336 } 3337 3338 setBitOffsetPlus1 = drflac__clz(bs->cache); 3339 zeroCounter += setBitOffsetPlus1; 3340 setBitOffsetPlus1 += 1; 3341 3342 riceLength = setBitOffsetPlus1 + riceParam; 3343 if (riceLength < DRFLAC_CACHE_L1_BITS_REMAINING(bs)) { 3344 riceParamPart = (drflac_uint32)((bs->cache & (riceParamMask >> setBitOffsetPlus1)) >> DRFLAC_CACHE_L1_SELECTION_SHIFT(bs, riceLength)); 3345 3346 bs->consumedBits += riceLength; 3347 bs->cache <<= riceLength; 3348 } else { 3349 drflac_uint32 bitCountLo; 3350 drflac_cache_t resultHi; 3351 3352 bs->consumedBits += riceLength; 3353 bs->cache <<= setBitOffsetPlus1 & (DRFLAC_CACHE_L1_SIZE_BITS(bs)-1); /* <-- Equivalent to "if (setBitOffsetPlus1 < DRFLAC_CACHE_L1_SIZE_BITS(bs)) { bs->cache <<= setBitOffsetPlus1; }" */ 3354 3355 /* It straddles the cached data. It will never cover more than the next chunk. We just read the number in two parts and combine them. */ 3356 bitCountLo = bs->consumedBits - DRFLAC_CACHE_L1_SIZE_BITS(bs); 3357 resultHi = DRFLAC_CACHE_L1_SELECT_AND_SHIFT(bs, riceParam); /* <-- Use DRFLAC_CACHE_L1_SELECT_AND_SHIFT_SAFE() if ever this function allows riceParam=0. */ 3358 3359 if (bs->nextL2Line < DRFLAC_CACHE_L2_LINE_COUNT(bs)) { 3360 #ifndef DR_FLAC_NO_CRC 3361 drflac__update_crc16(bs); 3362 #endif 3363 bs->cache = drflac__be2host__cache_line(bs->cacheL2[bs->nextL2Line++]); 3364 bs->consumedBits = 0; 3365 #ifndef DR_FLAC_NO_CRC 3366 bs->crc16Cache = bs->cache; 3367 #endif 3368 } else { 3369 /* Slow path. We need to fetch more data from the client. */ 3370 if (!drflac__reload_cache(bs)) { 3371 return DRFLAC_FALSE; 3372 } 3373 } 3374 3375 riceParamPart = (drflac_uint32)(resultHi | DRFLAC_CACHE_L1_SELECT_AND_SHIFT_SAFE(bs, bitCountLo)); 3376 3377 bs->consumedBits += bitCountLo; 3378 bs->cache <<= bitCountLo; 3379 } 3380 3381 pZeroCounterOut[0] = zeroCounter; 3382 pRiceParamPartOut[0] = riceParamPart; 3383 3384 return DRFLAC_TRUE; 3385 } 3386 #endif 3387 3388 static DRFLAC_INLINE drflac_bool32 drflac__read_rice_parts_x1(drflac_bs* bs, drflac_uint8 riceParam, drflac_uint32* pZeroCounterOut, drflac_uint32* pRiceParamPartOut) 3389 { 3390 drflac_uint32 riceParamPlus1 = riceParam + 1; 3391 /*drflac_cache_t riceParamPlus1Mask = DRFLAC_CACHE_L1_SELECTION_MASK(riceParamPlus1);*/ 3392 drflac_uint32 riceParamPlus1Shift = DRFLAC_CACHE_L1_SELECTION_SHIFT(bs, riceParamPlus1); 3393 drflac_uint32 riceParamPlus1MaxConsumedBits = DRFLAC_CACHE_L1_SIZE_BITS(bs) - riceParamPlus1; 3394 3395 /* 3396 The idea here is to use local variables for the cache in an attempt to encourage the compiler to store them in registers. I have 3397 no idea how this will work in practice... 3398 */ 3399 drflac_cache_t bs_cache = bs->cache; 3400 drflac_uint32 bs_consumedBits = bs->consumedBits; 3401 3402 /* The first thing to do is find the first unset bit. Most likely a bit will be set in the current cache line. */ 3403 drflac_uint32 lzcount = drflac__clz(bs_cache); 3404 if (lzcount < sizeof(bs_cache)*8) { 3405 pZeroCounterOut[0] = lzcount; 3406 3407 /* 3408 It is most likely that the riceParam part (which comes after the zero counter) is also on this cache line. When extracting 3409 this, we include the set bit from the unary coded part because it simplifies cache management. This bit will be handled 3410 outside of this function at a higher level. 3411 */ 3412 extract_rice_param_part: 3413 bs_cache <<= lzcount; 3414 bs_consumedBits += lzcount; 3415 3416 if (bs_consumedBits <= riceParamPlus1MaxConsumedBits) { 3417 /* Getting here means the rice parameter part is wholly contained within the current cache line. */ 3418 pRiceParamPartOut[0] = (drflac_uint32)(bs_cache >> riceParamPlus1Shift); 3419 bs_cache <<= riceParamPlus1; 3420 bs_consumedBits += riceParamPlus1; 3421 } else { 3422 drflac_uint32 riceParamPartHi; 3423 drflac_uint32 riceParamPartLo; 3424 drflac_uint32 riceParamPartLoBitCount; 3425 3426 /* 3427 Getting here means the rice parameter part straddles the cache line. We need to read from the tail of the current cache 3428 line, reload the cache, and then combine it with the head of the next cache line. 3429 */ 3430 3431 /* Grab the high part of the rice parameter part. */ 3432 riceParamPartHi = (drflac_uint32)(bs_cache >> riceParamPlus1Shift); 3433 3434 /* Before reloading the cache we need to grab the size in bits of the low part. */ 3435 riceParamPartLoBitCount = bs_consumedBits - riceParamPlus1MaxConsumedBits; 3436 DRFLAC_ASSERT(riceParamPartLoBitCount > 0 && riceParamPartLoBitCount < 32); 3437 3438 /* Now reload the cache. */ 3439 if (bs->nextL2Line < DRFLAC_CACHE_L2_LINE_COUNT(bs)) { 3440 #ifndef DR_FLAC_NO_CRC 3441 drflac__update_crc16(bs); 3442 #endif 3443 bs_cache = drflac__be2host__cache_line(bs->cacheL2[bs->nextL2Line++]); 3444 bs_consumedBits = riceParamPartLoBitCount; 3445 #ifndef DR_FLAC_NO_CRC 3446 bs->crc16Cache = bs_cache; 3447 #endif 3448 } else { 3449 /* Slow path. We need to fetch more data from the client. */ 3450 if (!drflac__reload_cache(bs)) { 3451 return DRFLAC_FALSE; 3452 } 3453 3454 bs_cache = bs->cache; 3455 bs_consumedBits = bs->consumedBits + riceParamPartLoBitCount; 3456 } 3457 3458 /* We should now have enough information to construct the rice parameter part. */ 3459 riceParamPartLo = (drflac_uint32)(bs_cache >> (DRFLAC_CACHE_L1_SELECTION_SHIFT(bs, riceParamPartLoBitCount))); 3460 pRiceParamPartOut[0] = riceParamPartHi | riceParamPartLo; 3461 3462 bs_cache <<= riceParamPartLoBitCount; 3463 } 3464 } else { 3465 /* 3466 Getting here means there are no bits set on the cache line. This is a less optimal case because we just wasted a call 3467 to drflac__clz() and we need to reload the cache. 3468 */ 3469 drflac_uint32 zeroCounter = (drflac_uint32)(DRFLAC_CACHE_L1_SIZE_BITS(bs) - bs_consumedBits); 3470 for (;;) { 3471 if (bs->nextL2Line < DRFLAC_CACHE_L2_LINE_COUNT(bs)) { 3472 #ifndef DR_FLAC_NO_CRC 3473 drflac__update_crc16(bs); 3474 #endif 3475 bs_cache = drflac__be2host__cache_line(bs->cacheL2[bs->nextL2Line++]); 3476 bs_consumedBits = 0; 3477 #ifndef DR_FLAC_NO_CRC 3478 bs->crc16Cache = bs_cache; 3479 #endif 3480 } else { 3481 /* Slow path. We need to fetch more data from the client. */ 3482 if (!drflac__reload_cache(bs)) { 3483 return DRFLAC_FALSE; 3484 } 3485 3486 bs_cache = bs->cache; 3487 bs_consumedBits = bs->consumedBits; 3488 } 3489 3490 lzcount = drflac__clz(bs_cache); 3491 zeroCounter += lzcount; 3492 3493 if (lzcount < sizeof(bs_cache)*8) { 3494 break; 3495 } 3496 } 3497 3498 pZeroCounterOut[0] = zeroCounter; 3499 goto extract_rice_param_part; 3500 } 3501 3502 /* Make sure the cache is restored at the end of it all. */ 3503 bs->cache = bs_cache; 3504 bs->consumedBits = bs_consumedBits; 3505 3506 return DRFLAC_TRUE; 3507 } 3508 3509 static DRFLAC_INLINE drflac_bool32 drflac__seek_rice_parts(drflac_bs* bs, drflac_uint8 riceParam) 3510 { 3511 drflac_uint32 riceParamPlus1 = riceParam + 1; 3512 drflac_uint32 riceParamPlus1MaxConsumedBits = DRFLAC_CACHE_L1_SIZE_BITS(bs) - riceParamPlus1; 3513 3514 /* 3515 The idea here is to use local variables for the cache in an attempt to encourage the compiler to store them in registers. I have 3516 no idea how this will work in practice... 3517 */ 3518 drflac_cache_t bs_cache = bs->cache; 3519 drflac_uint32 bs_consumedBits = bs->consumedBits; 3520 3521 /* The first thing to do is find the first unset bit. Most likely a bit will be set in the current cache line. */ 3522 drflac_uint32 lzcount = drflac__clz(bs_cache); 3523 if (lzcount < sizeof(bs_cache)*8) { 3524 /* 3525 It is most likely that the riceParam part (which comes after the zero counter) is also on this cache line. When extracting 3526 this, we include the set bit from the unary coded part because it simplifies cache management. This bit will be handled 3527 outside of this function at a higher level. 3528 */ 3529 extract_rice_param_part: 3530 bs_cache <<= lzcount; 3531 bs_consumedBits += lzcount; 3532 3533 if (bs_consumedBits <= riceParamPlus1MaxConsumedBits) { 3534 /* Getting here means the rice parameter part is wholly contained within the current cache line. */ 3535 bs_cache <<= riceParamPlus1; 3536 bs_consumedBits += riceParamPlus1; 3537 } else { 3538 /* 3539 Getting here means the rice parameter part straddles the cache line. We need to read from the tail of the current cache 3540 line, reload the cache, and then combine it with the head of the next cache line. 3541 */ 3542 3543 /* Before reloading the cache we need to grab the size in bits of the low part. */ 3544 drflac_uint32 riceParamPartLoBitCount = bs_consumedBits - riceParamPlus1MaxConsumedBits; 3545 DRFLAC_ASSERT(riceParamPartLoBitCount > 0 && riceParamPartLoBitCount < 32); 3546 3547 /* Now reload the cache. */ 3548 if (bs->nextL2Line < DRFLAC_CACHE_L2_LINE_COUNT(bs)) { 3549 #ifndef DR_FLAC_NO_CRC 3550 drflac__update_crc16(bs); 3551 #endif 3552 bs_cache = drflac__be2host__cache_line(bs->cacheL2[bs->nextL2Line++]); 3553 bs_consumedBits = riceParamPartLoBitCount; 3554 #ifndef DR_FLAC_NO_CRC 3555 bs->crc16Cache = bs_cache; 3556 #endif 3557 } else { 3558 /* Slow path. We need to fetch more data from the client. */ 3559 if (!drflac__reload_cache(bs)) { 3560 return DRFLAC_FALSE; 3561 } 3562 3563 bs_cache = bs->cache; 3564 bs_consumedBits = bs->consumedBits + riceParamPartLoBitCount; 3565 } 3566 3567 bs_cache <<= riceParamPartLoBitCount; 3568 } 3569 } else { 3570 /* 3571 Getting here means there are no bits set on the cache line. This is a less optimal case because we just wasted a call 3572 to drflac__clz() and we need to reload the cache. 3573 */ 3574 for (;;) { 3575 if (bs->nextL2Line < DRFLAC_CACHE_L2_LINE_COUNT(bs)) { 3576 #ifndef DR_FLAC_NO_CRC 3577 drflac__update_crc16(bs); 3578 #endif 3579 bs_cache = drflac__be2host__cache_line(bs->cacheL2[bs->nextL2Line++]); 3580 bs_consumedBits = 0; 3581 #ifndef DR_FLAC_NO_CRC 3582 bs->crc16Cache = bs_cache; 3583 #endif 3584 } else { 3585 /* Slow path. We need to fetch more data from the client. */ 3586 if (!drflac__reload_cache(bs)) { 3587 return DRFLAC_FALSE; 3588 } 3589 3590 bs_cache = bs->cache; 3591 bs_consumedBits = bs->consumedBits; 3592 } 3593 3594 lzcount = drflac__clz(bs_cache); 3595 if (lzcount < sizeof(bs_cache)*8) { 3596 break; 3597 } 3598 } 3599 3600 goto extract_rice_param_part; 3601 } 3602 3603 /* Make sure the cache is restored at the end of it all. */ 3604 bs->cache = bs_cache; 3605 bs->consumedBits = bs_consumedBits; 3606 3607 return DRFLAC_TRUE; 3608 } 3609 3610 3611 static drflac_bool32 drflac__decode_samples_with_residual__rice__scalar_zeroorder(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut) 3612 { 3613 drflac_uint32 t[2] = {0x00000000, 0xFFFFFFFF}; 3614 drflac_uint32 zeroCountPart0; 3615 drflac_uint32 riceParamPart0; 3616 drflac_uint32 riceParamMask; 3617 drflac_uint32 i; 3618 3619 DRFLAC_ASSERT(bs != NULL); 3620 DRFLAC_ASSERT(pSamplesOut != NULL); 3621 3622 (void)bitsPerSample; 3623 (void)order; 3624 (void)shift; 3625 (void)coefficients; 3626 3627 riceParamMask = (drflac_uint32)~((~0UL) << riceParam); 3628 3629 i = 0; 3630 while (i < count) { 3631 /* Rice extraction. */ 3632 if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart0, &riceParamPart0)) { 3633 return DRFLAC_FALSE; 3634 } 3635 3636 /* Rice reconstruction. */ 3637 riceParamPart0 &= riceParamMask; 3638 riceParamPart0 |= (zeroCountPart0 << riceParam); 3639 riceParamPart0 = (riceParamPart0 >> 1) ^ t[riceParamPart0 & 0x01]; 3640 3641 pSamplesOut[i] = riceParamPart0; 3642 3643 i += 1; 3644 } 3645 3646 return DRFLAC_TRUE; 3647 } 3648 3649 static drflac_bool32 drflac__decode_samples_with_residual__rice__scalar(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut) 3650 { 3651 drflac_uint32 t[2] = {0x00000000, 0xFFFFFFFF}; 3652 drflac_uint32 zeroCountPart0 = 0; 3653 drflac_uint32 zeroCountPart1 = 0; 3654 drflac_uint32 zeroCountPart2 = 0; 3655 drflac_uint32 zeroCountPart3 = 0; 3656 drflac_uint32 riceParamPart0 = 0; 3657 drflac_uint32 riceParamPart1 = 0; 3658 drflac_uint32 riceParamPart2 = 0; 3659 drflac_uint32 riceParamPart3 = 0; 3660 drflac_uint32 riceParamMask; 3661 const drflac_int32* pSamplesOutEnd; 3662 drflac_uint32 i; 3663 3664 DRFLAC_ASSERT(bs != NULL); 3665 DRFLAC_ASSERT(pSamplesOut != NULL); 3666 3667 if (order == 0) { 3668 return drflac__decode_samples_with_residual__rice__scalar_zeroorder(bs, bitsPerSample, count, riceParam, order, shift, coefficients, pSamplesOut); 3669 } 3670 3671 riceParamMask = (drflac_uint32)~((~0UL) << riceParam); 3672 pSamplesOutEnd = pSamplesOut + (count & ~3); 3673 3674 if (bitsPerSample+shift > 32) { 3675 while (pSamplesOut < pSamplesOutEnd) { 3676 /* 3677 Rice extraction. It's faster to do this one at a time against local variables than it is to use the x4 version 3678 against an array. Not sure why, but perhaps it's making more efficient use of registers? 3679 */ 3680 if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart0, &riceParamPart0) || 3681 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart1, &riceParamPart1) || 3682 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart2, &riceParamPart2) || 3683 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart3, &riceParamPart3)) { 3684 return DRFLAC_FALSE; 3685 } 3686 3687 riceParamPart0 &= riceParamMask; 3688 riceParamPart1 &= riceParamMask; 3689 riceParamPart2 &= riceParamMask; 3690 riceParamPart3 &= riceParamMask; 3691 3692 riceParamPart0 |= (zeroCountPart0 << riceParam); 3693 riceParamPart1 |= (zeroCountPart1 << riceParam); 3694 riceParamPart2 |= (zeroCountPart2 << riceParam); 3695 riceParamPart3 |= (zeroCountPart3 << riceParam); 3696 3697 riceParamPart0 = (riceParamPart0 >> 1) ^ t[riceParamPart0 & 0x01]; 3698 riceParamPart1 = (riceParamPart1 >> 1) ^ t[riceParamPart1 & 0x01]; 3699 riceParamPart2 = (riceParamPart2 >> 1) ^ t[riceParamPart2 & 0x01]; 3700 riceParamPart3 = (riceParamPart3 >> 1) ^ t[riceParamPart3 & 0x01]; 3701 3702 pSamplesOut[0] = riceParamPart0 + drflac__calculate_prediction_64(order, shift, coefficients, pSamplesOut + 0); 3703 pSamplesOut[1] = riceParamPart1 + drflac__calculate_prediction_64(order, shift, coefficients, pSamplesOut + 1); 3704 pSamplesOut[2] = riceParamPart2 + drflac__calculate_prediction_64(order, shift, coefficients, pSamplesOut + 2); 3705 pSamplesOut[3] = riceParamPart3 + drflac__calculate_prediction_64(order, shift, coefficients, pSamplesOut + 3); 3706 3707 pSamplesOut += 4; 3708 } 3709 } else { 3710 while (pSamplesOut < pSamplesOutEnd) { 3711 if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart0, &riceParamPart0) || 3712 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart1, &riceParamPart1) || 3713 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart2, &riceParamPart2) || 3714 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart3, &riceParamPart3)) { 3715 return DRFLAC_FALSE; 3716 } 3717 3718 riceParamPart0 &= riceParamMask; 3719 riceParamPart1 &= riceParamMask; 3720 riceParamPart2 &= riceParamMask; 3721 riceParamPart3 &= riceParamMask; 3722 3723 riceParamPart0 |= (zeroCountPart0 << riceParam); 3724 riceParamPart1 |= (zeroCountPart1 << riceParam); 3725 riceParamPart2 |= (zeroCountPart2 << riceParam); 3726 riceParamPart3 |= (zeroCountPart3 << riceParam); 3727 3728 riceParamPart0 = (riceParamPart0 >> 1) ^ t[riceParamPart0 & 0x01]; 3729 riceParamPart1 = (riceParamPart1 >> 1) ^ t[riceParamPart1 & 0x01]; 3730 riceParamPart2 = (riceParamPart2 >> 1) ^ t[riceParamPart2 & 0x01]; 3731 riceParamPart3 = (riceParamPart3 >> 1) ^ t[riceParamPart3 & 0x01]; 3732 3733 pSamplesOut[0] = riceParamPart0 + drflac__calculate_prediction_32(order, shift, coefficients, pSamplesOut + 0); 3734 pSamplesOut[1] = riceParamPart1 + drflac__calculate_prediction_32(order, shift, coefficients, pSamplesOut + 1); 3735 pSamplesOut[2] = riceParamPart2 + drflac__calculate_prediction_32(order, shift, coefficients, pSamplesOut + 2); 3736 pSamplesOut[3] = riceParamPart3 + drflac__calculate_prediction_32(order, shift, coefficients, pSamplesOut + 3); 3737 3738 pSamplesOut += 4; 3739 } 3740 } 3741 3742 i = (count & ~3); 3743 while (i < count) { 3744 /* Rice extraction. */ 3745 if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart0, &riceParamPart0)) { 3746 return DRFLAC_FALSE; 3747 } 3748 3749 /* Rice reconstruction. */ 3750 riceParamPart0 &= riceParamMask; 3751 riceParamPart0 |= (zeroCountPart0 << riceParam); 3752 riceParamPart0 = (riceParamPart0 >> 1) ^ t[riceParamPart0 & 0x01]; 3753 /*riceParamPart0 = (riceParamPart0 >> 1) ^ (~(riceParamPart0 & 0x01) + 1);*/ 3754 3755 /* Sample reconstruction. */ 3756 if (bitsPerSample+shift > 32) { 3757 pSamplesOut[0] = riceParamPart0 + drflac__calculate_prediction_64(order, shift, coefficients, pSamplesOut + 0); 3758 } else { 3759 pSamplesOut[0] = riceParamPart0 + drflac__calculate_prediction_32(order, shift, coefficients, pSamplesOut + 0); 3760 } 3761 3762 i += 1; 3763 pSamplesOut += 1; 3764 } 3765 3766 return DRFLAC_TRUE; 3767 } 3768 3769 #if defined(DRFLAC_SUPPORT_SSE2) 3770 static DRFLAC_INLINE __m128i drflac__mm_packs_interleaved_epi32(__m128i a, __m128i b) 3771 { 3772 __m128i r; 3773 3774 /* Pack. */ 3775 r = _mm_packs_epi32(a, b); 3776 3777 /* a3a2 a1a0 b3b2 b1b0 -> a3a2 b3b2 a1a0 b1b0 */ 3778 r = _mm_shuffle_epi32(r, _MM_SHUFFLE(3, 1, 2, 0)); 3779 3780 /* a3a2 b3b2 a1a0 b1b0 -> a3b3 a2b2 a1b1 a0b0 */ 3781 r = _mm_shufflehi_epi16(r, _MM_SHUFFLE(3, 1, 2, 0)); 3782 r = _mm_shufflelo_epi16(r, _MM_SHUFFLE(3, 1, 2, 0)); 3783 3784 return r; 3785 } 3786 #endif 3787 3788 #if defined(DRFLAC_SUPPORT_SSE41) 3789 static DRFLAC_INLINE __m128i drflac__mm_not_si128(__m128i a) 3790 { 3791 return _mm_xor_si128(a, _mm_cmpeq_epi32(_mm_setzero_si128(), _mm_setzero_si128())); 3792 } 3793 3794 static DRFLAC_INLINE __m128i drflac__mm_hadd_epi32(__m128i x) 3795 { 3796 __m128i x64 = _mm_add_epi32(x, _mm_shuffle_epi32(x, _MM_SHUFFLE(1, 0, 3, 2))); 3797 __m128i x32 = _mm_shufflelo_epi16(x64, _MM_SHUFFLE(1, 0, 3, 2)); 3798 return _mm_add_epi32(x64, x32); 3799 } 3800 3801 static DRFLAC_INLINE __m128i drflac__mm_hadd_epi64(__m128i x) 3802 { 3803 return _mm_add_epi64(x, _mm_shuffle_epi32(x, _MM_SHUFFLE(1, 0, 3, 2))); 3804 } 3805 3806 static DRFLAC_INLINE __m128i drflac__mm_srai_epi64(__m128i x, int count) 3807 { 3808 /* 3809 To simplify this we are assuming count < 32. This restriction allows us to work on a low side and a high side. The low side 3810 is shifted with zero bits, whereas the right side is shifted with sign bits. 3811 */ 3812 __m128i lo = _mm_srli_epi64(x, count); 3813 __m128i hi = _mm_srai_epi32(x, count); 3814 3815 hi = _mm_and_si128(hi, _mm_set_epi32(0xFFFFFFFF, 0, 0xFFFFFFFF, 0)); /* The high part needs to have the low part cleared. */ 3816 3817 return _mm_or_si128(lo, hi); 3818 } 3819 3820 static drflac_bool32 drflac__decode_samples_with_residual__rice__sse41_32(drflac_bs* bs, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut) 3821 { 3822 int i; 3823 drflac_uint32 riceParamMask; 3824 drflac_int32* pDecodedSamples = pSamplesOut; 3825 drflac_int32* pDecodedSamplesEnd = pSamplesOut + (count & ~3); 3826 drflac_uint32 zeroCountParts0 = 0; 3827 drflac_uint32 zeroCountParts1 = 0; 3828 drflac_uint32 zeroCountParts2 = 0; 3829 drflac_uint32 zeroCountParts3 = 0; 3830 drflac_uint32 riceParamParts0 = 0; 3831 drflac_uint32 riceParamParts1 = 0; 3832 drflac_uint32 riceParamParts2 = 0; 3833 drflac_uint32 riceParamParts3 = 0; 3834 __m128i coefficients128_0; 3835 __m128i coefficients128_4; 3836 __m128i coefficients128_8; 3837 __m128i samples128_0; 3838 __m128i samples128_4; 3839 __m128i samples128_8; 3840 __m128i riceParamMask128; 3841 3842 const drflac_uint32 t[2] = {0x00000000, 0xFFFFFFFF}; 3843 3844 riceParamMask = (drflac_uint32)~((~0UL) << riceParam); 3845 riceParamMask128 = _mm_set1_epi32(riceParamMask); 3846 3847 /* Pre-load. */ 3848 coefficients128_0 = _mm_setzero_si128(); 3849 coefficients128_4 = _mm_setzero_si128(); 3850 coefficients128_8 = _mm_setzero_si128(); 3851 3852 samples128_0 = _mm_setzero_si128(); 3853 samples128_4 = _mm_setzero_si128(); 3854 samples128_8 = _mm_setzero_si128(); 3855 3856 /* 3857 Pre-loading the coefficients and prior samples is annoying because we need to ensure we don't try reading more than 3858 what's available in the input buffers. It would be convenient to use a fall-through switch to do this, but this results 3859 in strict aliasing warnings with GCC. To work around this I'm just doing something hacky. This feels a bit convoluted 3860 so I think there's opportunity for this to be simplified. 3861 */ 3862 #if 1 3863 { 3864 int runningOrder = order; 3865 3866 /* 0 - 3. */ 3867 if (runningOrder >= 4) { 3868 coefficients128_0 = _mm_loadu_si128((const __m128i*)(coefficients + 0)); 3869 samples128_0 = _mm_loadu_si128((const __m128i*)(pSamplesOut - 4)); 3870 runningOrder -= 4; 3871 } else { 3872 switch (runningOrder) { 3873 case 3: coefficients128_0 = _mm_set_epi32(0, coefficients[2], coefficients[1], coefficients[0]); samples128_0 = _mm_set_epi32(pSamplesOut[-1], pSamplesOut[-2], pSamplesOut[-3], 0); break; 3874 case 2: coefficients128_0 = _mm_set_epi32(0, 0, coefficients[1], coefficients[0]); samples128_0 = _mm_set_epi32(pSamplesOut[-1], pSamplesOut[-2], 0, 0); break; 3875 case 1: coefficients128_0 = _mm_set_epi32(0, 0, 0, coefficients[0]); samples128_0 = _mm_set_epi32(pSamplesOut[-1], 0, 0, 0); break; 3876 } 3877 runningOrder = 0; 3878 } 3879 3880 /* 4 - 7 */ 3881 if (runningOrder >= 4) { 3882 coefficients128_4 = _mm_loadu_si128((const __m128i*)(coefficients + 4)); 3883 samples128_4 = _mm_loadu_si128((const __m128i*)(pSamplesOut - 8)); 3884 runningOrder -= 4; 3885 } else { 3886 switch (runningOrder) { 3887 case 3: coefficients128_4 = _mm_set_epi32(0, coefficients[6], coefficients[5], coefficients[4]); samples128_4 = _mm_set_epi32(pSamplesOut[-5], pSamplesOut[-6], pSamplesOut[-7], 0); break; 3888 case 2: coefficients128_4 = _mm_set_epi32(0, 0, coefficients[5], coefficients[4]); samples128_4 = _mm_set_epi32(pSamplesOut[-5], pSamplesOut[-6], 0, 0); break; 3889 case 1: coefficients128_4 = _mm_set_epi32(0, 0, 0, coefficients[4]); samples128_4 = _mm_set_epi32(pSamplesOut[-5], 0, 0, 0); break; 3890 } 3891 runningOrder = 0; 3892 } 3893 3894 /* 8 - 11 */ 3895 if (runningOrder == 4) { 3896 coefficients128_8 = _mm_loadu_si128((const __m128i*)(coefficients + 8)); 3897 samples128_8 = _mm_loadu_si128((const __m128i*)(pSamplesOut - 12)); 3898 runningOrder -= 4; 3899 } else { 3900 switch (runningOrder) { 3901 case 3: coefficients128_8 = _mm_set_epi32(0, coefficients[10], coefficients[9], coefficients[8]); samples128_8 = _mm_set_epi32(pSamplesOut[-9], pSamplesOut[-10], pSamplesOut[-11], 0); break; 3902 case 2: coefficients128_8 = _mm_set_epi32(0, 0, coefficients[9], coefficients[8]); samples128_8 = _mm_set_epi32(pSamplesOut[-9], pSamplesOut[-10], 0, 0); break; 3903 case 1: coefficients128_8 = _mm_set_epi32(0, 0, 0, coefficients[8]); samples128_8 = _mm_set_epi32(pSamplesOut[-9], 0, 0, 0); break; 3904 } 3905 runningOrder = 0; 3906 } 3907 3908 /* Coefficients need to be shuffled for our streaming algorithm below to work. Samples are already in the correct order from the loading routine above. */ 3909 coefficients128_0 = _mm_shuffle_epi32(coefficients128_0, _MM_SHUFFLE(0, 1, 2, 3)); 3910 coefficients128_4 = _mm_shuffle_epi32(coefficients128_4, _MM_SHUFFLE(0, 1, 2, 3)); 3911 coefficients128_8 = _mm_shuffle_epi32(coefficients128_8, _MM_SHUFFLE(0, 1, 2, 3)); 3912 } 3913 #else 3914 /* This causes strict-aliasing warnings with GCC. */ 3915 switch (order) 3916 { 3917 case 12: ((drflac_int32*)&coefficients128_8)[0] = coefficients[11]; ((drflac_int32*)&samples128_8)[0] = pDecodedSamples[-12]; 3918 case 11: ((drflac_int32*)&coefficients128_8)[1] = coefficients[10]; ((drflac_int32*)&samples128_8)[1] = pDecodedSamples[-11]; 3919 case 10: ((drflac_int32*)&coefficients128_8)[2] = coefficients[ 9]; ((drflac_int32*)&samples128_8)[2] = pDecodedSamples[-10]; 3920 case 9: ((drflac_int32*)&coefficients128_8)[3] = coefficients[ 8]; ((drflac_int32*)&samples128_8)[3] = pDecodedSamples[- 9]; 3921 case 8: ((drflac_int32*)&coefficients128_4)[0] = coefficients[ 7]; ((drflac_int32*)&samples128_4)[0] = pDecodedSamples[- 8]; 3922 case 7: ((drflac_int32*)&coefficients128_4)[1] = coefficients[ 6]; ((drflac_int32*)&samples128_4)[1] = pDecodedSamples[- 7]; 3923 case 6: ((drflac_int32*)&coefficients128_4)[2] = coefficients[ 5]; ((drflac_int32*)&samples128_4)[2] = pDecodedSamples[- 6]; 3924 case 5: ((drflac_int32*)&coefficients128_4)[3] = coefficients[ 4]; ((drflac_int32*)&samples128_4)[3] = pDecodedSamples[- 5]; 3925 case 4: ((drflac_int32*)&coefficients128_0)[0] = coefficients[ 3]; ((drflac_int32*)&samples128_0)[0] = pDecodedSamples[- 4]; 3926 case 3: ((drflac_int32*)&coefficients128_0)[1] = coefficients[ 2]; ((drflac_int32*)&samples128_0)[1] = pDecodedSamples[- 3]; 3927 case 2: ((drflac_int32*)&coefficients128_0)[2] = coefficients[ 1]; ((drflac_int32*)&samples128_0)[2] = pDecodedSamples[- 2]; 3928 case 1: ((drflac_int32*)&coefficients128_0)[3] = coefficients[ 0]; ((drflac_int32*)&samples128_0)[3] = pDecodedSamples[- 1]; 3929 } 3930 #endif 3931 3932 /* For this version we are doing one sample at a time. */ 3933 while (pDecodedSamples < pDecodedSamplesEnd) { 3934 __m128i prediction128; 3935 __m128i zeroCountPart128; 3936 __m128i riceParamPart128; 3937 3938 if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts0, &riceParamParts0) || 3939 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts1, &riceParamParts1) || 3940 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts2, &riceParamParts2) || 3941 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts3, &riceParamParts3)) { 3942 return DRFLAC_FALSE; 3943 } 3944 3945 zeroCountPart128 = _mm_set_epi32(zeroCountParts3, zeroCountParts2, zeroCountParts1, zeroCountParts0); 3946 riceParamPart128 = _mm_set_epi32(riceParamParts3, riceParamParts2, riceParamParts1, riceParamParts0); 3947 3948 riceParamPart128 = _mm_and_si128(riceParamPart128, riceParamMask128); 3949 riceParamPart128 = _mm_or_si128(riceParamPart128, _mm_slli_epi32(zeroCountPart128, riceParam)); 3950 riceParamPart128 = _mm_xor_si128(_mm_srli_epi32(riceParamPart128, 1), _mm_add_epi32(drflac__mm_not_si128(_mm_and_si128(riceParamPart128, _mm_set1_epi32(0x01))), _mm_set1_epi32(0x01))); /* <-- SSE2 compatible */ 3951 /*riceParamPart128 = _mm_xor_si128(_mm_srli_epi32(riceParamPart128, 1), _mm_mullo_epi32(_mm_and_si128(riceParamPart128, _mm_set1_epi32(0x01)), _mm_set1_epi32(0xFFFFFFFF)));*/ /* <-- Only supported from SSE4.1 and is slower in my testing... */ 3952 3953 if (order <= 4) { 3954 for (i = 0; i < 4; i += 1) { 3955 prediction128 = _mm_mullo_epi32(coefficients128_0, samples128_0); 3956 3957 /* Horizontal add and shift. */ 3958 prediction128 = drflac__mm_hadd_epi32(prediction128); 3959 prediction128 = _mm_srai_epi32(prediction128, shift); 3960 prediction128 = _mm_add_epi32(riceParamPart128, prediction128); 3961 3962 samples128_0 = _mm_alignr_epi8(prediction128, samples128_0, 4); 3963 riceParamPart128 = _mm_alignr_epi8(_mm_setzero_si128(), riceParamPart128, 4); 3964 } 3965 } else if (order <= 8) { 3966 for (i = 0; i < 4; i += 1) { 3967 prediction128 = _mm_mullo_epi32(coefficients128_4, samples128_4); 3968 prediction128 = _mm_add_epi32(prediction128, _mm_mullo_epi32(coefficients128_0, samples128_0)); 3969 3970 /* Horizontal add and shift. */ 3971 prediction128 = drflac__mm_hadd_epi32(prediction128); 3972 prediction128 = _mm_srai_epi32(prediction128, shift); 3973 prediction128 = _mm_add_epi32(riceParamPart128, prediction128); 3974 3975 samples128_4 = _mm_alignr_epi8(samples128_0, samples128_4, 4); 3976 samples128_0 = _mm_alignr_epi8(prediction128, samples128_0, 4); 3977 riceParamPart128 = _mm_alignr_epi8(_mm_setzero_si128(), riceParamPart128, 4); 3978 } 3979 } else { 3980 for (i = 0; i < 4; i += 1) { 3981 prediction128 = _mm_mullo_epi32(coefficients128_8, samples128_8); 3982 prediction128 = _mm_add_epi32(prediction128, _mm_mullo_epi32(coefficients128_4, samples128_4)); 3983 prediction128 = _mm_add_epi32(prediction128, _mm_mullo_epi32(coefficients128_0, samples128_0)); 3984 3985 /* Horizontal add and shift. */ 3986 prediction128 = drflac__mm_hadd_epi32(prediction128); 3987 prediction128 = _mm_srai_epi32(prediction128, shift); 3988 prediction128 = _mm_add_epi32(riceParamPart128, prediction128); 3989 3990 samples128_8 = _mm_alignr_epi8(samples128_4, samples128_8, 4); 3991 samples128_4 = _mm_alignr_epi8(samples128_0, samples128_4, 4); 3992 samples128_0 = _mm_alignr_epi8(prediction128, samples128_0, 4); 3993 riceParamPart128 = _mm_alignr_epi8(_mm_setzero_si128(), riceParamPart128, 4); 3994 } 3995 } 3996 3997 /* We store samples in groups of 4. */ 3998 _mm_storeu_si128((__m128i*)pDecodedSamples, samples128_0); 3999 pDecodedSamples += 4; 4000 } 4001 4002 /* Make sure we process the last few samples. */ 4003 i = (count & ~3); 4004 while (i < (int)count) { 4005 /* Rice extraction. */ 4006 if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts0, &riceParamParts0)) { 4007 return DRFLAC_FALSE; 4008 } 4009 4010 /* Rice reconstruction. */ 4011 riceParamParts0 &= riceParamMask; 4012 riceParamParts0 |= (zeroCountParts0 << riceParam); 4013 riceParamParts0 = (riceParamParts0 >> 1) ^ t[riceParamParts0 & 0x01]; 4014 4015 /* Sample reconstruction. */ 4016 pDecodedSamples[0] = riceParamParts0 + drflac__calculate_prediction_32(order, shift, coefficients, pDecodedSamples); 4017 4018 i += 1; 4019 pDecodedSamples += 1; 4020 } 4021 4022 return DRFLAC_TRUE; 4023 } 4024 4025 static drflac_bool32 drflac__decode_samples_with_residual__rice__sse41_64(drflac_bs* bs, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut) 4026 { 4027 int i; 4028 drflac_uint32 riceParamMask; 4029 drflac_int32* pDecodedSamples = pSamplesOut; 4030 drflac_int32* pDecodedSamplesEnd = pSamplesOut + (count & ~3); 4031 drflac_uint32 zeroCountParts0 = 0; 4032 drflac_uint32 zeroCountParts1 = 0; 4033 drflac_uint32 zeroCountParts2 = 0; 4034 drflac_uint32 zeroCountParts3 = 0; 4035 drflac_uint32 riceParamParts0 = 0; 4036 drflac_uint32 riceParamParts1 = 0; 4037 drflac_uint32 riceParamParts2 = 0; 4038 drflac_uint32 riceParamParts3 = 0; 4039 __m128i coefficients128_0; 4040 __m128i coefficients128_4; 4041 __m128i coefficients128_8; 4042 __m128i samples128_0; 4043 __m128i samples128_4; 4044 __m128i samples128_8; 4045 __m128i prediction128; 4046 __m128i riceParamMask128; 4047 4048 const drflac_uint32 t[2] = {0x00000000, 0xFFFFFFFF}; 4049 4050 DRFLAC_ASSERT(order <= 12); 4051 4052 riceParamMask = (drflac_uint32)~((~0UL) << riceParam); 4053 riceParamMask128 = _mm_set1_epi32(riceParamMask); 4054 4055 prediction128 = _mm_setzero_si128(); 4056 4057 /* Pre-load. */ 4058 coefficients128_0 = _mm_setzero_si128(); 4059 coefficients128_4 = _mm_setzero_si128(); 4060 coefficients128_8 = _mm_setzero_si128(); 4061 4062 samples128_0 = _mm_setzero_si128(); 4063 samples128_4 = _mm_setzero_si128(); 4064 samples128_8 = _mm_setzero_si128(); 4065 4066 #if 1 4067 { 4068 int runningOrder = order; 4069 4070 /* 0 - 3. */ 4071 if (runningOrder >= 4) { 4072 coefficients128_0 = _mm_loadu_si128((const __m128i*)(coefficients + 0)); 4073 samples128_0 = _mm_loadu_si128((const __m128i*)(pSamplesOut - 4)); 4074 runningOrder -= 4; 4075 } else { 4076 switch (runningOrder) { 4077 case 3: coefficients128_0 = _mm_set_epi32(0, coefficients[2], coefficients[1], coefficients[0]); samples128_0 = _mm_set_epi32(pSamplesOut[-1], pSamplesOut[-2], pSamplesOut[-3], 0); break; 4078 case 2: coefficients128_0 = _mm_set_epi32(0, 0, coefficients[1], coefficients[0]); samples128_0 = _mm_set_epi32(pSamplesOut[-1], pSamplesOut[-2], 0, 0); break; 4079 case 1: coefficients128_0 = _mm_set_epi32(0, 0, 0, coefficients[0]); samples128_0 = _mm_set_epi32(pSamplesOut[-1], 0, 0, 0); break; 4080 } 4081 runningOrder = 0; 4082 } 4083 4084 /* 4 - 7 */ 4085 if (runningOrder >= 4) { 4086 coefficients128_4 = _mm_loadu_si128((const __m128i*)(coefficients + 4)); 4087 samples128_4 = _mm_loadu_si128((const __m128i*)(pSamplesOut - 8)); 4088 runningOrder -= 4; 4089 } else { 4090 switch (runningOrder) { 4091 case 3: coefficients128_4 = _mm_set_epi32(0, coefficients[6], coefficients[5], coefficients[4]); samples128_4 = _mm_set_epi32(pSamplesOut[-5], pSamplesOut[-6], pSamplesOut[-7], 0); break; 4092 case 2: coefficients128_4 = _mm_set_epi32(0, 0, coefficients[5], coefficients[4]); samples128_4 = _mm_set_epi32(pSamplesOut[-5], pSamplesOut[-6], 0, 0); break; 4093 case 1: coefficients128_4 = _mm_set_epi32(0, 0, 0, coefficients[4]); samples128_4 = _mm_set_epi32(pSamplesOut[-5], 0, 0, 0); break; 4094 } 4095 runningOrder = 0; 4096 } 4097 4098 /* 8 - 11 */ 4099 if (runningOrder == 4) { 4100 coefficients128_8 = _mm_loadu_si128((const __m128i*)(coefficients + 8)); 4101 samples128_8 = _mm_loadu_si128((const __m128i*)(pSamplesOut - 12)); 4102 runningOrder -= 4; 4103 } else { 4104 switch (runningOrder) { 4105 case 3: coefficients128_8 = _mm_set_epi32(0, coefficients[10], coefficients[9], coefficients[8]); samples128_8 = _mm_set_epi32(pSamplesOut[-9], pSamplesOut[-10], pSamplesOut[-11], 0); break; 4106 case 2: coefficients128_8 = _mm_set_epi32(0, 0, coefficients[9], coefficients[8]); samples128_8 = _mm_set_epi32(pSamplesOut[-9], pSamplesOut[-10], 0, 0); break; 4107 case 1: coefficients128_8 = _mm_set_epi32(0, 0, 0, coefficients[8]); samples128_8 = _mm_set_epi32(pSamplesOut[-9], 0, 0, 0); break; 4108 } 4109 runningOrder = 0; 4110 } 4111 4112 /* Coefficients need to be shuffled for our streaming algorithm below to work. Samples are already in the correct order from the loading routine above. */ 4113 coefficients128_0 = _mm_shuffle_epi32(coefficients128_0, _MM_SHUFFLE(0, 1, 2, 3)); 4114 coefficients128_4 = _mm_shuffle_epi32(coefficients128_4, _MM_SHUFFLE(0, 1, 2, 3)); 4115 coefficients128_8 = _mm_shuffle_epi32(coefficients128_8, _MM_SHUFFLE(0, 1, 2, 3)); 4116 } 4117 #else 4118 switch (order) 4119 { 4120 case 12: ((drflac_int32*)&coefficients128_8)[0] = coefficients[11]; ((drflac_int32*)&samples128_8)[0] = pDecodedSamples[-12]; 4121 case 11: ((drflac_int32*)&coefficients128_8)[1] = coefficients[10]; ((drflac_int32*)&samples128_8)[1] = pDecodedSamples[-11]; 4122 case 10: ((drflac_int32*)&coefficients128_8)[2] = coefficients[ 9]; ((drflac_int32*)&samples128_8)[2] = pDecodedSamples[-10]; 4123 case 9: ((drflac_int32*)&coefficients128_8)[3] = coefficients[ 8]; ((drflac_int32*)&samples128_8)[3] = pDecodedSamples[- 9]; 4124 case 8: ((drflac_int32*)&coefficients128_4)[0] = coefficients[ 7]; ((drflac_int32*)&samples128_4)[0] = pDecodedSamples[- 8]; 4125 case 7: ((drflac_int32*)&coefficients128_4)[1] = coefficients[ 6]; ((drflac_int32*)&samples128_4)[1] = pDecodedSamples[- 7]; 4126 case 6: ((drflac_int32*)&coefficients128_4)[2] = coefficients[ 5]; ((drflac_int32*)&samples128_4)[2] = pDecodedSamples[- 6]; 4127 case 5: ((drflac_int32*)&coefficients128_4)[3] = coefficients[ 4]; ((drflac_int32*)&samples128_4)[3] = pDecodedSamples[- 5]; 4128 case 4: ((drflac_int32*)&coefficients128_0)[0] = coefficients[ 3]; ((drflac_int32*)&samples128_0)[0] = pDecodedSamples[- 4]; 4129 case 3: ((drflac_int32*)&coefficients128_0)[1] = coefficients[ 2]; ((drflac_int32*)&samples128_0)[1] = pDecodedSamples[- 3]; 4130 case 2: ((drflac_int32*)&coefficients128_0)[2] = coefficients[ 1]; ((drflac_int32*)&samples128_0)[2] = pDecodedSamples[- 2]; 4131 case 1: ((drflac_int32*)&coefficients128_0)[3] = coefficients[ 0]; ((drflac_int32*)&samples128_0)[3] = pDecodedSamples[- 1]; 4132 } 4133 #endif 4134 4135 /* For this version we are doing one sample at a time. */ 4136 while (pDecodedSamples < pDecodedSamplesEnd) { 4137 __m128i zeroCountPart128; 4138 __m128i riceParamPart128; 4139 4140 if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts0, &riceParamParts0) || 4141 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts1, &riceParamParts1) || 4142 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts2, &riceParamParts2) || 4143 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts3, &riceParamParts3)) { 4144 return DRFLAC_FALSE; 4145 } 4146 4147 zeroCountPart128 = _mm_set_epi32(zeroCountParts3, zeroCountParts2, zeroCountParts1, zeroCountParts0); 4148 riceParamPart128 = _mm_set_epi32(riceParamParts3, riceParamParts2, riceParamParts1, riceParamParts0); 4149 4150 riceParamPart128 = _mm_and_si128(riceParamPart128, riceParamMask128); 4151 riceParamPart128 = _mm_or_si128(riceParamPart128, _mm_slli_epi32(zeroCountPart128, riceParam)); 4152 riceParamPart128 = _mm_xor_si128(_mm_srli_epi32(riceParamPart128, 1), _mm_add_epi32(drflac__mm_not_si128(_mm_and_si128(riceParamPart128, _mm_set1_epi32(1))), _mm_set1_epi32(1))); 4153 4154 for (i = 0; i < 4; i += 1) { 4155 prediction128 = _mm_xor_si128(prediction128, prediction128); /* Reset to 0. */ 4156 4157 switch (order) 4158 { 4159 case 12: 4160 case 11: prediction128 = _mm_add_epi64(prediction128, _mm_mul_epi32(_mm_shuffle_epi32(coefficients128_8, _MM_SHUFFLE(1, 1, 0, 0)), _mm_shuffle_epi32(samples128_8, _MM_SHUFFLE(1, 1, 0, 0)))); 4161 case 10: 4162 case 9: prediction128 = _mm_add_epi64(prediction128, _mm_mul_epi32(_mm_shuffle_epi32(coefficients128_8, _MM_SHUFFLE(3, 3, 2, 2)), _mm_shuffle_epi32(samples128_8, _MM_SHUFFLE(3, 3, 2, 2)))); 4163 case 8: 4164 case 7: prediction128 = _mm_add_epi64(prediction128, _mm_mul_epi32(_mm_shuffle_epi32(coefficients128_4, _MM_SHUFFLE(1, 1, 0, 0)), _mm_shuffle_epi32(samples128_4, _MM_SHUFFLE(1, 1, 0, 0)))); 4165 case 6: 4166 case 5: prediction128 = _mm_add_epi64(prediction128, _mm_mul_epi32(_mm_shuffle_epi32(coefficients128_4, _MM_SHUFFLE(3, 3, 2, 2)), _mm_shuffle_epi32(samples128_4, _MM_SHUFFLE(3, 3, 2, 2)))); 4167 case 4: 4168 case 3: prediction128 = _mm_add_epi64(prediction128, _mm_mul_epi32(_mm_shuffle_epi32(coefficients128_0, _MM_SHUFFLE(1, 1, 0, 0)), _mm_shuffle_epi32(samples128_0, _MM_SHUFFLE(1, 1, 0, 0)))); 4169 case 2: 4170 case 1: prediction128 = _mm_add_epi64(prediction128, _mm_mul_epi32(_mm_shuffle_epi32(coefficients128_0, _MM_SHUFFLE(3, 3, 2, 2)), _mm_shuffle_epi32(samples128_0, _MM_SHUFFLE(3, 3, 2, 2)))); 4171 } 4172 4173 /* Horizontal add and shift. */ 4174 prediction128 = drflac__mm_hadd_epi64(prediction128); 4175 prediction128 = drflac__mm_srai_epi64(prediction128, shift); 4176 prediction128 = _mm_add_epi32(riceParamPart128, prediction128); 4177 4178 /* Our value should be sitting in prediction128[0]. We need to combine this with our SSE samples. */ 4179 samples128_8 = _mm_alignr_epi8(samples128_4, samples128_8, 4); 4180 samples128_4 = _mm_alignr_epi8(samples128_0, samples128_4, 4); 4181 samples128_0 = _mm_alignr_epi8(prediction128, samples128_0, 4); 4182 4183 /* Slide our rice parameter down so that the value in position 0 contains the next one to process. */ 4184 riceParamPart128 = _mm_alignr_epi8(_mm_setzero_si128(), riceParamPart128, 4); 4185 } 4186 4187 /* We store samples in groups of 4. */ 4188 _mm_storeu_si128((__m128i*)pDecodedSamples, samples128_0); 4189 pDecodedSamples += 4; 4190 } 4191 4192 /* Make sure we process the last few samples. */ 4193 i = (count & ~3); 4194 while (i < (int)count) { 4195 /* Rice extraction. */ 4196 if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts0, &riceParamParts0)) { 4197 return DRFLAC_FALSE; 4198 } 4199 4200 /* Rice reconstruction. */ 4201 riceParamParts0 &= riceParamMask; 4202 riceParamParts0 |= (zeroCountParts0 << riceParam); 4203 riceParamParts0 = (riceParamParts0 >> 1) ^ t[riceParamParts0 & 0x01]; 4204 4205 /* Sample reconstruction. */ 4206 pDecodedSamples[0] = riceParamParts0 + drflac__calculate_prediction_64(order, shift, coefficients, pDecodedSamples); 4207 4208 i += 1; 4209 pDecodedSamples += 1; 4210 } 4211 4212 return DRFLAC_TRUE; 4213 } 4214 4215 static drflac_bool32 drflac__decode_samples_with_residual__rice__sse41(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut) 4216 { 4217 DRFLAC_ASSERT(bs != NULL); 4218 DRFLAC_ASSERT(pSamplesOut != NULL); 4219 4220 /* In my testing the order is rarely > 12, so in this case I'm going to simplify the SSE implementation by only handling order <= 12. */ 4221 if (order > 0 && order <= 12) { 4222 if (bitsPerSample+shift > 32) { 4223 return drflac__decode_samples_with_residual__rice__sse41_64(bs, count, riceParam, order, shift, coefficients, pSamplesOut); 4224 } else { 4225 return drflac__decode_samples_with_residual__rice__sse41_32(bs, count, riceParam, order, shift, coefficients, pSamplesOut); 4226 } 4227 } else { 4228 return drflac__decode_samples_with_residual__rice__scalar(bs, bitsPerSample, count, riceParam, order, shift, coefficients, pSamplesOut); 4229 } 4230 } 4231 #endif 4232 4233 #if defined(DRFLAC_SUPPORT_NEON) 4234 static DRFLAC_INLINE void drflac__vst2q_s32(drflac_int32* p, int32x4x2_t x) 4235 { 4236 vst1q_s32(p+0, x.val[0]); 4237 vst1q_s32(p+4, x.val[1]); 4238 } 4239 4240 static DRFLAC_INLINE void drflac__vst2q_u32(drflac_uint32* p, uint32x4x2_t x) 4241 { 4242 vst1q_u32(p+0, x.val[0]); 4243 vst1q_u32(p+4, x.val[1]); 4244 } 4245 4246 static DRFLAC_INLINE void drflac__vst2q_f32(float* p, float32x4x2_t x) 4247 { 4248 vst1q_f32(p+0, x.val[0]); 4249 vst1q_f32(p+4, x.val[1]); 4250 } 4251 4252 static DRFLAC_INLINE void drflac__vst2q_s16(drflac_int16* p, int16x4x2_t x) 4253 { 4254 vst1q_s16(p, vcombine_s16(x.val[0], x.val[1])); 4255 } 4256 4257 static DRFLAC_INLINE void drflac__vst2q_u16(drflac_uint16* p, uint16x4x2_t x) 4258 { 4259 vst1q_u16(p, vcombine_u16(x.val[0], x.val[1])); 4260 } 4261 4262 static DRFLAC_INLINE int32x4_t drflac__vdupq_n_s32x4(drflac_int32 x3, drflac_int32 x2, drflac_int32 x1, drflac_int32 x0) 4263 { 4264 drflac_int32 x[4]; 4265 x[3] = x3; 4266 x[2] = x2; 4267 x[1] = x1; 4268 x[0] = x0; 4269 return vld1q_s32(x); 4270 } 4271 4272 static DRFLAC_INLINE int32x4_t drflac__valignrq_s32_1(int32x4_t a, int32x4_t b) 4273 { 4274 /* Equivalent to SSE's _mm_alignr_epi8(a, b, 4) */ 4275 4276 /* Reference */ 4277 /*return drflac__vdupq_n_s32x4( 4278 vgetq_lane_s32(a, 0), 4279 vgetq_lane_s32(b, 3), 4280 vgetq_lane_s32(b, 2), 4281 vgetq_lane_s32(b, 1) 4282 );*/ 4283 4284 return vextq_s32(b, a, 1); 4285 } 4286 4287 static DRFLAC_INLINE uint32x4_t drflac__valignrq_u32_1(uint32x4_t a, uint32x4_t b) 4288 { 4289 /* Equivalent to SSE's _mm_alignr_epi8(a, b, 4) */ 4290 4291 /* Reference */ 4292 /*return drflac__vdupq_n_s32x4( 4293 vgetq_lane_s32(a, 0), 4294 vgetq_lane_s32(b, 3), 4295 vgetq_lane_s32(b, 2), 4296 vgetq_lane_s32(b, 1) 4297 );*/ 4298 4299 return vextq_u32(b, a, 1); 4300 } 4301 4302 static DRFLAC_INLINE int32x2_t drflac__vhaddq_s32(int32x4_t x) 4303 { 4304 /* The sum must end up in position 0. */ 4305 4306 /* Reference */ 4307 /*return vdupq_n_s32( 4308 vgetq_lane_s32(x, 3) + 4309 vgetq_lane_s32(x, 2) + 4310 vgetq_lane_s32(x, 1) + 4311 vgetq_lane_s32(x, 0) 4312 );*/ 4313 4314 int32x2_t r = vadd_s32(vget_high_s32(x), vget_low_s32(x)); 4315 return vpadd_s32(r, r); 4316 } 4317 4318 static DRFLAC_INLINE int64x1_t drflac__vhaddq_s64(int64x2_t x) 4319 { 4320 return vadd_s64(vget_high_s64(x), vget_low_s64(x)); 4321 } 4322 4323 static DRFLAC_INLINE int32x4_t drflac__vrevq_s32(int32x4_t x) 4324 { 4325 /* Reference */ 4326 /*return drflac__vdupq_n_s32x4( 4327 vgetq_lane_s32(x, 0), 4328 vgetq_lane_s32(x, 1), 4329 vgetq_lane_s32(x, 2), 4330 vgetq_lane_s32(x, 3) 4331 );*/ 4332 4333 return vrev64q_s32(vcombine_s32(vget_high_s32(x), vget_low_s32(x))); 4334 } 4335 4336 static DRFLAC_INLINE int32x4_t drflac__vnotq_s32(int32x4_t x) 4337 { 4338 return veorq_s32(x, vdupq_n_s32(0xFFFFFFFF)); 4339 } 4340 4341 static DRFLAC_INLINE uint32x4_t drflac__vnotq_u32(uint32x4_t x) 4342 { 4343 return veorq_u32(x, vdupq_n_u32(0xFFFFFFFF)); 4344 } 4345 4346 static drflac_bool32 drflac__decode_samples_with_residual__rice__neon_32(drflac_bs* bs, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut) 4347 { 4348 int i; 4349 drflac_uint32 riceParamMask; 4350 drflac_int32* pDecodedSamples = pSamplesOut; 4351 drflac_int32* pDecodedSamplesEnd = pSamplesOut + (count & ~3); 4352 drflac_uint32 zeroCountParts[4]; 4353 drflac_uint32 riceParamParts[4]; 4354 int32x4_t coefficients128_0; 4355 int32x4_t coefficients128_4; 4356 int32x4_t coefficients128_8; 4357 int32x4_t samples128_0; 4358 int32x4_t samples128_4; 4359 int32x4_t samples128_8; 4360 uint32x4_t riceParamMask128; 4361 int32x4_t riceParam128; 4362 int32x2_t shift64; 4363 uint32x4_t one128; 4364 4365 const drflac_uint32 t[2] = {0x00000000, 0xFFFFFFFF}; 4366 4367 riceParamMask = ~((~0UL) << riceParam); 4368 riceParamMask128 = vdupq_n_u32(riceParamMask); 4369 4370 riceParam128 = vdupq_n_s32(riceParam); 4371 shift64 = vdup_n_s32(-shift); /* Negate the shift because we'll be doing a variable shift using vshlq_s32(). */ 4372 one128 = vdupq_n_u32(1); 4373 4374 /* 4375 Pre-loading the coefficients and prior samples is annoying because we need to ensure we don't try reading more than 4376 what's available in the input buffers. It would be conenient to use a fall-through switch to do this, but this results 4377 in strict aliasing warnings with GCC. To work around this I'm just doing something hacky. This feels a bit convoluted 4378 so I think there's opportunity for this to be simplified. 4379 */ 4380 { 4381 int runningOrder = order; 4382 drflac_int32 tempC[4] = {0, 0, 0, 0}; 4383 drflac_int32 tempS[4] = {0, 0, 0, 0}; 4384 4385 /* 0 - 3. */ 4386 if (runningOrder >= 4) { 4387 coefficients128_0 = vld1q_s32(coefficients + 0); 4388 samples128_0 = vld1q_s32(pSamplesOut - 4); 4389 runningOrder -= 4; 4390 } else { 4391 switch (runningOrder) { 4392 case 3: tempC[2] = coefficients[2]; tempS[1] = pSamplesOut[-3]; /* fallthrough */ 4393 case 2: tempC[1] = coefficients[1]; tempS[2] = pSamplesOut[-2]; /* fallthrough */ 4394 case 1: tempC[0] = coefficients[0]; tempS[3] = pSamplesOut[-1]; /* fallthrough */ 4395 } 4396 4397 coefficients128_0 = vld1q_s32(tempC); 4398 samples128_0 = vld1q_s32(tempS); 4399 runningOrder = 0; 4400 } 4401 4402 /* 4 - 7 */ 4403 if (runningOrder >= 4) { 4404 coefficients128_4 = vld1q_s32(coefficients + 4); 4405 samples128_4 = vld1q_s32(pSamplesOut - 8); 4406 runningOrder -= 4; 4407 } else { 4408 switch (runningOrder) { 4409 case 3: tempC[2] = coefficients[6]; tempS[1] = pSamplesOut[-7]; /* fallthrough */ 4410 case 2: tempC[1] = coefficients[5]; tempS[2] = pSamplesOut[-6]; /* fallthrough */ 4411 case 1: tempC[0] = coefficients[4]; tempS[3] = pSamplesOut[-5]; /* fallthrough */ 4412 } 4413 4414 coefficients128_4 = vld1q_s32(tempC); 4415 samples128_4 = vld1q_s32(tempS); 4416 runningOrder = 0; 4417 } 4418 4419 /* 8 - 11 */ 4420 if (runningOrder == 4) { 4421 coefficients128_8 = vld1q_s32(coefficients + 8); 4422 samples128_8 = vld1q_s32(pSamplesOut - 12); 4423 runningOrder -= 4; 4424 } else { 4425 switch (runningOrder) { 4426 case 3: tempC[2] = coefficients[10]; tempS[1] = pSamplesOut[-11]; /* fallthrough */ 4427 case 2: tempC[1] = coefficients[ 9]; tempS[2] = pSamplesOut[-10]; /* fallthrough */ 4428 case 1: tempC[0] = coefficients[ 8]; tempS[3] = pSamplesOut[- 9]; /* fallthrough */ 4429 } 4430 4431 coefficients128_8 = vld1q_s32(tempC); 4432 samples128_8 = vld1q_s32(tempS); 4433 runningOrder = 0; 4434 } 4435 4436 /* Coefficients need to be shuffled for our streaming algorithm below to work. Samples are already in the correct order from the loading routine above. */ 4437 coefficients128_0 = drflac__vrevq_s32(coefficients128_0); 4438 coefficients128_4 = drflac__vrevq_s32(coefficients128_4); 4439 coefficients128_8 = drflac__vrevq_s32(coefficients128_8); 4440 } 4441 4442 /* For this version we are doing one sample at a time. */ 4443 while (pDecodedSamples < pDecodedSamplesEnd) { 4444 int32x4_t prediction128; 4445 int32x2_t prediction64; 4446 uint32x4_t zeroCountPart128; 4447 uint32x4_t riceParamPart128; 4448 4449 if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[0], &riceParamParts[0]) || 4450 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[1], &riceParamParts[1]) || 4451 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[2], &riceParamParts[2]) || 4452 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[3], &riceParamParts[3])) { 4453 return DRFLAC_FALSE; 4454 } 4455 4456 zeroCountPart128 = vld1q_u32(zeroCountParts); 4457 riceParamPart128 = vld1q_u32(riceParamParts); 4458 4459 riceParamPart128 = vandq_u32(riceParamPart128, riceParamMask128); 4460 riceParamPart128 = vorrq_u32(riceParamPart128, vshlq_u32(zeroCountPart128, riceParam128)); 4461 riceParamPart128 = veorq_u32(vshrq_n_u32(riceParamPart128, 1), vaddq_u32(drflac__vnotq_u32(vandq_u32(riceParamPart128, one128)), one128)); 4462 4463 if (order <= 4) { 4464 for (i = 0; i < 4; i += 1) { 4465 prediction128 = vmulq_s32(coefficients128_0, samples128_0); 4466 4467 /* Horizontal add and shift. */ 4468 prediction64 = drflac__vhaddq_s32(prediction128); 4469 prediction64 = vshl_s32(prediction64, shift64); 4470 prediction64 = vadd_s32(prediction64, vget_low_s32(vreinterpretq_s32_u32(riceParamPart128))); 4471 4472 samples128_0 = drflac__valignrq_s32_1(vcombine_s32(prediction64, vdup_n_s32(0)), samples128_0); 4473 riceParamPart128 = drflac__valignrq_u32_1(vdupq_n_u32(0), riceParamPart128); 4474 } 4475 } else if (order <= 8) { 4476 for (i = 0; i < 4; i += 1) { 4477 prediction128 = vmulq_s32(coefficients128_4, samples128_4); 4478 prediction128 = vmlaq_s32(prediction128, coefficients128_0, samples128_0); 4479 4480 /* Horizontal add and shift. */ 4481 prediction64 = drflac__vhaddq_s32(prediction128); 4482 prediction64 = vshl_s32(prediction64, shift64); 4483 prediction64 = vadd_s32(prediction64, vget_low_s32(vreinterpretq_s32_u32(riceParamPart128))); 4484 4485 samples128_4 = drflac__valignrq_s32_1(samples128_0, samples128_4); 4486 samples128_0 = drflac__valignrq_s32_1(vcombine_s32(prediction64, vdup_n_s32(0)), samples128_0); 4487 riceParamPart128 = drflac__valignrq_u32_1(vdupq_n_u32(0), riceParamPart128); 4488 } 4489 } else { 4490 for (i = 0; i < 4; i += 1) { 4491 prediction128 = vmulq_s32(coefficients128_8, samples128_8); 4492 prediction128 = vmlaq_s32(prediction128, coefficients128_4, samples128_4); 4493 prediction128 = vmlaq_s32(prediction128, coefficients128_0, samples128_0); 4494 4495 /* Horizontal add and shift. */ 4496 prediction64 = drflac__vhaddq_s32(prediction128); 4497 prediction64 = vshl_s32(prediction64, shift64); 4498 prediction64 = vadd_s32(prediction64, vget_low_s32(vreinterpretq_s32_u32(riceParamPart128))); 4499 4500 samples128_8 = drflac__valignrq_s32_1(samples128_4, samples128_8); 4501 samples128_4 = drflac__valignrq_s32_1(samples128_0, samples128_4); 4502 samples128_0 = drflac__valignrq_s32_1(vcombine_s32(prediction64, vdup_n_s32(0)), samples128_0); 4503 riceParamPart128 = drflac__valignrq_u32_1(vdupq_n_u32(0), riceParamPart128); 4504 } 4505 } 4506 4507 /* We store samples in groups of 4. */ 4508 vst1q_s32(pDecodedSamples, samples128_0); 4509 pDecodedSamples += 4; 4510 } 4511 4512 /* Make sure we process the last few samples. */ 4513 i = (count & ~3); 4514 while (i < (int)count) { 4515 /* Rice extraction. */ 4516 if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[0], &riceParamParts[0])) { 4517 return DRFLAC_FALSE; 4518 } 4519 4520 /* Rice reconstruction. */ 4521 riceParamParts[0] &= riceParamMask; 4522 riceParamParts[0] |= (zeroCountParts[0] << riceParam); 4523 riceParamParts[0] = (riceParamParts[0] >> 1) ^ t[riceParamParts[0] & 0x01]; 4524 4525 /* Sample reconstruction. */ 4526 pDecodedSamples[0] = riceParamParts[0] + drflac__calculate_prediction_32(order, shift, coefficients, pDecodedSamples); 4527 4528 i += 1; 4529 pDecodedSamples += 1; 4530 } 4531 4532 return DRFLAC_TRUE; 4533 } 4534 4535 static drflac_bool32 drflac__decode_samples_with_residual__rice__neon_64(drflac_bs* bs, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut) 4536 { 4537 int i; 4538 drflac_uint32 riceParamMask; 4539 drflac_int32* pDecodedSamples = pSamplesOut; 4540 drflac_int32* pDecodedSamplesEnd = pSamplesOut + (count & ~3); 4541 drflac_uint32 zeroCountParts[4]; 4542 drflac_uint32 riceParamParts[4]; 4543 int32x4_t coefficients128_0; 4544 int32x4_t coefficients128_4; 4545 int32x4_t coefficients128_8; 4546 int32x4_t samples128_0; 4547 int32x4_t samples128_4; 4548 int32x4_t samples128_8; 4549 uint32x4_t riceParamMask128; 4550 int32x4_t riceParam128; 4551 int64x1_t shift64; 4552 uint32x4_t one128; 4553 4554 const drflac_uint32 t[2] = {0x00000000, 0xFFFFFFFF}; 4555 4556 riceParamMask = ~((~0UL) << riceParam); 4557 riceParamMask128 = vdupq_n_u32(riceParamMask); 4558 4559 riceParam128 = vdupq_n_s32(riceParam); 4560 shift64 = vdup_n_s64(-shift); /* Negate the shift because we'll be doing a variable shift using vshlq_s32(). */ 4561 one128 = vdupq_n_u32(1); 4562 4563 /* 4564 Pre-loading the coefficients and prior samples is annoying because we need to ensure we don't try reading more than 4565 what's available in the input buffers. It would be conenient to use a fall-through switch to do this, but this results 4566 in strict aliasing warnings with GCC. To work around this I'm just doing something hacky. This feels a bit convoluted 4567 so I think there's opportunity for this to be simplified. 4568 */ 4569 { 4570 int runningOrder = order; 4571 drflac_int32 tempC[4] = {0, 0, 0, 0}; 4572 drflac_int32 tempS[4] = {0, 0, 0, 0}; 4573 4574 /* 0 - 3. */ 4575 if (runningOrder >= 4) { 4576 coefficients128_0 = vld1q_s32(coefficients + 0); 4577 samples128_0 = vld1q_s32(pSamplesOut - 4); 4578 runningOrder -= 4; 4579 } else { 4580 switch (runningOrder) { 4581 case 3: tempC[2] = coefficients[2]; tempS[1] = pSamplesOut[-3]; /* fallthrough */ 4582 case 2: tempC[1] = coefficients[1]; tempS[2] = pSamplesOut[-2]; /* fallthrough */ 4583 case 1: tempC[0] = coefficients[0]; tempS[3] = pSamplesOut[-1]; /* fallthrough */ 4584 } 4585 4586 coefficients128_0 = vld1q_s32(tempC); 4587 samples128_0 = vld1q_s32(tempS); 4588 runningOrder = 0; 4589 } 4590 4591 /* 4 - 7 */ 4592 if (runningOrder >= 4) { 4593 coefficients128_4 = vld1q_s32(coefficients + 4); 4594 samples128_4 = vld1q_s32(pSamplesOut - 8); 4595 runningOrder -= 4; 4596 } else { 4597 switch (runningOrder) { 4598 case 3: tempC[2] = coefficients[6]; tempS[1] = pSamplesOut[-7]; /* fallthrough */ 4599 case 2: tempC[1] = coefficients[5]; tempS[2] = pSamplesOut[-6]; /* fallthrough */ 4600 case 1: tempC[0] = coefficients[4]; tempS[3] = pSamplesOut[-5]; /* fallthrough */ 4601 } 4602 4603 coefficients128_4 = vld1q_s32(tempC); 4604 samples128_4 = vld1q_s32(tempS); 4605 runningOrder = 0; 4606 } 4607 4608 /* 8 - 11 */ 4609 if (runningOrder == 4) { 4610 coefficients128_8 = vld1q_s32(coefficients + 8); 4611 samples128_8 = vld1q_s32(pSamplesOut - 12); 4612 runningOrder -= 4; 4613 } else { 4614 switch (runningOrder) { 4615 case 3: tempC[2] = coefficients[10]; tempS[1] = pSamplesOut[-11]; /* fallthrough */ 4616 case 2: tempC[1] = coefficients[ 9]; tempS[2] = pSamplesOut[-10]; /* fallthrough */ 4617 case 1: tempC[0] = coefficients[ 8]; tempS[3] = pSamplesOut[- 9]; /* fallthrough */ 4618 } 4619 4620 coefficients128_8 = vld1q_s32(tempC); 4621 samples128_8 = vld1q_s32(tempS); 4622 runningOrder = 0; 4623 } 4624 4625 /* Coefficients need to be shuffled for our streaming algorithm below to work. Samples are already in the correct order from the loading routine above. */ 4626 coefficients128_0 = drflac__vrevq_s32(coefficients128_0); 4627 coefficients128_4 = drflac__vrevq_s32(coefficients128_4); 4628 coefficients128_8 = drflac__vrevq_s32(coefficients128_8); 4629 } 4630 4631 /* For this version we are doing one sample at a time. */ 4632 while (pDecodedSamples < pDecodedSamplesEnd) { 4633 int64x2_t prediction128; 4634 uint32x4_t zeroCountPart128; 4635 uint32x4_t riceParamPart128; 4636 4637 if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[0], &riceParamParts[0]) || 4638 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[1], &riceParamParts[1]) || 4639 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[2], &riceParamParts[2]) || 4640 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[3], &riceParamParts[3])) { 4641 return DRFLAC_FALSE; 4642 } 4643 4644 zeroCountPart128 = vld1q_u32(zeroCountParts); 4645 riceParamPart128 = vld1q_u32(riceParamParts); 4646 4647 riceParamPart128 = vandq_u32(riceParamPart128, riceParamMask128); 4648 riceParamPart128 = vorrq_u32(riceParamPart128, vshlq_u32(zeroCountPart128, riceParam128)); 4649 riceParamPart128 = veorq_u32(vshrq_n_u32(riceParamPart128, 1), vaddq_u32(drflac__vnotq_u32(vandq_u32(riceParamPart128, one128)), one128)); 4650 4651 for (i = 0; i < 4; i += 1) { 4652 int64x1_t prediction64; 4653 4654 prediction128 = veorq_s64(prediction128, prediction128); /* Reset to 0. */ 4655 switch (order) 4656 { 4657 case 12: 4658 case 11: prediction128 = vaddq_s64(prediction128, vmull_s32(vget_low_s32(coefficients128_8), vget_low_s32(samples128_8))); 4659 case 10: 4660 case 9: prediction128 = vaddq_s64(prediction128, vmull_s32(vget_high_s32(coefficients128_8), vget_high_s32(samples128_8))); 4661 case 8: 4662 case 7: prediction128 = vaddq_s64(prediction128, vmull_s32(vget_low_s32(coefficients128_4), vget_low_s32(samples128_4))); 4663 case 6: 4664 case 5: prediction128 = vaddq_s64(prediction128, vmull_s32(vget_high_s32(coefficients128_4), vget_high_s32(samples128_4))); 4665 case 4: 4666 case 3: prediction128 = vaddq_s64(prediction128, vmull_s32(vget_low_s32(coefficients128_0), vget_low_s32(samples128_0))); 4667 case 2: 4668 case 1: prediction128 = vaddq_s64(prediction128, vmull_s32(vget_high_s32(coefficients128_0), vget_high_s32(samples128_0))); 4669 } 4670 4671 /* Horizontal add and shift. */ 4672 prediction64 = drflac__vhaddq_s64(prediction128); 4673 prediction64 = vshl_s64(prediction64, shift64); 4674 prediction64 = vadd_s64(prediction64, vdup_n_s64(vgetq_lane_u32(riceParamPart128, 0))); 4675 4676 /* Our value should be sitting in prediction64[0]. We need to combine this with our SSE samples. */ 4677 samples128_8 = drflac__valignrq_s32_1(samples128_4, samples128_8); 4678 samples128_4 = drflac__valignrq_s32_1(samples128_0, samples128_4); 4679 samples128_0 = drflac__valignrq_s32_1(vcombine_s32(vreinterpret_s32_s64(prediction64), vdup_n_s32(0)), samples128_0); 4680 4681 /* Slide our rice parameter down so that the value in position 0 contains the next one to process. */ 4682 riceParamPart128 = drflac__valignrq_u32_1(vdupq_n_u32(0), riceParamPart128); 4683 } 4684 4685 /* We store samples in groups of 4. */ 4686 vst1q_s32(pDecodedSamples, samples128_0); 4687 pDecodedSamples += 4; 4688 } 4689 4690 /* Make sure we process the last few samples. */ 4691 i = (count & ~3); 4692 while (i < (int)count) { 4693 /* Rice extraction. */ 4694 if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[0], &riceParamParts[0])) { 4695 return DRFLAC_FALSE; 4696 } 4697 4698 /* Rice reconstruction. */ 4699 riceParamParts[0] &= riceParamMask; 4700 riceParamParts[0] |= (zeroCountParts[0] << riceParam); 4701 riceParamParts[0] = (riceParamParts[0] >> 1) ^ t[riceParamParts[0] & 0x01]; 4702 4703 /* Sample reconstruction. */ 4704 pDecodedSamples[0] = riceParamParts[0] + drflac__calculate_prediction_64(order, shift, coefficients, pDecodedSamples); 4705 4706 i += 1; 4707 pDecodedSamples += 1; 4708 } 4709 4710 return DRFLAC_TRUE; 4711 } 4712 4713 static drflac_bool32 drflac__decode_samples_with_residual__rice__neon(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut) 4714 { 4715 DRFLAC_ASSERT(bs != NULL); 4716 DRFLAC_ASSERT(pSamplesOut != NULL); 4717 4718 /* In my testing the order is rarely > 12, so in this case I'm going to simplify the NEON implementation by only handling order <= 12. */ 4719 if (order > 0 && order <= 12) { 4720 if (bitsPerSample+shift > 32) { 4721 return drflac__decode_samples_with_residual__rice__neon_64(bs, count, riceParam, order, shift, coefficients, pSamplesOut); 4722 } else { 4723 return drflac__decode_samples_with_residual__rice__neon_32(bs, count, riceParam, order, shift, coefficients, pSamplesOut); 4724 } 4725 } else { 4726 return drflac__decode_samples_with_residual__rice__scalar(bs, bitsPerSample, count, riceParam, order, shift, coefficients, pSamplesOut); 4727 } 4728 } 4729 #endif 4730 4731 static drflac_bool32 drflac__decode_samples_with_residual__rice(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut) 4732 { 4733 #if defined(DRFLAC_SUPPORT_SSE41) 4734 if (drflac__gIsSSE41Supported) { 4735 return drflac__decode_samples_with_residual__rice__sse41(bs, bitsPerSample, count, riceParam, order, shift, coefficients, pSamplesOut); 4736 } else 4737 #elif defined(DRFLAC_SUPPORT_NEON) 4738 if (drflac__gIsNEONSupported) { 4739 return drflac__decode_samples_with_residual__rice__neon(bs, bitsPerSample, count, riceParam, order, shift, coefficients, pSamplesOut); 4740 } else 4741 #endif 4742 { 4743 /* Scalar fallback. */ 4744 #if 0 4745 return drflac__decode_samples_with_residual__rice__reference(bs, bitsPerSample, count, riceParam, order, shift, coefficients, pSamplesOut); 4746 #else 4747 return drflac__decode_samples_with_residual__rice__scalar(bs, bitsPerSample, count, riceParam, order, shift, coefficients, pSamplesOut); 4748 #endif 4749 } 4750 } 4751 4752 /* Reads and seeks past a string of residual values as Rice codes. The decoder should be sitting on the first bit of the Rice codes. */ 4753 static drflac_bool32 drflac__read_and_seek_residual__rice(drflac_bs* bs, drflac_uint32 count, drflac_uint8 riceParam) 4754 { 4755 drflac_uint32 i; 4756 4757 DRFLAC_ASSERT(bs != NULL); 4758 4759 for (i = 0; i < count; ++i) { 4760 if (!drflac__seek_rice_parts(bs, riceParam)) { 4761 return DRFLAC_FALSE; 4762 } 4763 } 4764 4765 return DRFLAC_TRUE; 4766 } 4767 4768 static drflac_bool32 drflac__decode_samples_with_residual__unencoded(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 unencodedBitsPerSample, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut) 4769 { 4770 drflac_uint32 i; 4771 4772 DRFLAC_ASSERT(bs != NULL); 4773 DRFLAC_ASSERT(unencodedBitsPerSample <= 31); /* <-- unencodedBitsPerSample is a 5 bit number, so cannot exceed 31. */ 4774 DRFLAC_ASSERT(pSamplesOut != NULL); 4775 4776 for (i = 0; i < count; ++i) { 4777 if (unencodedBitsPerSample > 0) { 4778 if (!drflac__read_int32(bs, unencodedBitsPerSample, pSamplesOut + i)) { 4779 return DRFLAC_FALSE; 4780 } 4781 } else { 4782 pSamplesOut[i] = 0; 4783 } 4784 4785 if (bitsPerSample >= 24) { 4786 pSamplesOut[i] += drflac__calculate_prediction_64(order, shift, coefficients, pSamplesOut + i); 4787 } else { 4788 pSamplesOut[i] += drflac__calculate_prediction_32(order, shift, coefficients, pSamplesOut + i); 4789 } 4790 } 4791 4792 return DRFLAC_TRUE; 4793 } 4794 4795 4796 /* 4797 Reads and decodes the residual for the sub-frame the decoder is currently sitting on. This function should be called 4798 when the decoder is sitting at the very start of the RESIDUAL block. The first <order> residuals will be ignored. The 4799 <blockSize> and <order> parameters are used to determine how many residual values need to be decoded. 4800 */ 4801 static drflac_bool32 drflac__decode_samples_with_residual(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 blockSize, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pDecodedSamples) 4802 { 4803 drflac_uint8 residualMethod; 4804 drflac_uint8 partitionOrder; 4805 drflac_uint32 samplesInPartition; 4806 drflac_uint32 partitionsRemaining; 4807 4808 DRFLAC_ASSERT(bs != NULL); 4809 DRFLAC_ASSERT(blockSize != 0); 4810 DRFLAC_ASSERT(pDecodedSamples != NULL); /* <-- Should we allow NULL, in which case we just seek past the residual rather than do a full decode? */ 4811 4812 if (!drflac__read_uint8(bs, 2, &residualMethod)) { 4813 return DRFLAC_FALSE; 4814 } 4815 4816 if (residualMethod != DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE && residualMethod != DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE2) { 4817 return DRFLAC_FALSE; /* Unknown or unsupported residual coding method. */ 4818 } 4819 4820 /* Ignore the first <order> values. */ 4821 pDecodedSamples += order; 4822 4823 if (!drflac__read_uint8(bs, 4, &partitionOrder)) { 4824 return DRFLAC_FALSE; 4825 } 4826 4827 /* 4828 From the FLAC spec: 4829 The Rice partition order in a Rice-coded residual section must be less than or equal to 8. 4830 */ 4831 if (partitionOrder > 8) { 4832 return DRFLAC_FALSE; 4833 } 4834 4835 /* Validation check. */ 4836 if ((blockSize / (1 << partitionOrder)) < order) { 4837 return DRFLAC_FALSE; 4838 } 4839 4840 samplesInPartition = (blockSize / (1 << partitionOrder)) - order; 4841 partitionsRemaining = (1 << partitionOrder); 4842 for (;;) { 4843 drflac_uint8 riceParam = 0; 4844 if (residualMethod == DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE) { 4845 if (!drflac__read_uint8(bs, 4, &riceParam)) { 4846 return DRFLAC_FALSE; 4847 } 4848 if (riceParam == 15) { 4849 riceParam = 0xFF; 4850 } 4851 } else if (residualMethod == DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE2) { 4852 if (!drflac__read_uint8(bs, 5, &riceParam)) { 4853 return DRFLAC_FALSE; 4854 } 4855 if (riceParam == 31) { 4856 riceParam = 0xFF; 4857 } 4858 } 4859 4860 if (riceParam != 0xFF) { 4861 if (!drflac__decode_samples_with_residual__rice(bs, bitsPerSample, samplesInPartition, riceParam, order, shift, coefficients, pDecodedSamples)) { 4862 return DRFLAC_FALSE; 4863 } 4864 } else { 4865 drflac_uint8 unencodedBitsPerSample = 0; 4866 if (!drflac__read_uint8(bs, 5, &unencodedBitsPerSample)) { 4867 return DRFLAC_FALSE; 4868 } 4869 4870 if (!drflac__decode_samples_with_residual__unencoded(bs, bitsPerSample, samplesInPartition, unencodedBitsPerSample, order, shift, coefficients, pDecodedSamples)) { 4871 return DRFLAC_FALSE; 4872 } 4873 } 4874 4875 pDecodedSamples += samplesInPartition; 4876 4877 if (partitionsRemaining == 1) { 4878 break; 4879 } 4880 4881 partitionsRemaining -= 1; 4882 4883 if (partitionOrder != 0) { 4884 samplesInPartition = blockSize / (1 << partitionOrder); 4885 } 4886 } 4887 4888 return DRFLAC_TRUE; 4889 } 4890 4891 /* 4892 Reads and seeks past the residual for the sub-frame the decoder is currently sitting on. This function should be called 4893 when the decoder is sitting at the very start of the RESIDUAL block. The first <order> residuals will be set to 0. The 4894 <blockSize> and <order> parameters are used to determine how many residual values need to be decoded. 4895 */ 4896 static drflac_bool32 drflac__read_and_seek_residual(drflac_bs* bs, drflac_uint32 blockSize, drflac_uint32 order) 4897 { 4898 drflac_uint8 residualMethod; 4899 drflac_uint8 partitionOrder; 4900 drflac_uint32 samplesInPartition; 4901 drflac_uint32 partitionsRemaining; 4902 4903 DRFLAC_ASSERT(bs != NULL); 4904 DRFLAC_ASSERT(blockSize != 0); 4905 4906 if (!drflac__read_uint8(bs, 2, &residualMethod)) { 4907 return DRFLAC_FALSE; 4908 } 4909 4910 if (residualMethod != DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE && residualMethod != DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE2) { 4911 return DRFLAC_FALSE; /* Unknown or unsupported residual coding method. */ 4912 } 4913 4914 if (!drflac__read_uint8(bs, 4, &partitionOrder)) { 4915 return DRFLAC_FALSE; 4916 } 4917 4918 /* 4919 From the FLAC spec: 4920 The Rice partition order in a Rice-coded residual section must be less than or equal to 8. 4921 */ 4922 if (partitionOrder > 8) { 4923 return DRFLAC_FALSE; 4924 } 4925 4926 /* Validation check. */ 4927 if ((blockSize / (1 << partitionOrder)) <= order) { 4928 return DRFLAC_FALSE; 4929 } 4930 4931 samplesInPartition = (blockSize / (1 << partitionOrder)) - order; 4932 partitionsRemaining = (1 << partitionOrder); 4933 for (;;) 4934 { 4935 drflac_uint8 riceParam = 0; 4936 if (residualMethod == DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE) { 4937 if (!drflac__read_uint8(bs, 4, &riceParam)) { 4938 return DRFLAC_FALSE; 4939 } 4940 if (riceParam == 15) { 4941 riceParam = 0xFF; 4942 } 4943 } else if (residualMethod == DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE2) { 4944 if (!drflac__read_uint8(bs, 5, &riceParam)) { 4945 return DRFLAC_FALSE; 4946 } 4947 if (riceParam == 31) { 4948 riceParam = 0xFF; 4949 } 4950 } 4951 4952 if (riceParam != 0xFF) { 4953 if (!drflac__read_and_seek_residual__rice(bs, samplesInPartition, riceParam)) { 4954 return DRFLAC_FALSE; 4955 } 4956 } else { 4957 drflac_uint8 unencodedBitsPerSample = 0; 4958 if (!drflac__read_uint8(bs, 5, &unencodedBitsPerSample)) { 4959 return DRFLAC_FALSE; 4960 } 4961 4962 if (!drflac__seek_bits(bs, unencodedBitsPerSample * samplesInPartition)) { 4963 return DRFLAC_FALSE; 4964 } 4965 } 4966 4967 4968 if (partitionsRemaining == 1) { 4969 break; 4970 } 4971 4972 partitionsRemaining -= 1; 4973 samplesInPartition = blockSize / (1 << partitionOrder); 4974 } 4975 4976 return DRFLAC_TRUE; 4977 } 4978 4979 4980 static drflac_bool32 drflac__decode_samples__constant(drflac_bs* bs, drflac_uint32 blockSize, drflac_uint32 subframeBitsPerSample, drflac_int32* pDecodedSamples) 4981 { 4982 drflac_uint32 i; 4983 4984 /* Only a single sample needs to be decoded here. */ 4985 drflac_int32 sample; 4986 if (!drflac__read_int32(bs, subframeBitsPerSample, &sample)) { 4987 return DRFLAC_FALSE; 4988 } 4989 4990 /* 4991 We don't really need to expand this, but it does simplify the process of reading samples. If this becomes a performance issue (unlikely) 4992 we'll want to look at a more efficient way. 4993 */ 4994 for (i = 0; i < blockSize; ++i) { 4995 pDecodedSamples[i] = sample; 4996 } 4997 4998 return DRFLAC_TRUE; 4999 } 5000 5001 static drflac_bool32 drflac__decode_samples__verbatim(drflac_bs* bs, drflac_uint32 blockSize, drflac_uint32 subframeBitsPerSample, drflac_int32* pDecodedSamples) 5002 { 5003 drflac_uint32 i; 5004 5005 for (i = 0; i < blockSize; ++i) { 5006 drflac_int32 sample; 5007 if (!drflac__read_int32(bs, subframeBitsPerSample, &sample)) { 5008 return DRFLAC_FALSE; 5009 } 5010 5011 pDecodedSamples[i] = sample; 5012 } 5013 5014 return DRFLAC_TRUE; 5015 } 5016 5017 static drflac_bool32 drflac__decode_samples__fixed(drflac_bs* bs, drflac_uint32 blockSize, drflac_uint32 subframeBitsPerSample, drflac_uint8 lpcOrder, drflac_int32* pDecodedSamples) 5018 { 5019 drflac_uint32 i; 5020 5021 static drflac_int32 lpcCoefficientsTable[5][4] = { 5022 {0, 0, 0, 0}, 5023 {1, 0, 0, 0}, 5024 {2, -1, 0, 0}, 5025 {3, -3, 1, 0}, 5026 {4, -6, 4, -1} 5027 }; 5028 5029 /* Warm up samples and coefficients. */ 5030 for (i = 0; i < lpcOrder; ++i) { 5031 drflac_int32 sample; 5032 if (!drflac__read_int32(bs, subframeBitsPerSample, &sample)) { 5033 return DRFLAC_FALSE; 5034 } 5035 5036 pDecodedSamples[i] = sample; 5037 } 5038 5039 if (!drflac__decode_samples_with_residual(bs, subframeBitsPerSample, blockSize, lpcOrder, 0, lpcCoefficientsTable[lpcOrder], pDecodedSamples)) { 5040 return DRFLAC_FALSE; 5041 } 5042 5043 return DRFLAC_TRUE; 5044 } 5045 5046 static drflac_bool32 drflac__decode_samples__lpc(drflac_bs* bs, drflac_uint32 blockSize, drflac_uint32 bitsPerSample, drflac_uint8 lpcOrder, drflac_int32* pDecodedSamples) 5047 { 5048 drflac_uint8 i; 5049 drflac_uint8 lpcPrecision; 5050 drflac_int8 lpcShift; 5051 drflac_int32 coefficients[32]; 5052 5053 /* Warm up samples. */ 5054 for (i = 0; i < lpcOrder; ++i) { 5055 drflac_int32 sample; 5056 if (!drflac__read_int32(bs, bitsPerSample, &sample)) { 5057 return DRFLAC_FALSE; 5058 } 5059 5060 pDecodedSamples[i] = sample; 5061 } 5062 5063 if (!drflac__read_uint8(bs, 4, &lpcPrecision)) { 5064 return DRFLAC_FALSE; 5065 } 5066 if (lpcPrecision == 15) { 5067 return DRFLAC_FALSE; /* Invalid. */ 5068 } 5069 lpcPrecision += 1; 5070 5071 if (!drflac__read_int8(bs, 5, &lpcShift)) { 5072 return DRFLAC_FALSE; 5073 } 5074 5075 /* 5076 From the FLAC specification: 5077 5078 Quantized linear predictor coefficient shift needed in bits (NOTE: this number is signed two's-complement) 5079 5080 Emphasis on the "signed two's-complement". In practice there does not seem to be any encoders nor decoders supporting negative shifts. For now dr_flac is 5081 not going to support negative shifts as I don't have any reference files. However, when a reference file comes through I will consider adding support. 5082 */ 5083 if (lpcShift < 0) { 5084 return DRFLAC_FALSE; 5085 } 5086 5087 DRFLAC_ZERO_MEMORY(coefficients, sizeof(coefficients)); 5088 for (i = 0; i < lpcOrder; ++i) { 5089 if (!drflac__read_int32(bs, lpcPrecision, coefficients + i)) { 5090 return DRFLAC_FALSE; 5091 } 5092 } 5093 5094 if (!drflac__decode_samples_with_residual(bs, bitsPerSample, blockSize, lpcOrder, lpcShift, coefficients, pDecodedSamples)) { 5095 return DRFLAC_FALSE; 5096 } 5097 5098 return DRFLAC_TRUE; 5099 } 5100 5101 5102 static drflac_bool32 drflac__read_next_flac_frame_header(drflac_bs* bs, drflac_uint8 streaminfoBitsPerSample, drflac_frame_header* header) 5103 { 5104 const drflac_uint32 sampleRateTable[12] = {0, 88200, 176400, 192000, 8000, 16000, 22050, 24000, 32000, 44100, 48000, 96000}; 5105 const drflac_uint8 bitsPerSampleTable[8] = {0, 8, 12, (drflac_uint8)-1, 16, 20, 24, (drflac_uint8)-1}; /* -1 = reserved. */ 5106 5107 DRFLAC_ASSERT(bs != NULL); 5108 DRFLAC_ASSERT(header != NULL); 5109 5110 /* Keep looping until we find a valid sync code. */ 5111 for (;;) { 5112 drflac_uint8 crc8 = 0xCE; /* 0xCE = drflac_crc8(0, 0x3FFE, 14); */ 5113 drflac_uint8 reserved = 0; 5114 drflac_uint8 blockingStrategy = 0; 5115 drflac_uint8 blockSize = 0; 5116 drflac_uint8 sampleRate = 0; 5117 drflac_uint8 channelAssignment = 0; 5118 drflac_uint8 bitsPerSample = 0; 5119 drflac_bool32 isVariableBlockSize; 5120 5121 if (!drflac__find_and_seek_to_next_sync_code(bs)) { 5122 return DRFLAC_FALSE; 5123 } 5124 5125 if (!drflac__read_uint8(bs, 1, &reserved)) { 5126 return DRFLAC_FALSE; 5127 } 5128 if (reserved == 1) { 5129 continue; 5130 } 5131 crc8 = drflac_crc8(crc8, reserved, 1); 5132 5133 if (!drflac__read_uint8(bs, 1, &blockingStrategy)) { 5134 return DRFLAC_FALSE; 5135 } 5136 crc8 = drflac_crc8(crc8, blockingStrategy, 1); 5137 5138 if (!drflac__read_uint8(bs, 4, &blockSize)) { 5139 return DRFLAC_FALSE; 5140 } 5141 if (blockSize == 0) { 5142 continue; 5143 } 5144 crc8 = drflac_crc8(crc8, blockSize, 4); 5145 5146 if (!drflac__read_uint8(bs, 4, &sampleRate)) { 5147 return DRFLAC_FALSE; 5148 } 5149 crc8 = drflac_crc8(crc8, sampleRate, 4); 5150 5151 if (!drflac__read_uint8(bs, 4, &channelAssignment)) { 5152 return DRFLAC_FALSE; 5153 } 5154 if (channelAssignment > 10) { 5155 continue; 5156 } 5157 crc8 = drflac_crc8(crc8, channelAssignment, 4); 5158 5159 if (!drflac__read_uint8(bs, 3, &bitsPerSample)) { 5160 return DRFLAC_FALSE; 5161 } 5162 if (bitsPerSample == 3 || bitsPerSample == 7) { 5163 continue; 5164 } 5165 crc8 = drflac_crc8(crc8, bitsPerSample, 3); 5166 5167 5168 if (!drflac__read_uint8(bs, 1, &reserved)) { 5169 return DRFLAC_FALSE; 5170 } 5171 if (reserved == 1) { 5172 continue; 5173 } 5174 crc8 = drflac_crc8(crc8, reserved, 1); 5175 5176 5177 isVariableBlockSize = blockingStrategy == 1; 5178 if (isVariableBlockSize) { 5179 drflac_uint64 pcmFrameNumber; 5180 drflac_result result = drflac__read_utf8_coded_number(bs, &pcmFrameNumber, &crc8); 5181 if (result != DRFLAC_SUCCESS) { 5182 if (result == DRFLAC_AT_END) { 5183 return DRFLAC_FALSE; 5184 } else { 5185 continue; 5186 } 5187 } 5188 header->flacFrameNumber = 0; 5189 header->pcmFrameNumber = pcmFrameNumber; 5190 } else { 5191 drflac_uint64 flacFrameNumber = 0; 5192 drflac_result result = drflac__read_utf8_coded_number(bs, &flacFrameNumber, &crc8); 5193 if (result != DRFLAC_SUCCESS) { 5194 if (result == DRFLAC_AT_END) { 5195 return DRFLAC_FALSE; 5196 } else { 5197 continue; 5198 } 5199 } 5200 header->flacFrameNumber = (drflac_uint32)flacFrameNumber; /* <-- Safe cast. */ 5201 header->pcmFrameNumber = 0; 5202 } 5203 5204 5205 DRFLAC_ASSERT(blockSize > 0); 5206 if (blockSize == 1) { 5207 header->blockSizeInPCMFrames = 192; 5208 } else if (blockSize <= 5) { 5209 DRFLAC_ASSERT(blockSize >= 2); 5210 header->blockSizeInPCMFrames = 576 * (1 << (blockSize - 2)); 5211 } else if (blockSize == 6) { 5212 if (!drflac__read_uint16(bs, 8, &header->blockSizeInPCMFrames)) { 5213 return DRFLAC_FALSE; 5214 } 5215 crc8 = drflac_crc8(crc8, header->blockSizeInPCMFrames, 8); 5216 header->blockSizeInPCMFrames += 1; 5217 } else if (blockSize == 7) { 5218 if (!drflac__read_uint16(bs, 16, &header->blockSizeInPCMFrames)) { 5219 return DRFLAC_FALSE; 5220 } 5221 crc8 = drflac_crc8(crc8, header->blockSizeInPCMFrames, 16); 5222 header->blockSizeInPCMFrames += 1; 5223 } else { 5224 DRFLAC_ASSERT(blockSize >= 8); 5225 header->blockSizeInPCMFrames = 256 * (1 << (blockSize - 8)); 5226 } 5227 5228 5229 if (sampleRate <= 11) { 5230 header->sampleRate = sampleRateTable[sampleRate]; 5231 } else if (sampleRate == 12) { 5232 if (!drflac__read_uint32(bs, 8, &header->sampleRate)) { 5233 return DRFLAC_FALSE; 5234 } 5235 crc8 = drflac_crc8(crc8, header->sampleRate, 8); 5236 header->sampleRate *= 1000; 5237 } else if (sampleRate == 13) { 5238 if (!drflac__read_uint32(bs, 16, &header->sampleRate)) { 5239 return DRFLAC_FALSE; 5240 } 5241 crc8 = drflac_crc8(crc8, header->sampleRate, 16); 5242 } else if (sampleRate == 14) { 5243 if (!drflac__read_uint32(bs, 16, &header->sampleRate)) { 5244 return DRFLAC_FALSE; 5245 } 5246 crc8 = drflac_crc8(crc8, header->sampleRate, 16); 5247 header->sampleRate *= 10; 5248 } else { 5249 continue; /* Invalid. Assume an invalid block. */ 5250 } 5251 5252 5253 header->channelAssignment = channelAssignment; 5254 5255 header->bitsPerSample = bitsPerSampleTable[bitsPerSample]; 5256 if (header->bitsPerSample == 0) { 5257 header->bitsPerSample = streaminfoBitsPerSample; 5258 } 5259 5260 if (!drflac__read_uint8(bs, 8, &header->crc8)) { 5261 return DRFLAC_FALSE; 5262 } 5263 5264 #ifndef DR_FLAC_NO_CRC 5265 if (header->crc8 != crc8) { 5266 continue; /* CRC mismatch. Loop back to the top and find the next sync code. */ 5267 } 5268 #endif 5269 return DRFLAC_TRUE; 5270 } 5271 } 5272 5273 static drflac_bool32 drflac__read_subframe_header(drflac_bs* bs, drflac_subframe* pSubframe) 5274 { 5275 drflac_uint8 header; 5276 int type; 5277 5278 if (!drflac__read_uint8(bs, 8, &header)) { 5279 return DRFLAC_FALSE; 5280 } 5281 5282 /* First bit should always be 0. */ 5283 if ((header & 0x80) != 0) { 5284 return DRFLAC_FALSE; 5285 } 5286 5287 type = (header & 0x7E) >> 1; 5288 if (type == 0) { 5289 pSubframe->subframeType = DRFLAC_SUBFRAME_CONSTANT; 5290 } else if (type == 1) { 5291 pSubframe->subframeType = DRFLAC_SUBFRAME_VERBATIM; 5292 } else { 5293 if ((type & 0x20) != 0) { 5294 pSubframe->subframeType = DRFLAC_SUBFRAME_LPC; 5295 pSubframe->lpcOrder = (drflac_uint8)(type & 0x1F) + 1; 5296 } else if ((type & 0x08) != 0) { 5297 pSubframe->subframeType = DRFLAC_SUBFRAME_FIXED; 5298 pSubframe->lpcOrder = (drflac_uint8)(type & 0x07); 5299 if (pSubframe->lpcOrder > 4) { 5300 pSubframe->subframeType = DRFLAC_SUBFRAME_RESERVED; 5301 pSubframe->lpcOrder = 0; 5302 } 5303 } else { 5304 pSubframe->subframeType = DRFLAC_SUBFRAME_RESERVED; 5305 } 5306 } 5307 5308 if (pSubframe->subframeType == DRFLAC_SUBFRAME_RESERVED) { 5309 return DRFLAC_FALSE; 5310 } 5311 5312 /* Wasted bits per sample. */ 5313 pSubframe->wastedBitsPerSample = 0; 5314 if ((header & 0x01) == 1) { 5315 unsigned int wastedBitsPerSample; 5316 if (!drflac__seek_past_next_set_bit(bs, &wastedBitsPerSample)) { 5317 return DRFLAC_FALSE; 5318 } 5319 pSubframe->wastedBitsPerSample = (drflac_uint8)wastedBitsPerSample + 1; 5320 } 5321 5322 return DRFLAC_TRUE; 5323 } 5324 5325 static drflac_bool32 drflac__decode_subframe(drflac_bs* bs, drflac_frame* frame, int subframeIndex, drflac_int32* pDecodedSamplesOut) 5326 { 5327 drflac_subframe* pSubframe; 5328 drflac_uint32 subframeBitsPerSample; 5329 5330 DRFLAC_ASSERT(bs != NULL); 5331 DRFLAC_ASSERT(frame != NULL); 5332 5333 pSubframe = frame->subframes + subframeIndex; 5334 if (!drflac__read_subframe_header(bs, pSubframe)) { 5335 return DRFLAC_FALSE; 5336 } 5337 5338 /* Side channels require an extra bit per sample. Took a while to figure that one out... */ 5339 subframeBitsPerSample = frame->header.bitsPerSample; 5340 if ((frame->header.channelAssignment == DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE || frame->header.channelAssignment == DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE) && subframeIndex == 1) { 5341 subframeBitsPerSample += 1; 5342 } else if (frame->header.channelAssignment == DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE && subframeIndex == 0) { 5343 subframeBitsPerSample += 1; 5344 } 5345 5346 /* Need to handle wasted bits per sample. */ 5347 if (pSubframe->wastedBitsPerSample >= subframeBitsPerSample) { 5348 return DRFLAC_FALSE; 5349 } 5350 subframeBitsPerSample -= pSubframe->wastedBitsPerSample; 5351 5352 pSubframe->pSamplesS32 = pDecodedSamplesOut; 5353 5354 switch (pSubframe->subframeType) 5355 { 5356 case DRFLAC_SUBFRAME_CONSTANT: 5357 { 5358 drflac__decode_samples__constant(bs, frame->header.blockSizeInPCMFrames, subframeBitsPerSample, pSubframe->pSamplesS32); 5359 } break; 5360 5361 case DRFLAC_SUBFRAME_VERBATIM: 5362 { 5363 drflac__decode_samples__verbatim(bs, frame->header.blockSizeInPCMFrames, subframeBitsPerSample, pSubframe->pSamplesS32); 5364 } break; 5365 5366 case DRFLAC_SUBFRAME_FIXED: 5367 { 5368 drflac__decode_samples__fixed(bs, frame->header.blockSizeInPCMFrames, subframeBitsPerSample, pSubframe->lpcOrder, pSubframe->pSamplesS32); 5369 } break; 5370 5371 case DRFLAC_SUBFRAME_LPC: 5372 { 5373 drflac__decode_samples__lpc(bs, frame->header.blockSizeInPCMFrames, subframeBitsPerSample, pSubframe->lpcOrder, pSubframe->pSamplesS32); 5374 } break; 5375 5376 default: return DRFLAC_FALSE; 5377 } 5378 5379 return DRFLAC_TRUE; 5380 } 5381 5382 static drflac_bool32 drflac__seek_subframe(drflac_bs* bs, drflac_frame* frame, int subframeIndex) 5383 { 5384 drflac_subframe* pSubframe; 5385 drflac_uint32 subframeBitsPerSample; 5386 5387 DRFLAC_ASSERT(bs != NULL); 5388 DRFLAC_ASSERT(frame != NULL); 5389 5390 pSubframe = frame->subframes + subframeIndex; 5391 if (!drflac__read_subframe_header(bs, pSubframe)) { 5392 return DRFLAC_FALSE; 5393 } 5394 5395 /* Side channels require an extra bit per sample. Took a while to figure that one out... */ 5396 subframeBitsPerSample = frame->header.bitsPerSample; 5397 if ((frame->header.channelAssignment == DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE || frame->header.channelAssignment == DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE) && subframeIndex == 1) { 5398 subframeBitsPerSample += 1; 5399 } else if (frame->header.channelAssignment == DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE && subframeIndex == 0) { 5400 subframeBitsPerSample += 1; 5401 } 5402 5403 /* Need to handle wasted bits per sample. */ 5404 if (pSubframe->wastedBitsPerSample >= subframeBitsPerSample) { 5405 return DRFLAC_FALSE; 5406 } 5407 subframeBitsPerSample -= pSubframe->wastedBitsPerSample; 5408 5409 pSubframe->pSamplesS32 = NULL; 5410 5411 switch (pSubframe->subframeType) 5412 { 5413 case DRFLAC_SUBFRAME_CONSTANT: 5414 { 5415 if (!drflac__seek_bits(bs, subframeBitsPerSample)) { 5416 return DRFLAC_FALSE; 5417 } 5418 } break; 5419 5420 case DRFLAC_SUBFRAME_VERBATIM: 5421 { 5422 unsigned int bitsToSeek = frame->header.blockSizeInPCMFrames * subframeBitsPerSample; 5423 if (!drflac__seek_bits(bs, bitsToSeek)) { 5424 return DRFLAC_FALSE; 5425 } 5426 } break; 5427 5428 case DRFLAC_SUBFRAME_FIXED: 5429 { 5430 unsigned int bitsToSeek = pSubframe->lpcOrder * subframeBitsPerSample; 5431 if (!drflac__seek_bits(bs, bitsToSeek)) { 5432 return DRFLAC_FALSE; 5433 } 5434 5435 if (!drflac__read_and_seek_residual(bs, frame->header.blockSizeInPCMFrames, pSubframe->lpcOrder)) { 5436 return DRFLAC_FALSE; 5437 } 5438 } break; 5439 5440 case DRFLAC_SUBFRAME_LPC: 5441 { 5442 drflac_uint8 lpcPrecision; 5443 5444 unsigned int bitsToSeek = pSubframe->lpcOrder * subframeBitsPerSample; 5445 if (!drflac__seek_bits(bs, bitsToSeek)) { 5446 return DRFLAC_FALSE; 5447 } 5448 5449 if (!drflac__read_uint8(bs, 4, &lpcPrecision)) { 5450 return DRFLAC_FALSE; 5451 } 5452 if (lpcPrecision == 15) { 5453 return DRFLAC_FALSE; /* Invalid. */ 5454 } 5455 lpcPrecision += 1; 5456 5457 5458 bitsToSeek = (pSubframe->lpcOrder * lpcPrecision) + 5; /* +5 for shift. */ 5459 if (!drflac__seek_bits(bs, bitsToSeek)) { 5460 return DRFLAC_FALSE; 5461 } 5462 5463 if (!drflac__read_and_seek_residual(bs, frame->header.blockSizeInPCMFrames, pSubframe->lpcOrder)) { 5464 return DRFLAC_FALSE; 5465 } 5466 } break; 5467 5468 default: return DRFLAC_FALSE; 5469 } 5470 5471 return DRFLAC_TRUE; 5472 } 5473 5474 5475 static DRFLAC_INLINE drflac_uint8 drflac__get_channel_count_from_channel_assignment(drflac_int8 channelAssignment) 5476 { 5477 drflac_uint8 lookup[] = {1, 2, 3, 4, 5, 6, 7, 8, 2, 2, 2}; 5478 5479 DRFLAC_ASSERT(channelAssignment <= 10); 5480 return lookup[channelAssignment]; 5481 } 5482 5483 static drflac_result drflac__decode_flac_frame(drflac* pFlac) 5484 { 5485 int channelCount; 5486 int i; 5487 drflac_uint8 paddingSizeInBits; 5488 drflac_uint16 desiredCRC16; 5489 #ifndef DR_FLAC_NO_CRC 5490 drflac_uint16 actualCRC16; 5491 #endif 5492 5493 /* This function should be called while the stream is sitting on the first byte after the frame header. */ 5494 DRFLAC_ZERO_MEMORY(pFlac->currentFLACFrame.subframes, sizeof(pFlac->currentFLACFrame.subframes)); 5495 5496 /* The frame block size must never be larger than the maximum block size defined by the FLAC stream. */ 5497 if (pFlac->currentFLACFrame.header.blockSizeInPCMFrames > pFlac->maxBlockSizeInPCMFrames) { 5498 return DRFLAC_ERROR; 5499 } 5500 5501 /* The number of channels in the frame must match the channel count from the STREAMINFO block. */ 5502 channelCount = drflac__get_channel_count_from_channel_assignment(pFlac->currentFLACFrame.header.channelAssignment); 5503 if (channelCount != (int)pFlac->channels) { 5504 return DRFLAC_ERROR; 5505 } 5506 5507 for (i = 0; i < channelCount; ++i) { 5508 if (!drflac__decode_subframe(&pFlac->bs, &pFlac->currentFLACFrame, i, pFlac->pDecodedSamples + (pFlac->currentFLACFrame.header.blockSizeInPCMFrames * i))) { 5509 return DRFLAC_ERROR; 5510 } 5511 } 5512 5513 paddingSizeInBits = (drflac_uint8)(DRFLAC_CACHE_L1_BITS_REMAINING(&pFlac->bs) & 7); 5514 if (paddingSizeInBits > 0) { 5515 drflac_uint8 padding = 0; 5516 if (!drflac__read_uint8(&pFlac->bs, paddingSizeInBits, &padding)) { 5517 return DRFLAC_AT_END; 5518 } 5519 } 5520 5521 #ifndef DR_FLAC_NO_CRC 5522 actualCRC16 = drflac__flush_crc16(&pFlac->bs); 5523 #endif 5524 if (!drflac__read_uint16(&pFlac->bs, 16, &desiredCRC16)) { 5525 return DRFLAC_AT_END; 5526 } 5527 5528 #ifndef DR_FLAC_NO_CRC 5529 if (actualCRC16 != desiredCRC16) { 5530 return DRFLAC_CRC_MISMATCH; /* CRC mismatch. */ 5531 } 5532 #endif 5533 5534 pFlac->currentFLACFrame.pcmFramesRemaining = pFlac->currentFLACFrame.header.blockSizeInPCMFrames; 5535 5536 return DRFLAC_SUCCESS; 5537 } 5538 5539 static drflac_result drflac__seek_flac_frame(drflac* pFlac) 5540 { 5541 int channelCount; 5542 int i; 5543 drflac_uint16 desiredCRC16; 5544 #ifndef DR_FLAC_NO_CRC 5545 drflac_uint16 actualCRC16; 5546 #endif 5547 5548 channelCount = drflac__get_channel_count_from_channel_assignment(pFlac->currentFLACFrame.header.channelAssignment); 5549 for (i = 0; i < channelCount; ++i) { 5550 if (!drflac__seek_subframe(&pFlac->bs, &pFlac->currentFLACFrame, i)) { 5551 return DRFLAC_ERROR; 5552 } 5553 } 5554 5555 /* Padding. */ 5556 if (!drflac__seek_bits(&pFlac->bs, DRFLAC_CACHE_L1_BITS_REMAINING(&pFlac->bs) & 7)) { 5557 return DRFLAC_ERROR; 5558 } 5559 5560 /* CRC. */ 5561 #ifndef DR_FLAC_NO_CRC 5562 actualCRC16 = drflac__flush_crc16(&pFlac->bs); 5563 #endif 5564 if (!drflac__read_uint16(&pFlac->bs, 16, &desiredCRC16)) { 5565 return DRFLAC_AT_END; 5566 } 5567 5568 #ifndef DR_FLAC_NO_CRC 5569 if (actualCRC16 != desiredCRC16) { 5570 return DRFLAC_CRC_MISMATCH; /* CRC mismatch. */ 5571 } 5572 #endif 5573 5574 return DRFLAC_SUCCESS; 5575 } 5576 5577 static drflac_bool32 drflac__read_and_decode_next_flac_frame(drflac* pFlac) 5578 { 5579 DRFLAC_ASSERT(pFlac != NULL); 5580 5581 for (;;) { 5582 drflac_result result; 5583 5584 if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) { 5585 return DRFLAC_FALSE; 5586 } 5587 5588 result = drflac__decode_flac_frame(pFlac); 5589 if (result != DRFLAC_SUCCESS) { 5590 if (result == DRFLAC_CRC_MISMATCH) { 5591 continue; /* CRC mismatch. Skip to the next frame. */ 5592 } else { 5593 return DRFLAC_FALSE; 5594 } 5595 } 5596 5597 return DRFLAC_TRUE; 5598 } 5599 } 5600 5601 static void drflac__get_pcm_frame_range_of_current_flac_frame(drflac* pFlac, drflac_uint64* pFirstPCMFrame, drflac_uint64* pLastPCMFrame) 5602 { 5603 drflac_uint64 firstPCMFrame; 5604 drflac_uint64 lastPCMFrame; 5605 5606 DRFLAC_ASSERT(pFlac != NULL); 5607 5608 firstPCMFrame = pFlac->currentFLACFrame.header.pcmFrameNumber; 5609 if (firstPCMFrame == 0) { 5610 firstPCMFrame = ((drflac_uint64)pFlac->currentFLACFrame.header.flacFrameNumber) * pFlac->maxBlockSizeInPCMFrames; 5611 } 5612 5613 lastPCMFrame = firstPCMFrame + pFlac->currentFLACFrame.header.blockSizeInPCMFrames; 5614 if (lastPCMFrame > 0) { 5615 lastPCMFrame -= 1; /* Needs to be zero based. */ 5616 } 5617 5618 if (pFirstPCMFrame) { 5619 *pFirstPCMFrame = firstPCMFrame; 5620 } 5621 if (pLastPCMFrame) { 5622 *pLastPCMFrame = lastPCMFrame; 5623 } 5624 } 5625 5626 static drflac_bool32 drflac__seek_to_first_frame(drflac* pFlac) 5627 { 5628 drflac_bool32 result; 5629 5630 DRFLAC_ASSERT(pFlac != NULL); 5631 5632 result = drflac__seek_to_byte(&pFlac->bs, pFlac->firstFLACFramePosInBytes); 5633 5634 DRFLAC_ZERO_MEMORY(&pFlac->currentFLACFrame, sizeof(pFlac->currentFLACFrame)); 5635 pFlac->currentPCMFrame = 0; 5636 5637 return result; 5638 } 5639 5640 static DRFLAC_INLINE drflac_result drflac__seek_to_next_flac_frame(drflac* pFlac) 5641 { 5642 /* This function should only ever be called while the decoder is sitting on the first byte past the FRAME_HEADER section. */ 5643 DRFLAC_ASSERT(pFlac != NULL); 5644 return drflac__seek_flac_frame(pFlac); 5645 } 5646 5647 5648 static drflac_uint64 drflac__seek_forward_by_pcm_frames(drflac* pFlac, drflac_uint64 pcmFramesToSeek) 5649 { 5650 drflac_uint64 pcmFramesRead = 0; 5651 while (pcmFramesToSeek > 0) { 5652 if (pFlac->currentFLACFrame.pcmFramesRemaining == 0) { 5653 if (!drflac__read_and_decode_next_flac_frame(pFlac)) { 5654 break; /* Couldn't read the next frame, so just break from the loop and return. */ 5655 } 5656 } else { 5657 if (pFlac->currentFLACFrame.pcmFramesRemaining > pcmFramesToSeek) { 5658 pcmFramesRead += pcmFramesToSeek; 5659 pFlac->currentFLACFrame.pcmFramesRemaining -= (drflac_uint32)pcmFramesToSeek; /* <-- Safe cast. Will always be < currentFrame.pcmFramesRemaining < 65536. */ 5660 pcmFramesToSeek = 0; 5661 } else { 5662 pcmFramesRead += pFlac->currentFLACFrame.pcmFramesRemaining; 5663 pcmFramesToSeek -= pFlac->currentFLACFrame.pcmFramesRemaining; 5664 pFlac->currentFLACFrame.pcmFramesRemaining = 0; 5665 } 5666 } 5667 } 5668 5669 pFlac->currentPCMFrame += pcmFramesRead; 5670 return pcmFramesRead; 5671 } 5672 5673 5674 static drflac_bool32 drflac__seek_to_pcm_frame__brute_force(drflac* pFlac, drflac_uint64 pcmFrameIndex) 5675 { 5676 drflac_bool32 isMidFrame = DRFLAC_FALSE; 5677 drflac_uint64 runningPCMFrameCount; 5678 5679 DRFLAC_ASSERT(pFlac != NULL); 5680 5681 /* If we are seeking forward we start from the current position. Otherwise we need to start all the way from the start of the file. */ 5682 if (pcmFrameIndex >= pFlac->currentPCMFrame) { 5683 /* Seeking forward. Need to seek from the current position. */ 5684 runningPCMFrameCount = pFlac->currentPCMFrame; 5685 5686 /* The frame header for the first frame may not yet have been read. We need to do that if necessary. */ 5687 if (pFlac->currentPCMFrame == 0 && pFlac->currentFLACFrame.pcmFramesRemaining == 0) { 5688 if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) { 5689 return DRFLAC_FALSE; 5690 } 5691 } else { 5692 isMidFrame = DRFLAC_TRUE; 5693 } 5694 } else { 5695 /* Seeking backwards. Need to seek from the start of the file. */ 5696 runningPCMFrameCount = 0; 5697 5698 /* Move back to the start. */ 5699 if (!drflac__seek_to_first_frame(pFlac)) { 5700 return DRFLAC_FALSE; 5701 } 5702 5703 /* Decode the first frame in preparation for sample-exact seeking below. */ 5704 if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) { 5705 return DRFLAC_FALSE; 5706 } 5707 } 5708 5709 /* 5710 We need to as quickly as possible find the frame that contains the target sample. To do this, we iterate over each frame and inspect its 5711 header. If based on the header we can determine that the frame contains the sample, we do a full decode of that frame. 5712 */ 5713 for (;;) { 5714 drflac_uint64 pcmFrameCountInThisFLACFrame; 5715 drflac_uint64 firstPCMFrameInFLACFrame = 0; 5716 drflac_uint64 lastPCMFrameInFLACFrame = 0; 5717 5718 drflac__get_pcm_frame_range_of_current_flac_frame(pFlac, &firstPCMFrameInFLACFrame, &lastPCMFrameInFLACFrame); 5719 5720 pcmFrameCountInThisFLACFrame = (lastPCMFrameInFLACFrame - firstPCMFrameInFLACFrame) + 1; 5721 if (pcmFrameIndex < (runningPCMFrameCount + pcmFrameCountInThisFLACFrame)) { 5722 /* 5723 The sample should be in this frame. We need to fully decode it, however if it's an invalid frame (a CRC mismatch), we need to pretend 5724 it never existed and keep iterating. 5725 */ 5726 drflac_uint64 pcmFramesToDecode = pcmFrameIndex - runningPCMFrameCount; 5727 5728 if (!isMidFrame) { 5729 drflac_result result = drflac__decode_flac_frame(pFlac); 5730 if (result == DRFLAC_SUCCESS) { 5731 /* The frame is valid. We just need to skip over some samples to ensure it's sample-exact. */ 5732 return drflac__seek_forward_by_pcm_frames(pFlac, pcmFramesToDecode) == pcmFramesToDecode; /* <-- If this fails, something bad has happened (it should never fail). */ 5733 } else { 5734 if (result == DRFLAC_CRC_MISMATCH) { 5735 goto next_iteration; /* CRC mismatch. Pretend this frame never existed. */ 5736 } else { 5737 return DRFLAC_FALSE; 5738 } 5739 } 5740 } else { 5741 /* We started seeking mid-frame which means we need to skip the frame decoding part. */ 5742 return drflac__seek_forward_by_pcm_frames(pFlac, pcmFramesToDecode) == pcmFramesToDecode; 5743 } 5744 } else { 5745 /* 5746 It's not in this frame. We need to seek past the frame, but check if there was a CRC mismatch. If so, we pretend this 5747 frame never existed and leave the running sample count untouched. 5748 */ 5749 if (!isMidFrame) { 5750 drflac_result result = drflac__seek_to_next_flac_frame(pFlac); 5751 if (result == DRFLAC_SUCCESS) { 5752 runningPCMFrameCount += pcmFrameCountInThisFLACFrame; 5753 } else { 5754 if (result == DRFLAC_CRC_MISMATCH) { 5755 goto next_iteration; /* CRC mismatch. Pretend this frame never existed. */ 5756 } else { 5757 return DRFLAC_FALSE; 5758 } 5759 } 5760 } else { 5761 /* 5762 We started seeking mid-frame which means we need to seek by reading to the end of the frame instead of with 5763 drflac__seek_to_next_flac_frame() which only works if the decoder is sitting on the byte just after the frame header. 5764 */ 5765 runningPCMFrameCount += pFlac->currentFLACFrame.pcmFramesRemaining; 5766 pFlac->currentFLACFrame.pcmFramesRemaining = 0; 5767 isMidFrame = DRFLAC_FALSE; 5768 } 5769 5770 /* If we are seeking to the end of the file and we've just hit it, we're done. */ 5771 if (pcmFrameIndex == pFlac->totalPCMFrameCount && runningPCMFrameCount == pFlac->totalPCMFrameCount) { 5772 return DRFLAC_TRUE; 5773 } 5774 } 5775 5776 next_iteration: 5777 /* Grab the next frame in preparation for the next iteration. */ 5778 if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) { 5779 return DRFLAC_FALSE; 5780 } 5781 } 5782 } 5783 5784 5785 #if !defined(DR_FLAC_NO_CRC) 5786 /* 5787 We use an average compression ratio to determine our approximate start location. FLAC files are generally about 50%-70% the size of their 5788 uncompressed counterparts so we'll use this as a basis. I'm going to split the middle and use a factor of 0.6 to determine the starting 5789 location. 5790 */ 5791 #define DRFLAC_BINARY_SEARCH_APPROX_COMPRESSION_RATIO 0.6f 5792 5793 static drflac_bool32 drflac__seek_to_approximate_flac_frame_to_byte(drflac* pFlac, drflac_uint64 targetByte, drflac_uint64 rangeLo, drflac_uint64 rangeHi, drflac_uint64* pLastSuccessfulSeekOffset) 5794 { 5795 DRFLAC_ASSERT(pFlac != NULL); 5796 DRFLAC_ASSERT(pLastSuccessfulSeekOffset != NULL); 5797 DRFLAC_ASSERT(targetByte >= rangeLo); 5798 DRFLAC_ASSERT(targetByte <= rangeHi); 5799 5800 *pLastSuccessfulSeekOffset = pFlac->firstFLACFramePosInBytes; 5801 5802 for (;;) { 5803 /* After rangeLo == rangeHi == targetByte fails, we need to break out. */ 5804 drflac_uint64 lastTargetByte = targetByte; 5805 5806 /* When seeking to a byte, failure probably means we've attempted to seek beyond the end of the stream. To counter this we just halve it each attempt. */ 5807 if (!drflac__seek_to_byte(&pFlac->bs, targetByte)) { 5808 /* If we couldn't even seek to the first byte in the stream we have a problem. Just abandon the whole thing. */ 5809 if (targetByte == 0) { 5810 drflac__seek_to_first_frame(pFlac); /* Try to recover. */ 5811 return DRFLAC_FALSE; 5812 } 5813 5814 /* Halve the byte location and continue. */ 5815 targetByte = rangeLo + ((rangeHi - rangeLo)/2); 5816 rangeHi = targetByte; 5817 } else { 5818 /* Getting here should mean that we have seeked to an appropriate byte. */ 5819 5820 /* Clear the details of the FLAC frame so we don't misreport data. */ 5821 DRFLAC_ZERO_MEMORY(&pFlac->currentFLACFrame, sizeof(pFlac->currentFLACFrame)); 5822 5823 /* 5824 Now seek to the next FLAC frame. We need to decode the entire frame (not just the header) because it's possible for the header to incorrectly pass the 5825 CRC check and return bad data. We need to decode the entire frame to be more certain. Although this seems unlikely, this has happened to me in testing 5826 so it needs to stay this way for now. 5827 */ 5828 #if 1 5829 if (!drflac__read_and_decode_next_flac_frame(pFlac)) { 5830 /* Halve the byte location and continue. */ 5831 targetByte = rangeLo + ((rangeHi - rangeLo)/2); 5832 rangeHi = targetByte; 5833 } else { 5834 break; 5835 } 5836 #else 5837 if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) { 5838 /* Halve the byte location and continue. */ 5839 targetByte = rangeLo + ((rangeHi - rangeLo)/2); 5840 rangeHi = targetByte; 5841 } else { 5842 break; 5843 } 5844 #endif 5845 } 5846 5847 /* We already tried this byte and there are no more to try, break out. */ 5848 if(targetByte == lastTargetByte) { 5849 return DRFLAC_FALSE; 5850 } 5851 } 5852 5853 /* The current PCM frame needs to be updated based on the frame we just seeked to. */ 5854 drflac__get_pcm_frame_range_of_current_flac_frame(pFlac, &pFlac->currentPCMFrame, NULL); 5855 5856 DRFLAC_ASSERT(targetByte <= rangeHi); 5857 5858 *pLastSuccessfulSeekOffset = targetByte; 5859 return DRFLAC_TRUE; 5860 } 5861 5862 static drflac_bool32 drflac__decode_flac_frame_and_seek_forward_by_pcm_frames(drflac* pFlac, drflac_uint64 offset) 5863 { 5864 /* This section of code would be used if we were only decoding the FLAC frame header when calling drflac__seek_to_approximate_flac_frame_to_byte(). */ 5865 #if 0 5866 if (drflac__decode_flac_frame(pFlac) != DRFLAC_SUCCESS) { 5867 /* We failed to decode this frame which may be due to it being corrupt. We'll just use the next valid FLAC frame. */ 5868 if (drflac__read_and_decode_next_flac_frame(pFlac) == DRFLAC_FALSE) { 5869 return DRFLAC_FALSE; 5870 } 5871 } 5872 #endif 5873 5874 return drflac__seek_forward_by_pcm_frames(pFlac, offset) == offset; 5875 } 5876 5877 5878 static drflac_bool32 drflac__seek_to_pcm_frame__binary_search_internal(drflac* pFlac, drflac_uint64 pcmFrameIndex, drflac_uint64 byteRangeLo, drflac_uint64 byteRangeHi) 5879 { 5880 /* This assumes pFlac->currentPCMFrame is sitting on byteRangeLo upon entry. */ 5881 5882 drflac_uint64 targetByte; 5883 drflac_uint64 pcmRangeLo = pFlac->totalPCMFrameCount; 5884 drflac_uint64 pcmRangeHi = 0; 5885 drflac_uint64 lastSuccessfulSeekOffset = (drflac_uint64)-1; 5886 drflac_uint64 closestSeekOffsetBeforeTargetPCMFrame = byteRangeLo; 5887 drflac_uint32 seekForwardThreshold = (pFlac->maxBlockSizeInPCMFrames != 0) ? pFlac->maxBlockSizeInPCMFrames*2 : 4096; 5888 5889 targetByte = byteRangeLo + (drflac_uint64)(((drflac_int64)((pcmFrameIndex - pFlac->currentPCMFrame) * pFlac->channels * pFlac->bitsPerSample)/8.0f) * DRFLAC_BINARY_SEARCH_APPROX_COMPRESSION_RATIO); 5890 if (targetByte > byteRangeHi) { 5891 targetByte = byteRangeHi; 5892 } 5893 5894 for (;;) { 5895 if (drflac__seek_to_approximate_flac_frame_to_byte(pFlac, targetByte, byteRangeLo, byteRangeHi, &lastSuccessfulSeekOffset)) { 5896 /* We found a FLAC frame. We need to check if it contains the sample we're looking for. */ 5897 drflac_uint64 newPCMRangeLo; 5898 drflac_uint64 newPCMRangeHi; 5899 drflac__get_pcm_frame_range_of_current_flac_frame(pFlac, &newPCMRangeLo, &newPCMRangeHi); 5900 5901 /* If we selected the same frame, it means we should be pretty close. Just decode the rest. */ 5902 if (pcmRangeLo == newPCMRangeLo) { 5903 if (!drflac__seek_to_approximate_flac_frame_to_byte(pFlac, closestSeekOffsetBeforeTargetPCMFrame, closestSeekOffsetBeforeTargetPCMFrame, byteRangeHi, &lastSuccessfulSeekOffset)) { 5904 break; /* Failed to seek to closest frame. */ 5905 } 5906 5907 if (drflac__decode_flac_frame_and_seek_forward_by_pcm_frames(pFlac, pcmFrameIndex - pFlac->currentPCMFrame)) { 5908 return DRFLAC_TRUE; 5909 } else { 5910 break; /* Failed to seek forward. */ 5911 } 5912 } 5913 5914 pcmRangeLo = newPCMRangeLo; 5915 pcmRangeHi = newPCMRangeHi; 5916 5917 if (pcmRangeLo <= pcmFrameIndex && pcmRangeHi >= pcmFrameIndex) { 5918 /* The target PCM frame is in this FLAC frame. */ 5919 if (drflac__decode_flac_frame_and_seek_forward_by_pcm_frames(pFlac, pcmFrameIndex - pFlac->currentPCMFrame) ) { 5920 return DRFLAC_TRUE; 5921 } else { 5922 break; /* Failed to seek to FLAC frame. */ 5923 } 5924 } else { 5925 const float approxCompressionRatio = (drflac_int64)(lastSuccessfulSeekOffset - pFlac->firstFLACFramePosInBytes) / ((drflac_int64)(pcmRangeLo * pFlac->channels * pFlac->bitsPerSample)/8.0f); 5926 5927 if (pcmRangeLo > pcmFrameIndex) { 5928 /* We seeked too far forward. We need to move our target byte backward and try again. */ 5929 byteRangeHi = lastSuccessfulSeekOffset; 5930 if (byteRangeLo > byteRangeHi) { 5931 byteRangeLo = byteRangeHi; 5932 } 5933 5934 targetByte = byteRangeLo + ((byteRangeHi - byteRangeLo) / 2); 5935 if (targetByte < byteRangeLo) { 5936 targetByte = byteRangeLo; 5937 } 5938 } else /*if (pcmRangeHi < pcmFrameIndex)*/ { 5939 /* We didn't seek far enough. We need to move our target byte forward and try again. */ 5940 5941 /* If we're close enough we can just seek forward. */ 5942 if ((pcmFrameIndex - pcmRangeLo) < seekForwardThreshold) { 5943 if (drflac__decode_flac_frame_and_seek_forward_by_pcm_frames(pFlac, pcmFrameIndex - pFlac->currentPCMFrame)) { 5944 return DRFLAC_TRUE; 5945 } else { 5946 break; /* Failed to seek to FLAC frame. */ 5947 } 5948 } else { 5949 byteRangeLo = lastSuccessfulSeekOffset; 5950 if (byteRangeHi < byteRangeLo) { 5951 byteRangeHi = byteRangeLo; 5952 } 5953 5954 targetByte = lastSuccessfulSeekOffset + (drflac_uint64)(((drflac_int64)((pcmFrameIndex-pcmRangeLo) * pFlac->channels * pFlac->bitsPerSample)/8.0f) * approxCompressionRatio); 5955 if (targetByte > byteRangeHi) { 5956 targetByte = byteRangeHi; 5957 } 5958 5959 if (closestSeekOffsetBeforeTargetPCMFrame < lastSuccessfulSeekOffset) { 5960 closestSeekOffsetBeforeTargetPCMFrame = lastSuccessfulSeekOffset; 5961 } 5962 } 5963 } 5964 } 5965 } else { 5966 /* Getting here is really bad. We just recover as best we can, but moving to the first frame in the stream, and then abort. */ 5967 break; 5968 } 5969 } 5970 5971 drflac__seek_to_first_frame(pFlac); /* <-- Try to recover. */ 5972 return DRFLAC_FALSE; 5973 } 5974 5975 static drflac_bool32 drflac__seek_to_pcm_frame__binary_search(drflac* pFlac, drflac_uint64 pcmFrameIndex) 5976 { 5977 drflac_uint64 byteRangeLo; 5978 drflac_uint64 byteRangeHi; 5979 drflac_uint32 seekForwardThreshold = (pFlac->maxBlockSizeInPCMFrames != 0) ? pFlac->maxBlockSizeInPCMFrames*2 : 4096; 5980 5981 /* Our algorithm currently assumes the FLAC stream is currently sitting at the start. */ 5982 if (drflac__seek_to_first_frame(pFlac) == DRFLAC_FALSE) { 5983 return DRFLAC_FALSE; 5984 } 5985 5986 /* If we're close enough to the start, just move to the start and seek forward. */ 5987 if (pcmFrameIndex < seekForwardThreshold) { 5988 return drflac__seek_forward_by_pcm_frames(pFlac, pcmFrameIndex) == pcmFrameIndex; 5989 } 5990 5991 /* 5992 Our starting byte range is the byte position of the first FLAC frame and the approximate end of the file as if it were completely uncompressed. This ensures 5993 the entire file is included, even though most of the time it'll exceed the end of the actual stream. This is OK as the frame searching logic will handle it. 5994 */ 5995 byteRangeLo = pFlac->firstFLACFramePosInBytes; 5996 byteRangeHi = pFlac->firstFLACFramePosInBytes + (drflac_uint64)((drflac_int64)(pFlac->totalPCMFrameCount * pFlac->channels * pFlac->bitsPerSample)/8.0f); 5997 5998 return drflac__seek_to_pcm_frame__binary_search_internal(pFlac, pcmFrameIndex, byteRangeLo, byteRangeHi); 5999 } 6000 #endif /* !DR_FLAC_NO_CRC */ 6001 6002 static drflac_bool32 drflac__seek_to_pcm_frame__seek_table(drflac* pFlac, drflac_uint64 pcmFrameIndex) 6003 { 6004 drflac_uint32 iClosestSeekpoint = 0; 6005 drflac_bool32 isMidFrame = DRFLAC_FALSE; 6006 drflac_uint64 runningPCMFrameCount; 6007 drflac_uint32 iSeekpoint; 6008 6009 6010 DRFLAC_ASSERT(pFlac != NULL); 6011 6012 if (pFlac->pSeekpoints == NULL || pFlac->seekpointCount == 0) { 6013 return DRFLAC_FALSE; 6014 } 6015 6016 for (iSeekpoint = 0; iSeekpoint < pFlac->seekpointCount; ++iSeekpoint) { 6017 if (pFlac->pSeekpoints[iSeekpoint].firstPCMFrame >= pcmFrameIndex) { 6018 break; 6019 } 6020 6021 iClosestSeekpoint = iSeekpoint; 6022 } 6023 6024 /* There's been cases where the seek table contains only zeros. We need to do some basic validation on the closest seekpoint. */ 6025 if (pFlac->pSeekpoints[iClosestSeekpoint].pcmFrameCount == 0 || pFlac->pSeekpoints[iClosestSeekpoint].pcmFrameCount > pFlac->maxBlockSizeInPCMFrames) { 6026 return DRFLAC_FALSE; 6027 } 6028 if (pFlac->pSeekpoints[iClosestSeekpoint].firstPCMFrame > pFlac->totalPCMFrameCount && pFlac->totalPCMFrameCount > 0) { 6029 return DRFLAC_FALSE; 6030 } 6031 6032 #if !defined(DR_FLAC_NO_CRC) 6033 /* At this point we should know the closest seek point. We can use a binary search for this. We need to know the total sample count for this. */ 6034 if (pFlac->totalPCMFrameCount > 0) { 6035 drflac_uint64 byteRangeLo; 6036 drflac_uint64 byteRangeHi; 6037 6038 byteRangeHi = pFlac->firstFLACFramePosInBytes + (drflac_uint64)((drflac_int64)(pFlac->totalPCMFrameCount * pFlac->channels * pFlac->bitsPerSample)/8.0f); 6039 byteRangeLo = pFlac->firstFLACFramePosInBytes + pFlac->pSeekpoints[iClosestSeekpoint].flacFrameOffset; 6040 6041 /* 6042 If our closest seek point is not the last one, we only need to search between it and the next one. The section below calculates an appropriate starting 6043 value for byteRangeHi which will clamp it appropriately. 6044 6045 Note that the next seekpoint must have an offset greater than the closest seekpoint because otherwise our binary search algorithm will break down. There 6046 have been cases where a seektable consists of seek points where every byte offset is set to 0 which causes problems. If this happens we need to abort. 6047 */ 6048 if (iClosestSeekpoint < pFlac->seekpointCount-1) { 6049 drflac_uint32 iNextSeekpoint = iClosestSeekpoint + 1; 6050 6051 /* Basic validation on the seekpoints to ensure they're usable. */ 6052 if (pFlac->pSeekpoints[iClosestSeekpoint].flacFrameOffset >= pFlac->pSeekpoints[iNextSeekpoint].flacFrameOffset || pFlac->pSeekpoints[iNextSeekpoint].pcmFrameCount == 0) { 6053 return DRFLAC_FALSE; /* The next seekpoint doesn't look right. The seek table cannot be trusted from here. Abort. */ 6054 } 6055 6056 if (pFlac->pSeekpoints[iNextSeekpoint].firstPCMFrame != (((drflac_uint64)0xFFFFFFFF << 32) | 0xFFFFFFFF)) { /* Make sure it's not a placeholder seekpoint. */ 6057 byteRangeHi = pFlac->firstFLACFramePosInBytes + pFlac->pSeekpoints[iNextSeekpoint].flacFrameOffset - 1; /* byteRangeHi must be zero based. */ 6058 } 6059 } 6060 6061 if (drflac__seek_to_byte(&pFlac->bs, pFlac->firstFLACFramePosInBytes + pFlac->pSeekpoints[iClosestSeekpoint].flacFrameOffset)) { 6062 if (drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) { 6063 drflac__get_pcm_frame_range_of_current_flac_frame(pFlac, &pFlac->currentPCMFrame, NULL); 6064 6065 if (drflac__seek_to_pcm_frame__binary_search_internal(pFlac, pcmFrameIndex, byteRangeLo, byteRangeHi)) { 6066 return DRFLAC_TRUE; 6067 } 6068 } 6069 } 6070 } 6071 #endif /* !DR_FLAC_NO_CRC */ 6072 6073 /* Getting here means we need to use a slower algorithm because the binary search method failed or cannot be used. */ 6074 6075 /* 6076 If we are seeking forward and the closest seekpoint is _before_ the current sample, we just seek forward from where we are. Otherwise we start seeking 6077 from the seekpoint's first sample. 6078 */ 6079 if (pcmFrameIndex >= pFlac->currentPCMFrame && pFlac->pSeekpoints[iClosestSeekpoint].firstPCMFrame <= pFlac->currentPCMFrame) { 6080 /* Optimized case. Just seek forward from where we are. */ 6081 runningPCMFrameCount = pFlac->currentPCMFrame; 6082 6083 /* The frame header for the first frame may not yet have been read. We need to do that if necessary. */ 6084 if (pFlac->currentPCMFrame == 0 && pFlac->currentFLACFrame.pcmFramesRemaining == 0) { 6085 if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) { 6086 return DRFLAC_FALSE; 6087 } 6088 } else { 6089 isMidFrame = DRFLAC_TRUE; 6090 } 6091 } else { 6092 /* Slower case. Seek to the start of the seekpoint and then seek forward from there. */ 6093 runningPCMFrameCount = pFlac->pSeekpoints[iClosestSeekpoint].firstPCMFrame; 6094 6095 if (!drflac__seek_to_byte(&pFlac->bs, pFlac->firstFLACFramePosInBytes + pFlac->pSeekpoints[iClosestSeekpoint].flacFrameOffset)) { 6096 return DRFLAC_FALSE; 6097 } 6098 6099 /* Grab the frame the seekpoint is sitting on in preparation for the sample-exact seeking below. */ 6100 if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) { 6101 return DRFLAC_FALSE; 6102 } 6103 } 6104 6105 for (;;) { 6106 drflac_uint64 pcmFrameCountInThisFLACFrame; 6107 drflac_uint64 firstPCMFrameInFLACFrame = 0; 6108 drflac_uint64 lastPCMFrameInFLACFrame = 0; 6109 6110 drflac__get_pcm_frame_range_of_current_flac_frame(pFlac, &firstPCMFrameInFLACFrame, &lastPCMFrameInFLACFrame); 6111 6112 pcmFrameCountInThisFLACFrame = (lastPCMFrameInFLACFrame - firstPCMFrameInFLACFrame) + 1; 6113 if (pcmFrameIndex < (runningPCMFrameCount + pcmFrameCountInThisFLACFrame)) { 6114 /* 6115 The sample should be in this frame. We need to fully decode it, but if it's an invalid frame (a CRC mismatch) we need to pretend 6116 it never existed and keep iterating. 6117 */ 6118 drflac_uint64 pcmFramesToDecode = pcmFrameIndex - runningPCMFrameCount; 6119 6120 if (!isMidFrame) { 6121 drflac_result result = drflac__decode_flac_frame(pFlac); 6122 if (result == DRFLAC_SUCCESS) { 6123 /* The frame is valid. We just need to skip over some samples to ensure it's sample-exact. */ 6124 return drflac__seek_forward_by_pcm_frames(pFlac, pcmFramesToDecode) == pcmFramesToDecode; /* <-- If this fails, something bad has happened (it should never fail). */ 6125 } else { 6126 if (result == DRFLAC_CRC_MISMATCH) { 6127 goto next_iteration; /* CRC mismatch. Pretend this frame never existed. */ 6128 } else { 6129 return DRFLAC_FALSE; 6130 } 6131 } 6132 } else { 6133 /* We started seeking mid-frame which means we need to skip the frame decoding part. */ 6134 return drflac__seek_forward_by_pcm_frames(pFlac, pcmFramesToDecode) == pcmFramesToDecode; 6135 } 6136 } else { 6137 /* 6138 It's not in this frame. We need to seek past the frame, but check if there was a CRC mismatch. If so, we pretend this 6139 frame never existed and leave the running sample count untouched. 6140 */ 6141 if (!isMidFrame) { 6142 drflac_result result = drflac__seek_to_next_flac_frame(pFlac); 6143 if (result == DRFLAC_SUCCESS) { 6144 runningPCMFrameCount += pcmFrameCountInThisFLACFrame; 6145 } else { 6146 if (result == DRFLAC_CRC_MISMATCH) { 6147 goto next_iteration; /* CRC mismatch. Pretend this frame never existed. */ 6148 } else { 6149 return DRFLAC_FALSE; 6150 } 6151 } 6152 } else { 6153 /* 6154 We started seeking mid-frame which means we need to seek by reading to the end of the frame instead of with 6155 drflac__seek_to_next_flac_frame() which only works if the decoder is sitting on the byte just after the frame header. 6156 */ 6157 runningPCMFrameCount += pFlac->currentFLACFrame.pcmFramesRemaining; 6158 pFlac->currentFLACFrame.pcmFramesRemaining = 0; 6159 isMidFrame = DRFLAC_FALSE; 6160 } 6161 6162 /* If we are seeking to the end of the file and we've just hit it, we're done. */ 6163 if (pcmFrameIndex == pFlac->totalPCMFrameCount && runningPCMFrameCount == pFlac->totalPCMFrameCount) { 6164 return DRFLAC_TRUE; 6165 } 6166 } 6167 6168 next_iteration: 6169 /* Grab the next frame in preparation for the next iteration. */ 6170 if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) { 6171 return DRFLAC_FALSE; 6172 } 6173 } 6174 } 6175 6176 6177 #ifndef DR_FLAC_NO_OGG 6178 typedef struct 6179 { 6180 drflac_uint8 capturePattern[4]; /* Should be "OggS" */ 6181 drflac_uint8 structureVersion; /* Always 0. */ 6182 drflac_uint8 headerType; 6183 drflac_uint64 granulePosition; 6184 drflac_uint32 serialNumber; 6185 drflac_uint32 sequenceNumber; 6186 drflac_uint32 checksum; 6187 drflac_uint8 segmentCount; 6188 drflac_uint8 segmentTable[255]; 6189 } drflac_ogg_page_header; 6190 #endif 6191 6192 typedef struct 6193 { 6194 drflac_read_proc onRead; 6195 drflac_seek_proc onSeek; 6196 drflac_meta_proc onMeta; 6197 drflac_container container; 6198 void* pUserData; 6199 void* pUserDataMD; 6200 drflac_uint32 sampleRate; 6201 drflac_uint8 channels; 6202 drflac_uint8 bitsPerSample; 6203 drflac_uint64 totalPCMFrameCount; 6204 drflac_uint16 maxBlockSizeInPCMFrames; 6205 drflac_uint64 runningFilePos; 6206 drflac_bool32 hasStreamInfoBlock; 6207 drflac_bool32 hasMetadataBlocks; 6208 drflac_bs bs; /* <-- A bit streamer is required for loading data during initialization. */ 6209 drflac_frame_header firstFrameHeader; /* <-- The header of the first frame that was read during relaxed initalization. Only set if there is no STREAMINFO block. */ 6210 6211 #ifndef DR_FLAC_NO_OGG 6212 drflac_uint32 oggSerial; 6213 drflac_uint64 oggFirstBytePos; 6214 drflac_ogg_page_header oggBosHeader; 6215 #endif 6216 } drflac_init_info; 6217 6218 static DRFLAC_INLINE void drflac__decode_block_header(drflac_uint32 blockHeader, drflac_uint8* isLastBlock, drflac_uint8* blockType, drflac_uint32* blockSize) 6219 { 6220 blockHeader = drflac__be2host_32(blockHeader); 6221 *isLastBlock = (drflac_uint8)((blockHeader & 0x80000000UL) >> 31); 6222 *blockType = (drflac_uint8)((blockHeader & 0x7F000000UL) >> 24); 6223 *blockSize = (blockHeader & 0x00FFFFFFUL); 6224 } 6225 6226 static DRFLAC_INLINE drflac_bool32 drflac__read_and_decode_block_header(drflac_read_proc onRead, void* pUserData, drflac_uint8* isLastBlock, drflac_uint8* blockType, drflac_uint32* blockSize) 6227 { 6228 drflac_uint32 blockHeader; 6229 6230 *blockSize = 0; 6231 if (onRead(pUserData, &blockHeader, 4) != 4) { 6232 return DRFLAC_FALSE; 6233 } 6234 6235 drflac__decode_block_header(blockHeader, isLastBlock, blockType, blockSize); 6236 return DRFLAC_TRUE; 6237 } 6238 6239 static drflac_bool32 drflac__read_streaminfo(drflac_read_proc onRead, void* pUserData, drflac_streaminfo* pStreamInfo) 6240 { 6241 drflac_uint32 blockSizes; 6242 drflac_uint64 frameSizes = 0; 6243 drflac_uint64 importantProps; 6244 drflac_uint8 md5[16]; 6245 6246 /* min/max block size. */ 6247 if (onRead(pUserData, &blockSizes, 4) != 4) { 6248 return DRFLAC_FALSE; 6249 } 6250 6251 /* min/max frame size. */ 6252 if (onRead(pUserData, &frameSizes, 6) != 6) { 6253 return DRFLAC_FALSE; 6254 } 6255 6256 /* Sample rate, channels, bits per sample and total sample count. */ 6257 if (onRead(pUserData, &importantProps, 8) != 8) { 6258 return DRFLAC_FALSE; 6259 } 6260 6261 /* MD5 */ 6262 if (onRead(pUserData, md5, sizeof(md5)) != sizeof(md5)) { 6263 return DRFLAC_FALSE; 6264 } 6265 6266 blockSizes = drflac__be2host_32(blockSizes); 6267 frameSizes = drflac__be2host_64(frameSizes); 6268 importantProps = drflac__be2host_64(importantProps); 6269 6270 pStreamInfo->minBlockSizeInPCMFrames = (drflac_uint16)((blockSizes & 0xFFFF0000) >> 16); 6271 pStreamInfo->maxBlockSizeInPCMFrames = (drflac_uint16) (blockSizes & 0x0000FFFF); 6272 pStreamInfo->minFrameSizeInPCMFrames = (drflac_uint32)((frameSizes & (((drflac_uint64)0x00FFFFFF << 16) << 24)) >> 40); 6273 pStreamInfo->maxFrameSizeInPCMFrames = (drflac_uint32)((frameSizes & (((drflac_uint64)0x00FFFFFF << 16) << 0)) >> 16); 6274 pStreamInfo->sampleRate = (drflac_uint32)((importantProps & (((drflac_uint64)0x000FFFFF << 16) << 28)) >> 44); 6275 pStreamInfo->channels = (drflac_uint8 )((importantProps & (((drflac_uint64)0x0000000E << 16) << 24)) >> 41) + 1; 6276 pStreamInfo->bitsPerSample = (drflac_uint8 )((importantProps & (((drflac_uint64)0x0000001F << 16) << 20)) >> 36) + 1; 6277 pStreamInfo->totalPCMFrameCount = ((importantProps & ((((drflac_uint64)0x0000000F << 16) << 16) | 0xFFFFFFFF))); 6278 DRFLAC_COPY_MEMORY(pStreamInfo->md5, md5, sizeof(md5)); 6279 6280 return DRFLAC_TRUE; 6281 } 6282 6283 6284 static void* drflac__malloc_default(size_t sz, void* pUserData) 6285 { 6286 (void)pUserData; 6287 return DRFLAC_MALLOC(sz); 6288 } 6289 6290 static void* drflac__realloc_default(void* p, size_t sz, void* pUserData) 6291 { 6292 (void)pUserData; 6293 return DRFLAC_REALLOC(p, sz); 6294 } 6295 6296 static void drflac__free_default(void* p, void* pUserData) 6297 { 6298 (void)pUserData; 6299 DRFLAC_FREE(p); 6300 } 6301 6302 6303 static void* drflac__malloc_from_callbacks(size_t sz, const drflac_allocation_callbacks* pAllocationCallbacks) 6304 { 6305 if (pAllocationCallbacks == NULL) { 6306 return NULL; 6307 } 6308 6309 if (pAllocationCallbacks->onMalloc != NULL) { 6310 return pAllocationCallbacks->onMalloc(sz, pAllocationCallbacks->pUserData); 6311 } 6312 6313 /* Try using realloc(). */ 6314 if (pAllocationCallbacks->onRealloc != NULL) { 6315 return pAllocationCallbacks->onRealloc(NULL, sz, pAllocationCallbacks->pUserData); 6316 } 6317 6318 return NULL; 6319 } 6320 6321 static void* drflac__realloc_from_callbacks(void* p, size_t szNew, size_t szOld, const drflac_allocation_callbacks* pAllocationCallbacks) 6322 { 6323 if (pAllocationCallbacks == NULL) { 6324 return NULL; 6325 } 6326 6327 if (pAllocationCallbacks->onRealloc != NULL) { 6328 return pAllocationCallbacks->onRealloc(p, szNew, pAllocationCallbacks->pUserData); 6329 } 6330 6331 /* Try emulating realloc() in terms of malloc()/free(). */ 6332 if (pAllocationCallbacks->onMalloc != NULL && pAllocationCallbacks->onFree != NULL) { 6333 void* p2; 6334 6335 p2 = pAllocationCallbacks->onMalloc(szNew, pAllocationCallbacks->pUserData); 6336 if (p2 == NULL) { 6337 return NULL; 6338 } 6339 6340 if (p != NULL) { 6341 DRFLAC_COPY_MEMORY(p2, p, szOld); 6342 pAllocationCallbacks->onFree(p, pAllocationCallbacks->pUserData); 6343 } 6344 6345 return p2; 6346 } 6347 6348 return NULL; 6349 } 6350 6351 static void drflac__free_from_callbacks(void* p, const drflac_allocation_callbacks* pAllocationCallbacks) 6352 { 6353 if (p == NULL || pAllocationCallbacks == NULL) { 6354 return; 6355 } 6356 6357 if (pAllocationCallbacks->onFree != NULL) { 6358 pAllocationCallbacks->onFree(p, pAllocationCallbacks->pUserData); 6359 } 6360 } 6361 6362 6363 static drflac_bool32 drflac__read_and_decode_metadata(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, void* pUserData, void* pUserDataMD, drflac_uint64* pFirstFramePos, drflac_uint64* pSeektablePos, drflac_uint32* pSeektableSize, drflac_allocation_callbacks* pAllocationCallbacks) 6364 { 6365 /* 6366 We want to keep track of the byte position in the stream of the seektable. At the time of calling this function we know that 6367 we'll be sitting on byte 42. 6368 */ 6369 drflac_uint64 runningFilePos = 42; 6370 drflac_uint64 seektablePos = 0; 6371 drflac_uint32 seektableSize = 0; 6372 6373 for (;;) { 6374 drflac_metadata metadata; 6375 drflac_uint8 isLastBlock = 0; 6376 drflac_uint8 blockType; 6377 drflac_uint32 blockSize; 6378 if (drflac__read_and_decode_block_header(onRead, pUserData, &isLastBlock, &blockType, &blockSize) == DRFLAC_FALSE) { 6379 return DRFLAC_FALSE; 6380 } 6381 runningFilePos += 4; 6382 6383 metadata.type = blockType; 6384 metadata.pRawData = NULL; 6385 metadata.rawDataSize = 0; 6386 6387 switch (blockType) 6388 { 6389 case DRFLAC_METADATA_BLOCK_TYPE_APPLICATION: 6390 { 6391 if (blockSize < 4) { 6392 return DRFLAC_FALSE; 6393 } 6394 6395 if (onMeta) { 6396 void* pRawData = drflac__malloc_from_callbacks(blockSize, pAllocationCallbacks); 6397 if (pRawData == NULL) { 6398 return DRFLAC_FALSE; 6399 } 6400 6401 if (onRead(pUserData, pRawData, blockSize) != blockSize) { 6402 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 6403 return DRFLAC_FALSE; 6404 } 6405 6406 metadata.pRawData = pRawData; 6407 metadata.rawDataSize = blockSize; 6408 metadata.data.application.id = drflac__be2host_32(*(drflac_uint32*)pRawData); 6409 metadata.data.application.pData = (const void*)((drflac_uint8*)pRawData + sizeof(drflac_uint32)); 6410 metadata.data.application.dataSize = blockSize - sizeof(drflac_uint32); 6411 onMeta(pUserDataMD, &metadata); 6412 6413 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 6414 } 6415 } break; 6416 6417 case DRFLAC_METADATA_BLOCK_TYPE_SEEKTABLE: 6418 { 6419 seektablePos = runningFilePos; 6420 seektableSize = blockSize; 6421 6422 if (onMeta) { 6423 drflac_uint32 iSeekpoint; 6424 void* pRawData; 6425 6426 pRawData = drflac__malloc_from_callbacks(blockSize, pAllocationCallbacks); 6427 if (pRawData == NULL) { 6428 return DRFLAC_FALSE; 6429 } 6430 6431 if (onRead(pUserData, pRawData, blockSize) != blockSize) { 6432 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 6433 return DRFLAC_FALSE; 6434 } 6435 6436 metadata.pRawData = pRawData; 6437 metadata.rawDataSize = blockSize; 6438 metadata.data.seektable.seekpointCount = blockSize/sizeof(drflac_seekpoint); 6439 metadata.data.seektable.pSeekpoints = (const drflac_seekpoint*)pRawData; 6440 6441 /* Endian swap. */ 6442 for (iSeekpoint = 0; iSeekpoint < metadata.data.seektable.seekpointCount; ++iSeekpoint) { 6443 drflac_seekpoint* pSeekpoint = (drflac_seekpoint*)pRawData + iSeekpoint; 6444 pSeekpoint->firstPCMFrame = drflac__be2host_64(pSeekpoint->firstPCMFrame); 6445 pSeekpoint->flacFrameOffset = drflac__be2host_64(pSeekpoint->flacFrameOffset); 6446 pSeekpoint->pcmFrameCount = drflac__be2host_16(pSeekpoint->pcmFrameCount); 6447 } 6448 6449 onMeta(pUserDataMD, &metadata); 6450 6451 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 6452 } 6453 } break; 6454 6455 case DRFLAC_METADATA_BLOCK_TYPE_VORBIS_COMMENT: 6456 { 6457 if (blockSize < 8) { 6458 return DRFLAC_FALSE; 6459 } 6460 6461 if (onMeta) { 6462 void* pRawData; 6463 const char* pRunningData; 6464 const char* pRunningDataEnd; 6465 drflac_uint32 i; 6466 6467 pRawData = drflac__malloc_from_callbacks(blockSize, pAllocationCallbacks); 6468 if (pRawData == NULL) { 6469 return DRFLAC_FALSE; 6470 } 6471 6472 if (onRead(pUserData, pRawData, blockSize) != blockSize) { 6473 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 6474 return DRFLAC_FALSE; 6475 } 6476 6477 metadata.pRawData = pRawData; 6478 metadata.rawDataSize = blockSize; 6479 6480 pRunningData = (const char*)pRawData; 6481 pRunningDataEnd = (const char*)pRawData + blockSize; 6482 6483 metadata.data.vorbis_comment.vendorLength = drflac__le2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 6484 6485 /* Need space for the rest of the block */ 6486 if ((pRunningDataEnd - pRunningData) - 4 < (drflac_int64)metadata.data.vorbis_comment.vendorLength) { /* <-- Note the order of operations to avoid overflow to a valid value */ 6487 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 6488 return DRFLAC_FALSE; 6489 } 6490 metadata.data.vorbis_comment.vendor = pRunningData; pRunningData += metadata.data.vorbis_comment.vendorLength; 6491 metadata.data.vorbis_comment.commentCount = drflac__le2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 6492 6493 /* Need space for 'commentCount' comments after the block, which at minimum is a drflac_uint32 per comment */ 6494 if ((pRunningDataEnd - pRunningData) / sizeof(drflac_uint32) < metadata.data.vorbis_comment.commentCount) { /* <-- Note the order of operations to avoid overflow to a valid value */ 6495 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 6496 return DRFLAC_FALSE; 6497 } 6498 metadata.data.vorbis_comment.pComments = pRunningData; 6499 6500 /* Check that the comments section is valid before passing it to the callback */ 6501 for (i = 0; i < metadata.data.vorbis_comment.commentCount; ++i) { 6502 drflac_uint32 commentLength; 6503 6504 if (pRunningDataEnd - pRunningData < 4) { 6505 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 6506 return DRFLAC_FALSE; 6507 } 6508 6509 commentLength = drflac__le2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 6510 if (pRunningDataEnd - pRunningData < (drflac_int64)commentLength) { /* <-- Note the order of operations to avoid overflow to a valid value */ 6511 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 6512 return DRFLAC_FALSE; 6513 } 6514 pRunningData += commentLength; 6515 } 6516 6517 onMeta(pUserDataMD, &metadata); 6518 6519 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 6520 } 6521 } break; 6522 6523 case DRFLAC_METADATA_BLOCK_TYPE_CUESHEET: 6524 { 6525 if (blockSize < 396) { 6526 return DRFLAC_FALSE; 6527 } 6528 6529 if (onMeta) { 6530 void* pRawData; 6531 const char* pRunningData; 6532 const char* pRunningDataEnd; 6533 drflac_uint8 iTrack; 6534 drflac_uint8 iIndex; 6535 6536 pRawData = drflac__malloc_from_callbacks(blockSize, pAllocationCallbacks); 6537 if (pRawData == NULL) { 6538 return DRFLAC_FALSE; 6539 } 6540 6541 if (onRead(pUserData, pRawData, blockSize) != blockSize) { 6542 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 6543 return DRFLAC_FALSE; 6544 } 6545 6546 metadata.pRawData = pRawData; 6547 metadata.rawDataSize = blockSize; 6548 6549 pRunningData = (const char*)pRawData; 6550 pRunningDataEnd = (const char*)pRawData + blockSize; 6551 6552 DRFLAC_COPY_MEMORY(metadata.data.cuesheet.catalog, pRunningData, 128); pRunningData += 128; 6553 metadata.data.cuesheet.leadInSampleCount = drflac__be2host_64(*(const drflac_uint64*)pRunningData); pRunningData += 8; 6554 metadata.data.cuesheet.isCD = (pRunningData[0] & 0x80) != 0; pRunningData += 259; 6555 metadata.data.cuesheet.trackCount = pRunningData[0]; pRunningData += 1; 6556 metadata.data.cuesheet.pTrackData = pRunningData; 6557 6558 /* Check that the cuesheet tracks are valid before passing it to the callback */ 6559 for (iTrack = 0; iTrack < metadata.data.cuesheet.trackCount; ++iTrack) { 6560 drflac_uint8 indexCount; 6561 drflac_uint32 indexPointSize; 6562 6563 if (pRunningDataEnd - pRunningData < 36) { 6564 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 6565 return DRFLAC_FALSE; 6566 } 6567 6568 /* Skip to the index point count */ 6569 pRunningData += 35; 6570 indexCount = pRunningData[0]; pRunningData += 1; 6571 indexPointSize = indexCount * sizeof(drflac_cuesheet_track_index); 6572 if (pRunningDataEnd - pRunningData < (drflac_int64)indexPointSize) { 6573 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 6574 return DRFLAC_FALSE; 6575 } 6576 6577 /* Endian swap. */ 6578 for (iIndex = 0; iIndex < indexCount; ++iIndex) { 6579 drflac_cuesheet_track_index* pTrack = (drflac_cuesheet_track_index*)pRunningData; 6580 pRunningData += sizeof(drflac_cuesheet_track_index); 6581 pTrack->offset = drflac__be2host_64(pTrack->offset); 6582 } 6583 } 6584 6585 onMeta(pUserDataMD, &metadata); 6586 6587 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 6588 } 6589 } break; 6590 6591 case DRFLAC_METADATA_BLOCK_TYPE_PICTURE: 6592 { 6593 if (blockSize < 32) { 6594 return DRFLAC_FALSE; 6595 } 6596 6597 if (onMeta) { 6598 void* pRawData; 6599 const char* pRunningData; 6600 const char* pRunningDataEnd; 6601 6602 pRawData = drflac__malloc_from_callbacks(blockSize, pAllocationCallbacks); 6603 if (pRawData == NULL) { 6604 return DRFLAC_FALSE; 6605 } 6606 6607 if (onRead(pUserData, pRawData, blockSize) != blockSize) { 6608 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 6609 return DRFLAC_FALSE; 6610 } 6611 6612 metadata.pRawData = pRawData; 6613 metadata.rawDataSize = blockSize; 6614 6615 pRunningData = (const char*)pRawData; 6616 pRunningDataEnd = (const char*)pRawData + blockSize; 6617 6618 metadata.data.picture.type = drflac__be2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 6619 metadata.data.picture.mimeLength = drflac__be2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 6620 6621 /* Need space for the rest of the block */ 6622 if ((pRunningDataEnd - pRunningData) - 24 < (drflac_int64)metadata.data.picture.mimeLength) { /* <-- Note the order of operations to avoid overflow to a valid value */ 6623 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 6624 return DRFLAC_FALSE; 6625 } 6626 metadata.data.picture.mime = pRunningData; pRunningData += metadata.data.picture.mimeLength; 6627 metadata.data.picture.descriptionLength = drflac__be2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 6628 6629 /* Need space for the rest of the block */ 6630 if ((pRunningDataEnd - pRunningData) - 20 < (drflac_int64)metadata.data.picture.descriptionLength) { /* <-- Note the order of operations to avoid overflow to a valid value */ 6631 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 6632 return DRFLAC_FALSE; 6633 } 6634 metadata.data.picture.description = pRunningData; pRunningData += metadata.data.picture.descriptionLength; 6635 metadata.data.picture.width = drflac__be2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 6636 metadata.data.picture.height = drflac__be2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 6637 metadata.data.picture.colorDepth = drflac__be2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 6638 metadata.data.picture.indexColorCount = drflac__be2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 6639 metadata.data.picture.pictureDataSize = drflac__be2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 6640 metadata.data.picture.pPictureData = (const drflac_uint8*)pRunningData; 6641 6642 /* Need space for the picture after the block */ 6643 if (pRunningDataEnd - pRunningData < (drflac_int64)metadata.data.picture.pictureDataSize) { /* <-- Note the order of operations to avoid overflow to a valid value */ 6644 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 6645 return DRFLAC_FALSE; 6646 } 6647 6648 onMeta(pUserDataMD, &metadata); 6649 6650 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 6651 } 6652 } break; 6653 6654 case DRFLAC_METADATA_BLOCK_TYPE_PADDING: 6655 { 6656 if (onMeta) { 6657 metadata.data.padding.unused = 0; 6658 6659 /* Padding doesn't have anything meaningful in it, so just skip over it, but make sure the caller is aware of it by firing the callback. */ 6660 if (!onSeek(pUserData, blockSize, drflac_seek_origin_current)) { 6661 isLastBlock = DRFLAC_TRUE; /* An error occurred while seeking. Attempt to recover by treating this as the last block which will in turn terminate the loop. */ 6662 } else { 6663 onMeta(pUserDataMD, &metadata); 6664 } 6665 } 6666 } break; 6667 6668 case DRFLAC_METADATA_BLOCK_TYPE_INVALID: 6669 { 6670 /* Invalid chunk. Just skip over this one. */ 6671 if (onMeta) { 6672 if (!onSeek(pUserData, blockSize, drflac_seek_origin_current)) { 6673 isLastBlock = DRFLAC_TRUE; /* An error occurred while seeking. Attempt to recover by treating this as the last block which will in turn terminate the loop. */ 6674 } 6675 } 6676 } break; 6677 6678 default: 6679 { 6680 /* 6681 It's an unknown chunk, but not necessarily invalid. There's a chance more metadata blocks might be defined later on, so we 6682 can at the very least report the chunk to the application and let it look at the raw data. 6683 */ 6684 if (onMeta) { 6685 void* pRawData = drflac__malloc_from_callbacks(blockSize, pAllocationCallbacks); 6686 if (pRawData == NULL) { 6687 return DRFLAC_FALSE; 6688 } 6689 6690 if (onRead(pUserData, pRawData, blockSize) != blockSize) { 6691 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 6692 return DRFLAC_FALSE; 6693 } 6694 6695 metadata.pRawData = pRawData; 6696 metadata.rawDataSize = blockSize; 6697 onMeta(pUserDataMD, &metadata); 6698 6699 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 6700 } 6701 } break; 6702 } 6703 6704 /* If we're not handling metadata, just skip over the block. If we are, it will have been handled earlier in the switch statement above. */ 6705 if (onMeta == NULL && blockSize > 0) { 6706 if (!onSeek(pUserData, blockSize, drflac_seek_origin_current)) { 6707 isLastBlock = DRFLAC_TRUE; 6708 } 6709 } 6710 6711 runningFilePos += blockSize; 6712 if (isLastBlock) { 6713 break; 6714 } 6715 } 6716 6717 *pSeektablePos = seektablePos; 6718 *pSeektableSize = seektableSize; 6719 *pFirstFramePos = runningFilePos; 6720 6721 return DRFLAC_TRUE; 6722 } 6723 6724 static drflac_bool32 drflac__init_private__native(drflac_init_info* pInit, drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, void* pUserData, void* pUserDataMD, drflac_bool32 relaxed) 6725 { 6726 /* Pre Condition: The bit stream should be sitting just past the 4-byte id header. */ 6727 6728 drflac_uint8 isLastBlock; 6729 drflac_uint8 blockType; 6730 drflac_uint32 blockSize; 6731 6732 (void)onSeek; 6733 6734 pInit->container = drflac_container_native; 6735 6736 /* The first metadata block should be the STREAMINFO block. */ 6737 if (!drflac__read_and_decode_block_header(onRead, pUserData, &isLastBlock, &blockType, &blockSize)) { 6738 return DRFLAC_FALSE; 6739 } 6740 6741 if (blockType != DRFLAC_METADATA_BLOCK_TYPE_STREAMINFO || blockSize != 34) { 6742 if (!relaxed) { 6743 /* We're opening in strict mode and the first block is not the STREAMINFO block. Error. */ 6744 return DRFLAC_FALSE; 6745 } else { 6746 /* 6747 Relaxed mode. To open from here we need to just find the first frame and set the sample rate, etc. to whatever is defined 6748 for that frame. 6749 */ 6750 pInit->hasStreamInfoBlock = DRFLAC_FALSE; 6751 pInit->hasMetadataBlocks = DRFLAC_FALSE; 6752 6753 if (!drflac__read_next_flac_frame_header(&pInit->bs, 0, &pInit->firstFrameHeader)) { 6754 return DRFLAC_FALSE; /* Couldn't find a frame. */ 6755 } 6756 6757 if (pInit->firstFrameHeader.bitsPerSample == 0) { 6758 return DRFLAC_FALSE; /* Failed to initialize because the first frame depends on the STREAMINFO block, which does not exist. */ 6759 } 6760 6761 pInit->sampleRate = pInit->firstFrameHeader.sampleRate; 6762 pInit->channels = drflac__get_channel_count_from_channel_assignment(pInit->firstFrameHeader.channelAssignment); 6763 pInit->bitsPerSample = pInit->firstFrameHeader.bitsPerSample; 6764 pInit->maxBlockSizeInPCMFrames = 65535; /* <-- See notes here: https://xiph.org/flac/format.html#metadata_block_streaminfo */ 6765 return DRFLAC_TRUE; 6766 } 6767 } else { 6768 drflac_streaminfo streaminfo; 6769 if (!drflac__read_streaminfo(onRead, pUserData, &streaminfo)) { 6770 return DRFLAC_FALSE; 6771 } 6772 6773 pInit->hasStreamInfoBlock = DRFLAC_TRUE; 6774 pInit->sampleRate = streaminfo.sampleRate; 6775 pInit->channels = streaminfo.channels; 6776 pInit->bitsPerSample = streaminfo.bitsPerSample; 6777 pInit->totalPCMFrameCount = streaminfo.totalPCMFrameCount; 6778 pInit->maxBlockSizeInPCMFrames = streaminfo.maxBlockSizeInPCMFrames; /* Don't care about the min block size - only the max (used for determining the size of the memory allocation). */ 6779 pInit->hasMetadataBlocks = !isLastBlock; 6780 6781 if (onMeta) { 6782 drflac_metadata metadata; 6783 metadata.type = DRFLAC_METADATA_BLOCK_TYPE_STREAMINFO; 6784 metadata.pRawData = NULL; 6785 metadata.rawDataSize = 0; 6786 metadata.data.streaminfo = streaminfo; 6787 onMeta(pUserDataMD, &metadata); 6788 } 6789 6790 return DRFLAC_TRUE; 6791 } 6792 } 6793 6794 #ifndef DR_FLAC_NO_OGG 6795 #define DRFLAC_OGG_MAX_PAGE_SIZE 65307 6796 #define DRFLAC_OGG_CAPTURE_PATTERN_CRC32 1605413199 /* CRC-32 of "OggS". */ 6797 6798 typedef enum 6799 { 6800 drflac_ogg_recover_on_crc_mismatch, 6801 drflac_ogg_fail_on_crc_mismatch 6802 } drflac_ogg_crc_mismatch_recovery; 6803 6804 #ifndef DR_FLAC_NO_CRC 6805 static drflac_uint32 drflac__crc32_table[] = { 6806 0x00000000L, 0x04C11DB7L, 0x09823B6EL, 0x0D4326D9L, 6807 0x130476DCL, 0x17C56B6BL, 0x1A864DB2L, 0x1E475005L, 6808 0x2608EDB8L, 0x22C9F00FL, 0x2F8AD6D6L, 0x2B4BCB61L, 6809 0x350C9B64L, 0x31CD86D3L, 0x3C8EA00AL, 0x384FBDBDL, 6810 0x4C11DB70L, 0x48D0C6C7L, 0x4593E01EL, 0x4152FDA9L, 6811 0x5F15ADACL, 0x5BD4B01BL, 0x569796C2L, 0x52568B75L, 6812 0x6A1936C8L, 0x6ED82B7FL, 0x639B0DA6L, 0x675A1011L, 6813 0x791D4014L, 0x7DDC5DA3L, 0x709F7B7AL, 0x745E66CDL, 6814 0x9823B6E0L, 0x9CE2AB57L, 0x91A18D8EL, 0x95609039L, 6815 0x8B27C03CL, 0x8FE6DD8BL, 0x82A5FB52L, 0x8664E6E5L, 6816 0xBE2B5B58L, 0xBAEA46EFL, 0xB7A96036L, 0xB3687D81L, 6817 0xAD2F2D84L, 0xA9EE3033L, 0xA4AD16EAL, 0xA06C0B5DL, 6818 0xD4326D90L, 0xD0F37027L, 0xDDB056FEL, 0xD9714B49L, 6819 0xC7361B4CL, 0xC3F706FBL, 0xCEB42022L, 0xCA753D95L, 6820 0xF23A8028L, 0xF6FB9D9FL, 0xFBB8BB46L, 0xFF79A6F1L, 6821 0xE13EF6F4L, 0xE5FFEB43L, 0xE8BCCD9AL, 0xEC7DD02DL, 6822 0x34867077L, 0x30476DC0L, 0x3D044B19L, 0x39C556AEL, 6823 0x278206ABL, 0x23431B1CL, 0x2E003DC5L, 0x2AC12072L, 6824 0x128E9DCFL, 0x164F8078L, 0x1B0CA6A1L, 0x1FCDBB16L, 6825 0x018AEB13L, 0x054BF6A4L, 0x0808D07DL, 0x0CC9CDCAL, 6826 0x7897AB07L, 0x7C56B6B0L, 0x71159069L, 0x75D48DDEL, 6827 0x6B93DDDBL, 0x6F52C06CL, 0x6211E6B5L, 0x66D0FB02L, 6828 0x5E9F46BFL, 0x5A5E5B08L, 0x571D7DD1L, 0x53DC6066L, 6829 0x4D9B3063L, 0x495A2DD4L, 0x44190B0DL, 0x40D816BAL, 6830 0xACA5C697L, 0xA864DB20L, 0xA527FDF9L, 0xA1E6E04EL, 6831 0xBFA1B04BL, 0xBB60ADFCL, 0xB6238B25L, 0xB2E29692L, 6832 0x8AAD2B2FL, 0x8E6C3698L, 0x832F1041L, 0x87EE0DF6L, 6833 0x99A95DF3L, 0x9D684044L, 0x902B669DL, 0x94EA7B2AL, 6834 0xE0B41DE7L, 0xE4750050L, 0xE9362689L, 0xEDF73B3EL, 6835 0xF3B06B3BL, 0xF771768CL, 0xFA325055L, 0xFEF34DE2L, 6836 0xC6BCF05FL, 0xC27DEDE8L, 0xCF3ECB31L, 0xCBFFD686L, 6837 0xD5B88683L, 0xD1799B34L, 0xDC3ABDEDL, 0xD8FBA05AL, 6838 0x690CE0EEL, 0x6DCDFD59L, 0x608EDB80L, 0x644FC637L, 6839 0x7A089632L, 0x7EC98B85L, 0x738AAD5CL, 0x774BB0EBL, 6840 0x4F040D56L, 0x4BC510E1L, 0x46863638L, 0x42472B8FL, 6841 0x5C007B8AL, 0x58C1663DL, 0x558240E4L, 0x51435D53L, 6842 0x251D3B9EL, 0x21DC2629L, 0x2C9F00F0L, 0x285E1D47L, 6843 0x36194D42L, 0x32D850F5L, 0x3F9B762CL, 0x3B5A6B9BL, 6844 0x0315D626L, 0x07D4CB91L, 0x0A97ED48L, 0x0E56F0FFL, 6845 0x1011A0FAL, 0x14D0BD4DL, 0x19939B94L, 0x1D528623L, 6846 0xF12F560EL, 0xF5EE4BB9L, 0xF8AD6D60L, 0xFC6C70D7L, 6847 0xE22B20D2L, 0xE6EA3D65L, 0xEBA91BBCL, 0xEF68060BL, 6848 0xD727BBB6L, 0xD3E6A601L, 0xDEA580D8L, 0xDA649D6FL, 6849 0xC423CD6AL, 0xC0E2D0DDL, 0xCDA1F604L, 0xC960EBB3L, 6850 0xBD3E8D7EL, 0xB9FF90C9L, 0xB4BCB610L, 0xB07DABA7L, 6851 0xAE3AFBA2L, 0xAAFBE615L, 0xA7B8C0CCL, 0xA379DD7BL, 6852 0x9B3660C6L, 0x9FF77D71L, 0x92B45BA8L, 0x9675461FL, 6853 0x8832161AL, 0x8CF30BADL, 0x81B02D74L, 0x857130C3L, 6854 0x5D8A9099L, 0x594B8D2EL, 0x5408ABF7L, 0x50C9B640L, 6855 0x4E8EE645L, 0x4A4FFBF2L, 0x470CDD2BL, 0x43CDC09CL, 6856 0x7B827D21L, 0x7F436096L, 0x7200464FL, 0x76C15BF8L, 6857 0x68860BFDL, 0x6C47164AL, 0x61043093L, 0x65C52D24L, 6858 0x119B4BE9L, 0x155A565EL, 0x18197087L, 0x1CD86D30L, 6859 0x029F3D35L, 0x065E2082L, 0x0B1D065BL, 0x0FDC1BECL, 6860 0x3793A651L, 0x3352BBE6L, 0x3E119D3FL, 0x3AD08088L, 6861 0x2497D08DL, 0x2056CD3AL, 0x2D15EBE3L, 0x29D4F654L, 6862 0xC5A92679L, 0xC1683BCEL, 0xCC2B1D17L, 0xC8EA00A0L, 6863 0xD6AD50A5L, 0xD26C4D12L, 0xDF2F6BCBL, 0xDBEE767CL, 6864 0xE3A1CBC1L, 0xE760D676L, 0xEA23F0AFL, 0xEEE2ED18L, 6865 0xF0A5BD1DL, 0xF464A0AAL, 0xF9278673L, 0xFDE69BC4L, 6866 0x89B8FD09L, 0x8D79E0BEL, 0x803AC667L, 0x84FBDBD0L, 6867 0x9ABC8BD5L, 0x9E7D9662L, 0x933EB0BBL, 0x97FFAD0CL, 6868 0xAFB010B1L, 0xAB710D06L, 0xA6322BDFL, 0xA2F33668L, 6869 0xBCB4666DL, 0xB8757BDAL, 0xB5365D03L, 0xB1F740B4L 6870 }; 6871 #endif 6872 6873 static DRFLAC_INLINE drflac_uint32 drflac_crc32_byte(drflac_uint32 crc32, drflac_uint8 data) 6874 { 6875 #ifndef DR_FLAC_NO_CRC 6876 return (crc32 << 8) ^ drflac__crc32_table[(drflac_uint8)((crc32 >> 24) & 0xFF) ^ data]; 6877 #else 6878 (void)data; 6879 return crc32; 6880 #endif 6881 } 6882 6883 #if 0 6884 static DRFLAC_INLINE drflac_uint32 drflac_crc32_uint32(drflac_uint32 crc32, drflac_uint32 data) 6885 { 6886 crc32 = drflac_crc32_byte(crc32, (drflac_uint8)((data >> 24) & 0xFF)); 6887 crc32 = drflac_crc32_byte(crc32, (drflac_uint8)((data >> 16) & 0xFF)); 6888 crc32 = drflac_crc32_byte(crc32, (drflac_uint8)((data >> 8) & 0xFF)); 6889 crc32 = drflac_crc32_byte(crc32, (drflac_uint8)((data >> 0) & 0xFF)); 6890 return crc32; 6891 } 6892 6893 static DRFLAC_INLINE drflac_uint32 drflac_crc32_uint64(drflac_uint32 crc32, drflac_uint64 data) 6894 { 6895 crc32 = drflac_crc32_uint32(crc32, (drflac_uint32)((data >> 32) & 0xFFFFFFFF)); 6896 crc32 = drflac_crc32_uint32(crc32, (drflac_uint32)((data >> 0) & 0xFFFFFFFF)); 6897 return crc32; 6898 } 6899 #endif 6900 6901 static DRFLAC_INLINE drflac_uint32 drflac_crc32_buffer(drflac_uint32 crc32, drflac_uint8* pData, drflac_uint32 dataSize) 6902 { 6903 /* This can be optimized. */ 6904 drflac_uint32 i; 6905 for (i = 0; i < dataSize; ++i) { 6906 crc32 = drflac_crc32_byte(crc32, pData[i]); 6907 } 6908 return crc32; 6909 } 6910 6911 6912 static DRFLAC_INLINE drflac_bool32 drflac_ogg__is_capture_pattern(drflac_uint8 pattern[4]) 6913 { 6914 return pattern[0] == 'O' && pattern[1] == 'g' && pattern[2] == 'g' && pattern[3] == 'S'; 6915 } 6916 6917 static DRFLAC_INLINE drflac_uint32 drflac_ogg__get_page_header_size(drflac_ogg_page_header* pHeader) 6918 { 6919 return 27 + pHeader->segmentCount; 6920 } 6921 6922 static DRFLAC_INLINE drflac_uint32 drflac_ogg__get_page_body_size(drflac_ogg_page_header* pHeader) 6923 { 6924 drflac_uint32 pageBodySize = 0; 6925 int i; 6926 6927 for (i = 0; i < pHeader->segmentCount; ++i) { 6928 pageBodySize += pHeader->segmentTable[i]; 6929 } 6930 6931 return pageBodySize; 6932 } 6933 6934 static drflac_result drflac_ogg__read_page_header_after_capture_pattern(drflac_read_proc onRead, void* pUserData, drflac_ogg_page_header* pHeader, drflac_uint32* pBytesRead, drflac_uint32* pCRC32) 6935 { 6936 drflac_uint8 data[23]; 6937 drflac_uint32 i; 6938 6939 DRFLAC_ASSERT(*pCRC32 == DRFLAC_OGG_CAPTURE_PATTERN_CRC32); 6940 6941 if (onRead(pUserData, data, 23) != 23) { 6942 return DRFLAC_AT_END; 6943 } 6944 *pBytesRead += 23; 6945 6946 /* 6947 It's not actually used, but set the capture pattern to 'OggS' for completeness. Not doing this will cause static analysers to complain about 6948 us trying to access uninitialized data. We could alternatively just comment out this member of the drflac_ogg_page_header structure, but I 6949 like to have it map to the structure of the underlying data. 6950 */ 6951 pHeader->capturePattern[0] = 'O'; 6952 pHeader->capturePattern[1] = 'g'; 6953 pHeader->capturePattern[2] = 'g'; 6954 pHeader->capturePattern[3] = 'S'; 6955 6956 pHeader->structureVersion = data[0]; 6957 pHeader->headerType = data[1]; 6958 DRFLAC_COPY_MEMORY(&pHeader->granulePosition, &data[ 2], 8); 6959 DRFLAC_COPY_MEMORY(&pHeader->serialNumber, &data[10], 4); 6960 DRFLAC_COPY_MEMORY(&pHeader->sequenceNumber, &data[14], 4); 6961 DRFLAC_COPY_MEMORY(&pHeader->checksum, &data[18], 4); 6962 pHeader->segmentCount = data[22]; 6963 6964 /* Calculate the CRC. Note that for the calculation the checksum part of the page needs to be set to 0. */ 6965 data[18] = 0; 6966 data[19] = 0; 6967 data[20] = 0; 6968 data[21] = 0; 6969 6970 for (i = 0; i < 23; ++i) { 6971 *pCRC32 = drflac_crc32_byte(*pCRC32, data[i]); 6972 } 6973 6974 6975 if (onRead(pUserData, pHeader->segmentTable, pHeader->segmentCount) != pHeader->segmentCount) { 6976 return DRFLAC_AT_END; 6977 } 6978 *pBytesRead += pHeader->segmentCount; 6979 6980 for (i = 0; i < pHeader->segmentCount; ++i) { 6981 *pCRC32 = drflac_crc32_byte(*pCRC32, pHeader->segmentTable[i]); 6982 } 6983 6984 return DRFLAC_SUCCESS; 6985 } 6986 6987 static drflac_result drflac_ogg__read_page_header(drflac_read_proc onRead, void* pUserData, drflac_ogg_page_header* pHeader, drflac_uint32* pBytesRead, drflac_uint32* pCRC32) 6988 { 6989 drflac_uint8 id[4]; 6990 6991 *pBytesRead = 0; 6992 6993 if (onRead(pUserData, id, 4) != 4) { 6994 return DRFLAC_AT_END; 6995 } 6996 *pBytesRead += 4; 6997 6998 /* We need to read byte-by-byte until we find the OggS capture pattern. */ 6999 for (;;) { 7000 if (drflac_ogg__is_capture_pattern(id)) { 7001 drflac_result result; 7002 7003 *pCRC32 = DRFLAC_OGG_CAPTURE_PATTERN_CRC32; 7004 7005 result = drflac_ogg__read_page_header_after_capture_pattern(onRead, pUserData, pHeader, pBytesRead, pCRC32); 7006 if (result == DRFLAC_SUCCESS) { 7007 return DRFLAC_SUCCESS; 7008 } else { 7009 if (result == DRFLAC_CRC_MISMATCH) { 7010 continue; 7011 } else { 7012 return result; 7013 } 7014 } 7015 } else { 7016 /* The first 4 bytes did not equal the capture pattern. Read the next byte and try again. */ 7017 id[0] = id[1]; 7018 id[1] = id[2]; 7019 id[2] = id[3]; 7020 if (onRead(pUserData, &id[3], 1) != 1) { 7021 return DRFLAC_AT_END; 7022 } 7023 *pBytesRead += 1; 7024 } 7025 } 7026 } 7027 7028 7029 /* 7030 The main part of the Ogg encapsulation is the conversion from the physical Ogg bitstream to the native FLAC bitstream. It works 7031 in three general stages: Ogg Physical Bitstream -> Ogg/FLAC Logical Bitstream -> FLAC Native Bitstream. dr_flac is designed 7032 in such a way that the core sections assume everything is delivered in native format. Therefore, for each encapsulation type 7033 dr_flac is supporting there needs to be a layer sitting on top of the onRead and onSeek callbacks that ensures the bits read from 7034 the physical Ogg bitstream are converted and delivered in native FLAC format. 7035 */ 7036 typedef struct 7037 { 7038 drflac_read_proc onRead; /* The original onRead callback from drflac_open() and family. */ 7039 drflac_seek_proc onSeek; /* The original onSeek callback from drflac_open() and family. */ 7040 void* pUserData; /* The user data passed on onRead and onSeek. This is the user data that was passed on drflac_open() and family. */ 7041 drflac_uint64 currentBytePos; /* The position of the byte we are sitting on in the physical byte stream. Used for efficient seeking. */ 7042 drflac_uint64 firstBytePos; /* The position of the first byte in the physical bitstream. Points to the start of the "OggS" identifier of the FLAC bos page. */ 7043 drflac_uint32 serialNumber; /* The serial number of the FLAC audio pages. This is determined by the initial header page that was read during initialization. */ 7044 drflac_ogg_page_header bosPageHeader; /* Used for seeking. */ 7045 drflac_ogg_page_header currentPageHeader; 7046 drflac_uint32 bytesRemainingInPage; 7047 drflac_uint32 pageDataSize; 7048 drflac_uint8 pageData[DRFLAC_OGG_MAX_PAGE_SIZE]; 7049 } drflac_oggbs; /* oggbs = Ogg Bitstream */ 7050 7051 static size_t drflac_oggbs__read_physical(drflac_oggbs* oggbs, void* bufferOut, size_t bytesToRead) 7052 { 7053 size_t bytesActuallyRead = oggbs->onRead(oggbs->pUserData, bufferOut, bytesToRead); 7054 oggbs->currentBytePos += bytesActuallyRead; 7055 7056 return bytesActuallyRead; 7057 } 7058 7059 static drflac_bool32 drflac_oggbs__seek_physical(drflac_oggbs* oggbs, drflac_uint64 offset, drflac_seek_origin origin) 7060 { 7061 if (origin == drflac_seek_origin_start) { 7062 if (offset <= 0x7FFFFFFF) { 7063 if (!oggbs->onSeek(oggbs->pUserData, (int)offset, drflac_seek_origin_start)) { 7064 return DRFLAC_FALSE; 7065 } 7066 oggbs->currentBytePos = offset; 7067 7068 return DRFLAC_TRUE; 7069 } else { 7070 if (!oggbs->onSeek(oggbs->pUserData, 0x7FFFFFFF, drflac_seek_origin_start)) { 7071 return DRFLAC_FALSE; 7072 } 7073 oggbs->currentBytePos = offset; 7074 7075 return drflac_oggbs__seek_physical(oggbs, offset - 0x7FFFFFFF, drflac_seek_origin_current); 7076 } 7077 } else { 7078 while (offset > 0x7FFFFFFF) { 7079 if (!oggbs->onSeek(oggbs->pUserData, 0x7FFFFFFF, drflac_seek_origin_current)) { 7080 return DRFLAC_FALSE; 7081 } 7082 oggbs->currentBytePos += 0x7FFFFFFF; 7083 offset -= 0x7FFFFFFF; 7084 } 7085 7086 if (!oggbs->onSeek(oggbs->pUserData, (int)offset, drflac_seek_origin_current)) { /* <-- Safe cast thanks to the loop above. */ 7087 return DRFLAC_FALSE; 7088 } 7089 oggbs->currentBytePos += offset; 7090 7091 return DRFLAC_TRUE; 7092 } 7093 } 7094 7095 static drflac_bool32 drflac_oggbs__goto_next_page(drflac_oggbs* oggbs, drflac_ogg_crc_mismatch_recovery recoveryMethod) 7096 { 7097 drflac_ogg_page_header header; 7098 for (;;) { 7099 drflac_uint32 crc32 = 0; 7100 drflac_uint32 bytesRead; 7101 drflac_uint32 pageBodySize; 7102 #ifndef DR_FLAC_NO_CRC 7103 drflac_uint32 actualCRC32; 7104 #endif 7105 7106 if (drflac_ogg__read_page_header(oggbs->onRead, oggbs->pUserData, &header, &bytesRead, &crc32) != DRFLAC_SUCCESS) { 7107 return DRFLAC_FALSE; 7108 } 7109 oggbs->currentBytePos += bytesRead; 7110 7111 pageBodySize = drflac_ogg__get_page_body_size(&header); 7112 if (pageBodySize > DRFLAC_OGG_MAX_PAGE_SIZE) { 7113 continue; /* Invalid page size. Assume it's corrupted and just move to the next page. */ 7114 } 7115 7116 if (header.serialNumber != oggbs->serialNumber) { 7117 /* It's not a FLAC page. Skip it. */ 7118 if (pageBodySize > 0 && !drflac_oggbs__seek_physical(oggbs, pageBodySize, drflac_seek_origin_current)) { 7119 return DRFLAC_FALSE; 7120 } 7121 continue; 7122 } 7123 7124 7125 /* We need to read the entire page and then do a CRC check on it. If there's a CRC mismatch we need to skip this page. */ 7126 if (drflac_oggbs__read_physical(oggbs, oggbs->pageData, pageBodySize) != pageBodySize) { 7127 return DRFLAC_FALSE; 7128 } 7129 oggbs->pageDataSize = pageBodySize; 7130 7131 #ifndef DR_FLAC_NO_CRC 7132 actualCRC32 = drflac_crc32_buffer(crc32, oggbs->pageData, oggbs->pageDataSize); 7133 if (actualCRC32 != header.checksum) { 7134 if (recoveryMethod == drflac_ogg_recover_on_crc_mismatch) { 7135 continue; /* CRC mismatch. Skip this page. */ 7136 } else { 7137 /* 7138 Even though we are failing on a CRC mismatch, we still want our stream to be in a good state. Therefore we 7139 go to the next valid page to ensure we're in a good state, but return false to let the caller know that the 7140 seek did not fully complete. 7141 */ 7142 drflac_oggbs__goto_next_page(oggbs, drflac_ogg_recover_on_crc_mismatch); 7143 return DRFLAC_FALSE; 7144 } 7145 } 7146 #else 7147 (void)recoveryMethod; /* <-- Silence a warning. */ 7148 #endif 7149 7150 oggbs->currentPageHeader = header; 7151 oggbs->bytesRemainingInPage = pageBodySize; 7152 return DRFLAC_TRUE; 7153 } 7154 } 7155 7156 /* Function below is unused at the moment, but I might be re-adding it later. */ 7157 #if 0 7158 static drflac_uint8 drflac_oggbs__get_current_segment_index(drflac_oggbs* oggbs, drflac_uint8* pBytesRemainingInSeg) 7159 { 7160 drflac_uint32 bytesConsumedInPage = drflac_ogg__get_page_body_size(&oggbs->currentPageHeader) - oggbs->bytesRemainingInPage; 7161 drflac_uint8 iSeg = 0; 7162 drflac_uint32 iByte = 0; 7163 while (iByte < bytesConsumedInPage) { 7164 drflac_uint8 segmentSize = oggbs->currentPageHeader.segmentTable[iSeg]; 7165 if (iByte + segmentSize > bytesConsumedInPage) { 7166 break; 7167 } else { 7168 iSeg += 1; 7169 iByte += segmentSize; 7170 } 7171 } 7172 7173 *pBytesRemainingInSeg = oggbs->currentPageHeader.segmentTable[iSeg] - (drflac_uint8)(bytesConsumedInPage - iByte); 7174 return iSeg; 7175 } 7176 7177 static drflac_bool32 drflac_oggbs__seek_to_next_packet(drflac_oggbs* oggbs) 7178 { 7179 /* The current packet ends when we get to the segment with a lacing value of < 255 which is not at the end of a page. */ 7180 for (;;) { 7181 drflac_bool32 atEndOfPage = DRFLAC_FALSE; 7182 7183 drflac_uint8 bytesRemainingInSeg; 7184 drflac_uint8 iFirstSeg = drflac_oggbs__get_current_segment_index(oggbs, &bytesRemainingInSeg); 7185 7186 drflac_uint32 bytesToEndOfPacketOrPage = bytesRemainingInSeg; 7187 for (drflac_uint8 iSeg = iFirstSeg; iSeg < oggbs->currentPageHeader.segmentCount; ++iSeg) { 7188 drflac_uint8 segmentSize = oggbs->currentPageHeader.segmentTable[iSeg]; 7189 if (segmentSize < 255) { 7190 if (iSeg == oggbs->currentPageHeader.segmentCount-1) { 7191 atEndOfPage = DRFLAC_TRUE; 7192 } 7193 7194 break; 7195 } 7196 7197 bytesToEndOfPacketOrPage += segmentSize; 7198 } 7199 7200 /* 7201 At this point we will have found either the packet or the end of the page. If were at the end of the page we'll 7202 want to load the next page and keep searching for the end of the packet. 7203 */ 7204 drflac_oggbs__seek_physical(oggbs, bytesToEndOfPacketOrPage, drflac_seek_origin_current); 7205 oggbs->bytesRemainingInPage -= bytesToEndOfPacketOrPage; 7206 7207 if (atEndOfPage) { 7208 /* 7209 We're potentially at the next packet, but we need to check the next page first to be sure because the packet may 7210 straddle pages. 7211 */ 7212 if (!drflac_oggbs__goto_next_page(oggbs)) { 7213 return DRFLAC_FALSE; 7214 } 7215 7216 /* If it's a fresh packet it most likely means we're at the next packet. */ 7217 if ((oggbs->currentPageHeader.headerType & 0x01) == 0) { 7218 return DRFLAC_TRUE; 7219 } 7220 } else { 7221 /* We're at the next packet. */ 7222 return DRFLAC_TRUE; 7223 } 7224 } 7225 } 7226 7227 static drflac_bool32 drflac_oggbs__seek_to_next_frame(drflac_oggbs* oggbs) 7228 { 7229 /* The bitstream should be sitting on the first byte just after the header of the frame. */ 7230 7231 /* What we're actually doing here is seeking to the start of the next packet. */ 7232 return drflac_oggbs__seek_to_next_packet(oggbs); 7233 } 7234 #endif 7235 7236 static size_t drflac__on_read_ogg(void* pUserData, void* bufferOut, size_t bytesToRead) 7237 { 7238 drflac_oggbs* oggbs = (drflac_oggbs*)pUserData; 7239 drflac_uint8* pRunningBufferOut = (drflac_uint8*)bufferOut; 7240 size_t bytesRead = 0; 7241 7242 DRFLAC_ASSERT(oggbs != NULL); 7243 DRFLAC_ASSERT(pRunningBufferOut != NULL); 7244 7245 /* Reading is done page-by-page. If we've run out of bytes in the page we need to move to the next one. */ 7246 while (bytesRead < bytesToRead) { 7247 size_t bytesRemainingToRead = bytesToRead - bytesRead; 7248 7249 if (oggbs->bytesRemainingInPage >= bytesRemainingToRead) { 7250 DRFLAC_COPY_MEMORY(pRunningBufferOut, oggbs->pageData + (oggbs->pageDataSize - oggbs->bytesRemainingInPage), bytesRemainingToRead); 7251 bytesRead += bytesRemainingToRead; 7252 oggbs->bytesRemainingInPage -= (drflac_uint32)bytesRemainingToRead; 7253 break; 7254 } 7255 7256 /* If we get here it means some of the requested data is contained in the next pages. */ 7257 if (oggbs->bytesRemainingInPage > 0) { 7258 DRFLAC_COPY_MEMORY(pRunningBufferOut, oggbs->pageData + (oggbs->pageDataSize - oggbs->bytesRemainingInPage), oggbs->bytesRemainingInPage); 7259 bytesRead += oggbs->bytesRemainingInPage; 7260 pRunningBufferOut += oggbs->bytesRemainingInPage; 7261 oggbs->bytesRemainingInPage = 0; 7262 } 7263 7264 DRFLAC_ASSERT(bytesRemainingToRead > 0); 7265 if (!drflac_oggbs__goto_next_page(oggbs, drflac_ogg_recover_on_crc_mismatch)) { 7266 break; /* Failed to go to the next page. Might have simply hit the end of the stream. */ 7267 } 7268 } 7269 7270 return bytesRead; 7271 } 7272 7273 static drflac_bool32 drflac__on_seek_ogg(void* pUserData, int offset, drflac_seek_origin origin) 7274 { 7275 drflac_oggbs* oggbs = (drflac_oggbs*)pUserData; 7276 int bytesSeeked = 0; 7277 7278 DRFLAC_ASSERT(oggbs != NULL); 7279 DRFLAC_ASSERT(offset >= 0); /* <-- Never seek backwards. */ 7280 7281 /* Seeking is always forward which makes things a lot simpler. */ 7282 if (origin == drflac_seek_origin_start) { 7283 if (!drflac_oggbs__seek_physical(oggbs, (int)oggbs->firstBytePos, drflac_seek_origin_start)) { 7284 return DRFLAC_FALSE; 7285 } 7286 7287 if (!drflac_oggbs__goto_next_page(oggbs, drflac_ogg_fail_on_crc_mismatch)) { 7288 return DRFLAC_FALSE; 7289 } 7290 7291 return drflac__on_seek_ogg(pUserData, offset, drflac_seek_origin_current); 7292 } 7293 7294 DRFLAC_ASSERT(origin == drflac_seek_origin_current); 7295 7296 while (bytesSeeked < offset) { 7297 int bytesRemainingToSeek = offset - bytesSeeked; 7298 DRFLAC_ASSERT(bytesRemainingToSeek >= 0); 7299 7300 if (oggbs->bytesRemainingInPage >= (size_t)bytesRemainingToSeek) { 7301 bytesSeeked += bytesRemainingToSeek; 7302 (void)bytesSeeked; /* <-- Silence a dead store warning emitted by Clang Static Analyzer. */ 7303 oggbs->bytesRemainingInPage -= bytesRemainingToSeek; 7304 break; 7305 } 7306 7307 /* If we get here it means some of the requested data is contained in the next pages. */ 7308 if (oggbs->bytesRemainingInPage > 0) { 7309 bytesSeeked += (int)oggbs->bytesRemainingInPage; 7310 oggbs->bytesRemainingInPage = 0; 7311 } 7312 7313 DRFLAC_ASSERT(bytesRemainingToSeek > 0); 7314 if (!drflac_oggbs__goto_next_page(oggbs, drflac_ogg_fail_on_crc_mismatch)) { 7315 /* Failed to go to the next page. We either hit the end of the stream or had a CRC mismatch. */ 7316 return DRFLAC_FALSE; 7317 } 7318 } 7319 7320 return DRFLAC_TRUE; 7321 } 7322 7323 7324 static drflac_bool32 drflac_ogg__seek_to_pcm_frame(drflac* pFlac, drflac_uint64 pcmFrameIndex) 7325 { 7326 drflac_oggbs* oggbs = (drflac_oggbs*)pFlac->_oggbs; 7327 drflac_uint64 originalBytePos; 7328 drflac_uint64 runningGranulePosition; 7329 drflac_uint64 runningFrameBytePos; 7330 drflac_uint64 runningPCMFrameCount; 7331 7332 DRFLAC_ASSERT(oggbs != NULL); 7333 7334 originalBytePos = oggbs->currentBytePos; /* For recovery. Points to the OggS identifier. */ 7335 7336 /* First seek to the first frame. */ 7337 if (!drflac__seek_to_byte(&pFlac->bs, pFlac->firstFLACFramePosInBytes)) { 7338 return DRFLAC_FALSE; 7339 } 7340 oggbs->bytesRemainingInPage = 0; 7341 7342 runningGranulePosition = 0; 7343 for (;;) { 7344 if (!drflac_oggbs__goto_next_page(oggbs, drflac_ogg_recover_on_crc_mismatch)) { 7345 drflac_oggbs__seek_physical(oggbs, originalBytePos, drflac_seek_origin_start); 7346 return DRFLAC_FALSE; /* Never did find that sample... */ 7347 } 7348 7349 runningFrameBytePos = oggbs->currentBytePos - drflac_ogg__get_page_header_size(&oggbs->currentPageHeader) - oggbs->pageDataSize; 7350 if (oggbs->currentPageHeader.granulePosition >= pcmFrameIndex) { 7351 break; /* The sample is somewhere in the previous page. */ 7352 } 7353 7354 /* 7355 At this point we know the sample is not in the previous page. It could possibly be in this page. For simplicity we 7356 disregard any pages that do not begin a fresh packet. 7357 */ 7358 if ((oggbs->currentPageHeader.headerType & 0x01) == 0) { /* <-- Is it a fresh page? */ 7359 if (oggbs->currentPageHeader.segmentTable[0] >= 2) { 7360 drflac_uint8 firstBytesInPage[2]; 7361 firstBytesInPage[0] = oggbs->pageData[0]; 7362 firstBytesInPage[1] = oggbs->pageData[1]; 7363 7364 if ((firstBytesInPage[0] == 0xFF) && (firstBytesInPage[1] & 0xFC) == 0xF8) { /* <-- Does the page begin with a frame's sync code? */ 7365 runningGranulePosition = oggbs->currentPageHeader.granulePosition; 7366 } 7367 7368 continue; 7369 } 7370 } 7371 } 7372 7373 /* 7374 We found the page that that is closest to the sample, so now we need to find it. The first thing to do is seek to the 7375 start of that page. In the loop above we checked that it was a fresh page which means this page is also the start of 7376 a new frame. This property means that after we've seeked to the page we can immediately start looping over frames until 7377 we find the one containing the target sample. 7378 */ 7379 if (!drflac_oggbs__seek_physical(oggbs, runningFrameBytePos, drflac_seek_origin_start)) { 7380 return DRFLAC_FALSE; 7381 } 7382 if (!drflac_oggbs__goto_next_page(oggbs, drflac_ogg_recover_on_crc_mismatch)) { 7383 return DRFLAC_FALSE; 7384 } 7385 7386 /* 7387 At this point we'll be sitting on the first byte of the frame header of the first frame in the page. We just keep 7388 looping over these frames until we find the one containing the sample we're after. 7389 */ 7390 runningPCMFrameCount = runningGranulePosition; 7391 for (;;) { 7392 /* 7393 There are two ways to find the sample and seek past irrelevant frames: 7394 1) Use the native FLAC decoder. 7395 2) Use Ogg's framing system. 7396 7397 Both of these options have their own pros and cons. Using the native FLAC decoder is slower because it needs to 7398 do a full decode of the frame. Using Ogg's framing system is faster, but more complicated and involves some code 7399 duplication for the decoding of frame headers. 7400 7401 Another thing to consider is that using the Ogg framing system will perform direct seeking of the physical Ogg 7402 bitstream. This is important to consider because it means we cannot read data from the drflac_bs object using the 7403 standard drflac__*() APIs because that will read in extra data for its own internal caching which in turn breaks 7404 the positioning of the read pointer of the physical Ogg bitstream. Therefore, anything that would normally be read 7405 using the native FLAC decoding APIs, such as drflac__read_next_flac_frame_header(), need to be re-implemented so as to 7406 avoid the use of the drflac_bs object. 7407 7408 Considering these issues, I have decided to use the slower native FLAC decoding method for the following reasons: 7409 1) Seeking is already partially accelerated using Ogg's paging system in the code block above. 7410 2) Seeking in an Ogg encapsulated FLAC stream is probably quite uncommon. 7411 3) Simplicity. 7412 */ 7413 drflac_uint64 firstPCMFrameInFLACFrame = 0; 7414 drflac_uint64 lastPCMFrameInFLACFrame = 0; 7415 drflac_uint64 pcmFrameCountInThisFrame; 7416 7417 if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) { 7418 return DRFLAC_FALSE; 7419 } 7420 7421 drflac__get_pcm_frame_range_of_current_flac_frame(pFlac, &firstPCMFrameInFLACFrame, &lastPCMFrameInFLACFrame); 7422 7423 pcmFrameCountInThisFrame = (lastPCMFrameInFLACFrame - firstPCMFrameInFLACFrame) + 1; 7424 7425 /* If we are seeking to the end of the file and we've just hit it, we're done. */ 7426 if (pcmFrameIndex == pFlac->totalPCMFrameCount && (runningPCMFrameCount + pcmFrameCountInThisFrame) == pFlac->totalPCMFrameCount) { 7427 drflac_result result = drflac__decode_flac_frame(pFlac); 7428 if (result == DRFLAC_SUCCESS) { 7429 pFlac->currentPCMFrame = pcmFrameIndex; 7430 pFlac->currentFLACFrame.pcmFramesRemaining = 0; 7431 return DRFLAC_TRUE; 7432 } else { 7433 return DRFLAC_FALSE; 7434 } 7435 } 7436 7437 if (pcmFrameIndex < (runningPCMFrameCount + pcmFrameCountInThisFrame)) { 7438 /* 7439 The sample should be in this FLAC frame. We need to fully decode it, however if it's an invalid frame (a CRC mismatch), we need to pretend 7440 it never existed and keep iterating. 7441 */ 7442 drflac_result result = drflac__decode_flac_frame(pFlac); 7443 if (result == DRFLAC_SUCCESS) { 7444 /* The frame is valid. We just need to skip over some samples to ensure it's sample-exact. */ 7445 drflac_uint64 pcmFramesToDecode = (size_t)(pcmFrameIndex - runningPCMFrameCount); /* <-- Safe cast because the maximum number of samples in a frame is 65535. */ 7446 if (pcmFramesToDecode == 0) { 7447 return DRFLAC_TRUE; 7448 } 7449 7450 pFlac->currentPCMFrame = runningPCMFrameCount; 7451 7452 return drflac__seek_forward_by_pcm_frames(pFlac, pcmFramesToDecode) == pcmFramesToDecode; /* <-- If this fails, something bad has happened (it should never fail). */ 7453 } else { 7454 if (result == DRFLAC_CRC_MISMATCH) { 7455 continue; /* CRC mismatch. Pretend this frame never existed. */ 7456 } else { 7457 return DRFLAC_FALSE; 7458 } 7459 } 7460 } else { 7461 /* 7462 It's not in this frame. We need to seek past the frame, but check if there was a CRC mismatch. If so, we pretend this 7463 frame never existed and leave the running sample count untouched. 7464 */ 7465 drflac_result result = drflac__seek_to_next_flac_frame(pFlac); 7466 if (result == DRFLAC_SUCCESS) { 7467 runningPCMFrameCount += pcmFrameCountInThisFrame; 7468 } else { 7469 if (result == DRFLAC_CRC_MISMATCH) { 7470 continue; /* CRC mismatch. Pretend this frame never existed. */ 7471 } else { 7472 return DRFLAC_FALSE; 7473 } 7474 } 7475 } 7476 } 7477 } 7478 7479 7480 7481 static drflac_bool32 drflac__init_private__ogg(drflac_init_info* pInit, drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, void* pUserData, void* pUserDataMD, drflac_bool32 relaxed) 7482 { 7483 drflac_ogg_page_header header; 7484 drflac_uint32 crc32 = DRFLAC_OGG_CAPTURE_PATTERN_CRC32; 7485 drflac_uint32 bytesRead = 0; 7486 7487 /* Pre Condition: The bit stream should be sitting just past the 4-byte OggS capture pattern. */ 7488 (void)relaxed; 7489 7490 pInit->container = drflac_container_ogg; 7491 pInit->oggFirstBytePos = 0; 7492 7493 /* 7494 We'll get here if the first 4 bytes of the stream were the OggS capture pattern, however it doesn't necessarily mean the 7495 stream includes FLAC encoded audio. To check for this we need to scan the beginning-of-stream page markers and check if 7496 any match the FLAC specification. Important to keep in mind that the stream may be multiplexed. 7497 */ 7498 if (drflac_ogg__read_page_header_after_capture_pattern(onRead, pUserData, &header, &bytesRead, &crc32) != DRFLAC_SUCCESS) { 7499 return DRFLAC_FALSE; 7500 } 7501 pInit->runningFilePos += bytesRead; 7502 7503 for (;;) { 7504 int pageBodySize; 7505 7506 /* Break if we're past the beginning of stream page. */ 7507 if ((header.headerType & 0x02) == 0) { 7508 return DRFLAC_FALSE; 7509 } 7510 7511 /* Check if it's a FLAC header. */ 7512 pageBodySize = drflac_ogg__get_page_body_size(&header); 7513 if (pageBodySize == 51) { /* 51 = the lacing value of the FLAC header packet. */ 7514 /* It could be a FLAC page... */ 7515 drflac_uint32 bytesRemainingInPage = pageBodySize; 7516 drflac_uint8 packetType; 7517 7518 if (onRead(pUserData, &packetType, 1) != 1) { 7519 return DRFLAC_FALSE; 7520 } 7521 7522 bytesRemainingInPage -= 1; 7523 if (packetType == 0x7F) { 7524 /* Increasingly more likely to be a FLAC page... */ 7525 drflac_uint8 sig[4]; 7526 if (onRead(pUserData, sig, 4) != 4) { 7527 return DRFLAC_FALSE; 7528 } 7529 7530 bytesRemainingInPage -= 4; 7531 if (sig[0] == 'F' && sig[1] == 'L' && sig[2] == 'A' && sig[3] == 'C') { 7532 /* Almost certainly a FLAC page... */ 7533 drflac_uint8 mappingVersion[2]; 7534 if (onRead(pUserData, mappingVersion, 2) != 2) { 7535 return DRFLAC_FALSE; 7536 } 7537 7538 if (mappingVersion[0] != 1) { 7539 return DRFLAC_FALSE; /* Only supporting version 1.x of the Ogg mapping. */ 7540 } 7541 7542 /* 7543 The next 2 bytes are the non-audio packets, not including this one. We don't care about this because we're going to 7544 be handling it in a generic way based on the serial number and packet types. 7545 */ 7546 if (!onSeek(pUserData, 2, drflac_seek_origin_current)) { 7547 return DRFLAC_FALSE; 7548 } 7549 7550 /* Expecting the native FLAC signature "fLaC". */ 7551 if (onRead(pUserData, sig, 4) != 4) { 7552 return DRFLAC_FALSE; 7553 } 7554 7555 if (sig[0] == 'f' && sig[1] == 'L' && sig[2] == 'a' && sig[3] == 'C') { 7556 /* The remaining data in the page should be the STREAMINFO block. */ 7557 drflac_streaminfo streaminfo; 7558 drflac_uint8 isLastBlock; 7559 drflac_uint8 blockType; 7560 drflac_uint32 blockSize; 7561 if (!drflac__read_and_decode_block_header(onRead, pUserData, &isLastBlock, &blockType, &blockSize)) { 7562 return DRFLAC_FALSE; 7563 } 7564 7565 if (blockType != DRFLAC_METADATA_BLOCK_TYPE_STREAMINFO || blockSize != 34) { 7566 return DRFLAC_FALSE; /* Invalid block type. First block must be the STREAMINFO block. */ 7567 } 7568 7569 if (drflac__read_streaminfo(onRead, pUserData, &streaminfo)) { 7570 /* Success! */ 7571 pInit->hasStreamInfoBlock = DRFLAC_TRUE; 7572 pInit->sampleRate = streaminfo.sampleRate; 7573 pInit->channels = streaminfo.channels; 7574 pInit->bitsPerSample = streaminfo.bitsPerSample; 7575 pInit->totalPCMFrameCount = streaminfo.totalPCMFrameCount; 7576 pInit->maxBlockSizeInPCMFrames = streaminfo.maxBlockSizeInPCMFrames; 7577 pInit->hasMetadataBlocks = !isLastBlock; 7578 7579 if (onMeta) { 7580 drflac_metadata metadata; 7581 metadata.type = DRFLAC_METADATA_BLOCK_TYPE_STREAMINFO; 7582 metadata.pRawData = NULL; 7583 metadata.rawDataSize = 0; 7584 metadata.data.streaminfo = streaminfo; 7585 onMeta(pUserDataMD, &metadata); 7586 } 7587 7588 pInit->runningFilePos += pageBodySize; 7589 pInit->oggFirstBytePos = pInit->runningFilePos - 79; /* Subtracting 79 will place us right on top of the "OggS" identifier of the FLAC bos page. */ 7590 pInit->oggSerial = header.serialNumber; 7591 pInit->oggBosHeader = header; 7592 break; 7593 } else { 7594 /* Failed to read STREAMINFO block. Aww, so close... */ 7595 return DRFLAC_FALSE; 7596 } 7597 } else { 7598 /* Invalid file. */ 7599 return DRFLAC_FALSE; 7600 } 7601 } else { 7602 /* Not a FLAC header. Skip it. */ 7603 if (!onSeek(pUserData, bytesRemainingInPage, drflac_seek_origin_current)) { 7604 return DRFLAC_FALSE; 7605 } 7606 } 7607 } else { 7608 /* Not a FLAC header. Seek past the entire page and move on to the next. */ 7609 if (!onSeek(pUserData, bytesRemainingInPage, drflac_seek_origin_current)) { 7610 return DRFLAC_FALSE; 7611 } 7612 } 7613 } else { 7614 if (!onSeek(pUserData, pageBodySize, drflac_seek_origin_current)) { 7615 return DRFLAC_FALSE; 7616 } 7617 } 7618 7619 pInit->runningFilePos += pageBodySize; 7620 7621 7622 /* Read the header of the next page. */ 7623 if (drflac_ogg__read_page_header(onRead, pUserData, &header, &bytesRead, &crc32) != DRFLAC_SUCCESS) { 7624 return DRFLAC_FALSE; 7625 } 7626 pInit->runningFilePos += bytesRead; 7627 } 7628 7629 /* 7630 If we get here it means we found a FLAC audio stream. We should be sitting on the first byte of the header of the next page. The next 7631 packets in the FLAC logical stream contain the metadata. The only thing left to do in the initialization phase for Ogg is to create the 7632 Ogg bistream object. 7633 */ 7634 pInit->hasMetadataBlocks = DRFLAC_TRUE; /* <-- Always have at least VORBIS_COMMENT metadata block. */ 7635 return DRFLAC_TRUE; 7636 } 7637 #endif 7638 7639 static drflac_bool32 drflac__init_private(drflac_init_info* pInit, drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, drflac_container container, void* pUserData, void* pUserDataMD) 7640 { 7641 drflac_bool32 relaxed; 7642 drflac_uint8 id[4]; 7643 7644 if (pInit == NULL || onRead == NULL || onSeek == NULL) { 7645 return DRFLAC_FALSE; 7646 } 7647 7648 DRFLAC_ZERO_MEMORY(pInit, sizeof(*pInit)); 7649 pInit->onRead = onRead; 7650 pInit->onSeek = onSeek; 7651 pInit->onMeta = onMeta; 7652 pInit->container = container; 7653 pInit->pUserData = pUserData; 7654 pInit->pUserDataMD = pUserDataMD; 7655 7656 pInit->bs.onRead = onRead; 7657 pInit->bs.onSeek = onSeek; 7658 pInit->bs.pUserData = pUserData; 7659 drflac__reset_cache(&pInit->bs); 7660 7661 7662 /* If the container is explicitly defined then we can try opening in relaxed mode. */ 7663 relaxed = container != drflac_container_unknown; 7664 7665 /* Skip over any ID3 tags. */ 7666 for (;;) { 7667 if (onRead(pUserData, id, 4) != 4) { 7668 return DRFLAC_FALSE; /* Ran out of data. */ 7669 } 7670 pInit->runningFilePos += 4; 7671 7672 if (id[0] == 'I' && id[1] == 'D' && id[2] == '3') { 7673 drflac_uint8 header[6]; 7674 drflac_uint8 flags; 7675 drflac_uint32 headerSize; 7676 7677 if (onRead(pUserData, header, 6) != 6) { 7678 return DRFLAC_FALSE; /* Ran out of data. */ 7679 } 7680 pInit->runningFilePos += 6; 7681 7682 flags = header[1]; 7683 7684 DRFLAC_COPY_MEMORY(&headerSize, header+2, 4); 7685 headerSize = drflac__unsynchsafe_32(drflac__be2host_32(headerSize)); 7686 if (flags & 0x10) { 7687 headerSize += 10; 7688 } 7689 7690 if (!onSeek(pUserData, headerSize, drflac_seek_origin_current)) { 7691 return DRFLAC_FALSE; /* Failed to seek past the tag. */ 7692 } 7693 pInit->runningFilePos += headerSize; 7694 } else { 7695 break; 7696 } 7697 } 7698 7699 if (id[0] == 'f' && id[1] == 'L' && id[2] == 'a' && id[3] == 'C') { 7700 return drflac__init_private__native(pInit, onRead, onSeek, onMeta, pUserData, pUserDataMD, relaxed); 7701 } 7702 #ifndef DR_FLAC_NO_OGG 7703 if (id[0] == 'O' && id[1] == 'g' && id[2] == 'g' && id[3] == 'S') { 7704 return drflac__init_private__ogg(pInit, onRead, onSeek, onMeta, pUserData, pUserDataMD, relaxed); 7705 } 7706 #endif 7707 7708 /* If we get here it means we likely don't have a header. Try opening in relaxed mode, if applicable. */ 7709 if (relaxed) { 7710 if (container == drflac_container_native) { 7711 return drflac__init_private__native(pInit, onRead, onSeek, onMeta, pUserData, pUserDataMD, relaxed); 7712 } 7713 #ifndef DR_FLAC_NO_OGG 7714 if (container == drflac_container_ogg) { 7715 return drflac__init_private__ogg(pInit, onRead, onSeek, onMeta, pUserData, pUserDataMD, relaxed); 7716 } 7717 #endif 7718 } 7719 7720 /* Unsupported container. */ 7721 return DRFLAC_FALSE; 7722 } 7723 7724 static void drflac__init_from_info(drflac* pFlac, const drflac_init_info* pInit) 7725 { 7726 DRFLAC_ASSERT(pFlac != NULL); 7727 DRFLAC_ASSERT(pInit != NULL); 7728 7729 DRFLAC_ZERO_MEMORY(pFlac, sizeof(*pFlac)); 7730 pFlac->bs = pInit->bs; 7731 pFlac->onMeta = pInit->onMeta; 7732 pFlac->pUserDataMD = pInit->pUserDataMD; 7733 pFlac->maxBlockSizeInPCMFrames = pInit->maxBlockSizeInPCMFrames; 7734 pFlac->sampleRate = pInit->sampleRate; 7735 pFlac->channels = (drflac_uint8)pInit->channels; 7736 pFlac->bitsPerSample = (drflac_uint8)pInit->bitsPerSample; 7737 pFlac->totalPCMFrameCount = pInit->totalPCMFrameCount; 7738 pFlac->container = pInit->container; 7739 } 7740 7741 7742 static drflac* drflac_open_with_metadata_private(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, drflac_container container, void* pUserData, void* pUserDataMD, const drflac_allocation_callbacks* pAllocationCallbacks) 7743 { 7744 drflac_init_info init; 7745 drflac_uint32 allocationSize; 7746 drflac_uint32 wholeSIMDVectorCountPerChannel; 7747 drflac_uint32 decodedSamplesAllocationSize; 7748 #ifndef DR_FLAC_NO_OGG 7749 drflac_oggbs oggbs; 7750 #endif 7751 drflac_uint64 firstFramePos; 7752 drflac_uint64 seektablePos; 7753 drflac_uint32 seektableSize; 7754 drflac_allocation_callbacks allocationCallbacks; 7755 drflac* pFlac; 7756 7757 /* CPU support first. */ 7758 drflac__init_cpu_caps(); 7759 7760 if (!drflac__init_private(&init, onRead, onSeek, onMeta, container, pUserData, pUserDataMD)) { 7761 return NULL; 7762 } 7763 7764 if (pAllocationCallbacks != NULL) { 7765 allocationCallbacks = *pAllocationCallbacks; 7766 if (allocationCallbacks.onFree == NULL || (allocationCallbacks.onMalloc == NULL && allocationCallbacks.onRealloc == NULL)) { 7767 return NULL; /* Invalid allocation callbacks. */ 7768 } 7769 } else { 7770 allocationCallbacks.pUserData = NULL; 7771 allocationCallbacks.onMalloc = drflac__malloc_default; 7772 allocationCallbacks.onRealloc = drflac__realloc_default; 7773 allocationCallbacks.onFree = drflac__free_default; 7774 } 7775 7776 7777 /* 7778 The size of the allocation for the drflac object needs to be large enough to fit the following: 7779 1) The main members of the drflac structure 7780 2) A block of memory large enough to store the decoded samples of the largest frame in the stream 7781 3) If the container is Ogg, a drflac_oggbs object 7782 7783 The complicated part of the allocation is making sure there's enough room the decoded samples, taking into consideration 7784 the different SIMD instruction sets. 7785 */ 7786 allocationSize = sizeof(drflac); 7787 7788 /* 7789 The allocation size for decoded frames depends on the number of 32-bit integers that fit inside the largest SIMD vector 7790 we are supporting. 7791 */ 7792 if ((init.maxBlockSizeInPCMFrames % (DRFLAC_MAX_SIMD_VECTOR_SIZE / sizeof(drflac_int32))) == 0) { 7793 wholeSIMDVectorCountPerChannel = (init.maxBlockSizeInPCMFrames / (DRFLAC_MAX_SIMD_VECTOR_SIZE / sizeof(drflac_int32))); 7794 } else { 7795 wholeSIMDVectorCountPerChannel = (init.maxBlockSizeInPCMFrames / (DRFLAC_MAX_SIMD_VECTOR_SIZE / sizeof(drflac_int32))) + 1; 7796 } 7797 7798 decodedSamplesAllocationSize = wholeSIMDVectorCountPerChannel * DRFLAC_MAX_SIMD_VECTOR_SIZE * init.channels; 7799 7800 allocationSize += decodedSamplesAllocationSize; 7801 allocationSize += DRFLAC_MAX_SIMD_VECTOR_SIZE; /* Allocate extra bytes to ensure we have enough for alignment. */ 7802 7803 #ifndef DR_FLAC_NO_OGG 7804 /* There's additional data required for Ogg streams. */ 7805 if (init.container == drflac_container_ogg) { 7806 allocationSize += sizeof(drflac_oggbs); 7807 } 7808 7809 DRFLAC_ZERO_MEMORY(&oggbs, sizeof(oggbs)); 7810 if (init.container == drflac_container_ogg) { 7811 oggbs.onRead = onRead; 7812 oggbs.onSeek = onSeek; 7813 oggbs.pUserData = pUserData; 7814 oggbs.currentBytePos = init.oggFirstBytePos; 7815 oggbs.firstBytePos = init.oggFirstBytePos; 7816 oggbs.serialNumber = init.oggSerial; 7817 oggbs.bosPageHeader = init.oggBosHeader; 7818 oggbs.bytesRemainingInPage = 0; 7819 } 7820 #endif 7821 7822 /* 7823 This part is a bit awkward. We need to load the seektable so that it can be referenced in-memory, but I want the drflac object to 7824 consist of only a single heap allocation. To this, the size of the seek table needs to be known, which we determine when reading 7825 and decoding the metadata. 7826 */ 7827 firstFramePos = 42; /* <-- We know we are at byte 42 at this point. */ 7828 seektablePos = 0; 7829 seektableSize = 0; 7830 if (init.hasMetadataBlocks) { 7831 drflac_read_proc onReadOverride = onRead; 7832 drflac_seek_proc onSeekOverride = onSeek; 7833 void* pUserDataOverride = pUserData; 7834 7835 #ifndef DR_FLAC_NO_OGG 7836 if (init.container == drflac_container_ogg) { 7837 onReadOverride = drflac__on_read_ogg; 7838 onSeekOverride = drflac__on_seek_ogg; 7839 pUserDataOverride = (void*)&oggbs; 7840 } 7841 #endif 7842 7843 if (!drflac__read_and_decode_metadata(onReadOverride, onSeekOverride, onMeta, pUserDataOverride, pUserDataMD, &firstFramePos, &seektablePos, &seektableSize, &allocationCallbacks)) { 7844 return NULL; 7845 } 7846 7847 allocationSize += seektableSize; 7848 } 7849 7850 7851 pFlac = (drflac*)drflac__malloc_from_callbacks(allocationSize, &allocationCallbacks); 7852 if (pFlac == NULL) { 7853 return NULL; 7854 } 7855 7856 drflac__init_from_info(pFlac, &init); 7857 pFlac->allocationCallbacks = allocationCallbacks; 7858 pFlac->pDecodedSamples = (drflac_int32*)drflac_align((size_t)pFlac->pExtraData, DRFLAC_MAX_SIMD_VECTOR_SIZE); 7859 7860 #ifndef DR_FLAC_NO_OGG 7861 if (init.container == drflac_container_ogg) { 7862 drflac_oggbs* pInternalOggbs = (drflac_oggbs*)((drflac_uint8*)pFlac->pDecodedSamples + decodedSamplesAllocationSize + seektableSize); 7863 *pInternalOggbs = oggbs; 7864 7865 /* The Ogg bistream needs to be layered on top of the original bitstream. */ 7866 pFlac->bs.onRead = drflac__on_read_ogg; 7867 pFlac->bs.onSeek = drflac__on_seek_ogg; 7868 pFlac->bs.pUserData = (void*)pInternalOggbs; 7869 pFlac->_oggbs = (void*)pInternalOggbs; 7870 } 7871 #endif 7872 7873 pFlac->firstFLACFramePosInBytes = firstFramePos; 7874 7875 /* NOTE: Seektables are not currently compatible with Ogg encapsulation (Ogg has its own accelerated seeking system). I may change this later, so I'm leaving this here for now. */ 7876 #ifndef DR_FLAC_NO_OGG 7877 if (init.container == drflac_container_ogg) 7878 { 7879 pFlac->pSeekpoints = NULL; 7880 pFlac->seekpointCount = 0; 7881 } 7882 else 7883 #endif 7884 { 7885 /* If we have a seektable we need to load it now, making sure we move back to where we were previously. */ 7886 if (seektablePos != 0) { 7887 pFlac->seekpointCount = seektableSize / sizeof(*pFlac->pSeekpoints); 7888 pFlac->pSeekpoints = (drflac_seekpoint*)((drflac_uint8*)pFlac->pDecodedSamples + decodedSamplesAllocationSize); 7889 7890 DRFLAC_ASSERT(pFlac->bs.onSeek != NULL); 7891 DRFLAC_ASSERT(pFlac->bs.onRead != NULL); 7892 7893 /* Seek to the seektable, then just read directly into our seektable buffer. */ 7894 if (pFlac->bs.onSeek(pFlac->bs.pUserData, (int)seektablePos, drflac_seek_origin_start)) { 7895 if (pFlac->bs.onRead(pFlac->bs.pUserData, pFlac->pSeekpoints, seektableSize) == seektableSize) { 7896 /* Endian swap. */ 7897 drflac_uint32 iSeekpoint; 7898 for (iSeekpoint = 0; iSeekpoint < pFlac->seekpointCount; ++iSeekpoint) { 7899 pFlac->pSeekpoints[iSeekpoint].firstPCMFrame = drflac__be2host_64(pFlac->pSeekpoints[iSeekpoint].firstPCMFrame); 7900 pFlac->pSeekpoints[iSeekpoint].flacFrameOffset = drflac__be2host_64(pFlac->pSeekpoints[iSeekpoint].flacFrameOffset); 7901 pFlac->pSeekpoints[iSeekpoint].pcmFrameCount = drflac__be2host_16(pFlac->pSeekpoints[iSeekpoint].pcmFrameCount); 7902 } 7903 } else { 7904 /* Failed to read the seektable. Pretend we don't have one. */ 7905 pFlac->pSeekpoints = NULL; 7906 pFlac->seekpointCount = 0; 7907 } 7908 7909 /* We need to seek back to where we were. If this fails it's a critical error. */ 7910 if (!pFlac->bs.onSeek(pFlac->bs.pUserData, (int)pFlac->firstFLACFramePosInBytes, drflac_seek_origin_start)) { 7911 drflac__free_from_callbacks(pFlac, &allocationCallbacks); 7912 return NULL; 7913 } 7914 } else { 7915 /* Failed to seek to the seektable. Ominous sign, but for now we can just pretend we don't have one. */ 7916 pFlac->pSeekpoints = NULL; 7917 pFlac->seekpointCount = 0; 7918 } 7919 } 7920 } 7921 7922 7923 /* 7924 If we get here, but don't have a STREAMINFO block, it means we've opened the stream in relaxed mode and need to decode 7925 the first frame. 7926 */ 7927 if (!init.hasStreamInfoBlock) { 7928 pFlac->currentFLACFrame.header = init.firstFrameHeader; 7929 for (;;) { 7930 drflac_result result = drflac__decode_flac_frame(pFlac); 7931 if (result == DRFLAC_SUCCESS) { 7932 break; 7933 } else { 7934 if (result == DRFLAC_CRC_MISMATCH) { 7935 if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) { 7936 drflac__free_from_callbacks(pFlac, &allocationCallbacks); 7937 return NULL; 7938 } 7939 continue; 7940 } else { 7941 drflac__free_from_callbacks(pFlac, &allocationCallbacks); 7942 return NULL; 7943 } 7944 } 7945 } 7946 } 7947 7948 return pFlac; 7949 } 7950 7951 7952 7953 #ifndef DR_FLAC_NO_STDIO 7954 #include <stdio.h> 7955 #include <wchar.h> /* For wcslen(), wcsrtombs() */ 7956 7957 /* drflac_result_from_errno() is only used for fopen() and wfopen() so putting it inside DR_WAV_NO_STDIO for now. If something else needs this later we can move it out. */ 7958 #include <errno.h> 7959 static drflac_result drflac_result_from_errno(int e) 7960 { 7961 switch (e) 7962 { 7963 case 0: return DRFLAC_SUCCESS; 7964 #ifdef EPERM 7965 case EPERM: return DRFLAC_INVALID_OPERATION; 7966 #endif 7967 #ifdef ENOENT 7968 case ENOENT: return DRFLAC_DOES_NOT_EXIST; 7969 #endif 7970 #ifdef ESRCH 7971 case ESRCH: return DRFLAC_DOES_NOT_EXIST; 7972 #endif 7973 #ifdef EINTR 7974 case EINTR: return DRFLAC_INTERRUPT; 7975 #endif 7976 #ifdef EIO 7977 case EIO: return DRFLAC_IO_ERROR; 7978 #endif 7979 #ifdef ENXIO 7980 case ENXIO: return DRFLAC_DOES_NOT_EXIST; 7981 #endif 7982 #ifdef E2BIG 7983 case E2BIG: return DRFLAC_INVALID_ARGS; 7984 #endif 7985 #ifdef ENOEXEC 7986 case ENOEXEC: return DRFLAC_INVALID_FILE; 7987 #endif 7988 #ifdef EBADF 7989 case EBADF: return DRFLAC_INVALID_FILE; 7990 #endif 7991 #ifdef ECHILD 7992 case ECHILD: return DRFLAC_ERROR; 7993 #endif 7994 #ifdef EAGAIN 7995 case EAGAIN: return DRFLAC_UNAVAILABLE; 7996 #endif 7997 #ifdef ENOMEM 7998 case ENOMEM: return DRFLAC_OUT_OF_MEMORY; 7999 #endif 8000 #ifdef EACCES 8001 case EACCES: return DRFLAC_ACCESS_DENIED; 8002 #endif 8003 #ifdef EFAULT 8004 case EFAULT: return DRFLAC_BAD_ADDRESS; 8005 #endif 8006 #ifdef ENOTBLK 8007 case ENOTBLK: return DRFLAC_ERROR; 8008 #endif 8009 #ifdef EBUSY 8010 case EBUSY: return DRFLAC_BUSY; 8011 #endif 8012 #ifdef EEXIST 8013 case EEXIST: return DRFLAC_ALREADY_EXISTS; 8014 #endif 8015 #ifdef EXDEV 8016 case EXDEV: return DRFLAC_ERROR; 8017 #endif 8018 #ifdef ENODEV 8019 case ENODEV: return DRFLAC_DOES_NOT_EXIST; 8020 #endif 8021 #ifdef ENOTDIR 8022 case ENOTDIR: return DRFLAC_NOT_DIRECTORY; 8023 #endif 8024 #ifdef EISDIR 8025 case EISDIR: return DRFLAC_IS_DIRECTORY; 8026 #endif 8027 #ifdef EINVAL 8028 case EINVAL: return DRFLAC_INVALID_ARGS; 8029 #endif 8030 #ifdef ENFILE 8031 case ENFILE: return DRFLAC_TOO_MANY_OPEN_FILES; 8032 #endif 8033 #ifdef EMFILE 8034 case EMFILE: return DRFLAC_TOO_MANY_OPEN_FILES; 8035 #endif 8036 #ifdef ENOTTY 8037 case ENOTTY: return DRFLAC_INVALID_OPERATION; 8038 #endif 8039 #ifdef ETXTBSY 8040 case ETXTBSY: return DRFLAC_BUSY; 8041 #endif 8042 #ifdef EFBIG 8043 case EFBIG: return DRFLAC_TOO_BIG; 8044 #endif 8045 #ifdef ENOSPC 8046 case ENOSPC: return DRFLAC_NO_SPACE; 8047 #endif 8048 #ifdef ESPIPE 8049 case ESPIPE: return DRFLAC_BAD_SEEK; 8050 #endif 8051 #ifdef EROFS 8052 case EROFS: return DRFLAC_ACCESS_DENIED; 8053 #endif 8054 #ifdef EMLINK 8055 case EMLINK: return DRFLAC_TOO_MANY_LINKS; 8056 #endif 8057 #ifdef EPIPE 8058 case EPIPE: return DRFLAC_BAD_PIPE; 8059 #endif 8060 #ifdef EDOM 8061 case EDOM: return DRFLAC_OUT_OF_RANGE; 8062 #endif 8063 #ifdef ERANGE 8064 case ERANGE: return DRFLAC_OUT_OF_RANGE; 8065 #endif 8066 #ifdef EDEADLK 8067 case EDEADLK: return DRFLAC_DEADLOCK; 8068 #endif 8069 #ifdef ENAMETOOLONG 8070 case ENAMETOOLONG: return DRFLAC_PATH_TOO_LONG; 8071 #endif 8072 #ifdef ENOLCK 8073 case ENOLCK: return DRFLAC_ERROR; 8074 #endif 8075 #ifdef ENOSYS 8076 case ENOSYS: return DRFLAC_NOT_IMPLEMENTED; 8077 #endif 8078 #ifdef ENOTEMPTY 8079 case ENOTEMPTY: return DRFLAC_DIRECTORY_NOT_EMPTY; 8080 #endif 8081 #ifdef ELOOP 8082 case ELOOP: return DRFLAC_TOO_MANY_LINKS; 8083 #endif 8084 #ifdef ENOMSG 8085 case ENOMSG: return DRFLAC_NO_MESSAGE; 8086 #endif 8087 #ifdef EIDRM 8088 case EIDRM: return DRFLAC_ERROR; 8089 #endif 8090 #ifdef ECHRNG 8091 case ECHRNG: return DRFLAC_ERROR; 8092 #endif 8093 #ifdef EL2NSYNC 8094 case EL2NSYNC: return DRFLAC_ERROR; 8095 #endif 8096 #ifdef EL3HLT 8097 case EL3HLT: return DRFLAC_ERROR; 8098 #endif 8099 #ifdef EL3RST 8100 case EL3RST: return DRFLAC_ERROR; 8101 #endif 8102 #ifdef ELNRNG 8103 case ELNRNG: return DRFLAC_OUT_OF_RANGE; 8104 #endif 8105 #ifdef EUNATCH 8106 case EUNATCH: return DRFLAC_ERROR; 8107 #endif 8108 #ifdef ENOCSI 8109 case ENOCSI: return DRFLAC_ERROR; 8110 #endif 8111 #ifdef EL2HLT 8112 case EL2HLT: return DRFLAC_ERROR; 8113 #endif 8114 #ifdef EBADE 8115 case EBADE: return DRFLAC_ERROR; 8116 #endif 8117 #ifdef EBADR 8118 case EBADR: return DRFLAC_ERROR; 8119 #endif 8120 #ifdef EXFULL 8121 case EXFULL: return DRFLAC_ERROR; 8122 #endif 8123 #ifdef ENOANO 8124 case ENOANO: return DRFLAC_ERROR; 8125 #endif 8126 #ifdef EBADRQC 8127 case EBADRQC: return DRFLAC_ERROR; 8128 #endif 8129 #ifdef EBADSLT 8130 case EBADSLT: return DRFLAC_ERROR; 8131 #endif 8132 #ifdef EBFONT 8133 case EBFONT: return DRFLAC_INVALID_FILE; 8134 #endif 8135 #ifdef ENOSTR 8136 case ENOSTR: return DRFLAC_ERROR; 8137 #endif 8138 #ifdef ENODATA 8139 case ENODATA: return DRFLAC_NO_DATA_AVAILABLE; 8140 #endif 8141 #ifdef ETIME 8142 case ETIME: return DRFLAC_TIMEOUT; 8143 #endif 8144 #ifdef ENOSR 8145 case ENOSR: return DRFLAC_NO_DATA_AVAILABLE; 8146 #endif 8147 #ifdef ENONET 8148 case ENONET: return DRFLAC_NO_NETWORK; 8149 #endif 8150 #ifdef ENOPKG 8151 case ENOPKG: return DRFLAC_ERROR; 8152 #endif 8153 #ifdef EREMOTE 8154 case EREMOTE: return DRFLAC_ERROR; 8155 #endif 8156 #ifdef ENOLINK 8157 case ENOLINK: return DRFLAC_ERROR; 8158 #endif 8159 #ifdef EADV 8160 case EADV: return DRFLAC_ERROR; 8161 #endif 8162 #ifdef ESRMNT 8163 case ESRMNT: return DRFLAC_ERROR; 8164 #endif 8165 #ifdef ECOMM 8166 case ECOMM: return DRFLAC_ERROR; 8167 #endif 8168 #ifdef EPROTO 8169 case EPROTO: return DRFLAC_ERROR; 8170 #endif 8171 #ifdef EMULTIHOP 8172 case EMULTIHOP: return DRFLAC_ERROR; 8173 #endif 8174 #ifdef EDOTDOT 8175 case EDOTDOT: return DRFLAC_ERROR; 8176 #endif 8177 #ifdef EBADMSG 8178 case EBADMSG: return DRFLAC_BAD_MESSAGE; 8179 #endif 8180 #ifdef EOVERFLOW 8181 case EOVERFLOW: return DRFLAC_TOO_BIG; 8182 #endif 8183 #ifdef ENOTUNIQ 8184 case ENOTUNIQ: return DRFLAC_NOT_UNIQUE; 8185 #endif 8186 #ifdef EBADFD 8187 case EBADFD: return DRFLAC_ERROR; 8188 #endif 8189 #ifdef EREMCHG 8190 case EREMCHG: return DRFLAC_ERROR; 8191 #endif 8192 #ifdef ELIBACC 8193 case ELIBACC: return DRFLAC_ACCESS_DENIED; 8194 #endif 8195 #ifdef ELIBBAD 8196 case ELIBBAD: return DRFLAC_INVALID_FILE; 8197 #endif 8198 #ifdef ELIBSCN 8199 case ELIBSCN: return DRFLAC_INVALID_FILE; 8200 #endif 8201 #ifdef ELIBMAX 8202 case ELIBMAX: return DRFLAC_ERROR; 8203 #endif 8204 #ifdef ELIBEXEC 8205 case ELIBEXEC: return DRFLAC_ERROR; 8206 #endif 8207 #ifdef EILSEQ 8208 case EILSEQ: return DRFLAC_INVALID_DATA; 8209 #endif 8210 #ifdef ERESTART 8211 case ERESTART: return DRFLAC_ERROR; 8212 #endif 8213 #ifdef ESTRPIPE 8214 case ESTRPIPE: return DRFLAC_ERROR; 8215 #endif 8216 #ifdef EUSERS 8217 case EUSERS: return DRFLAC_ERROR; 8218 #endif 8219 #ifdef ENOTSOCK 8220 case ENOTSOCK: return DRFLAC_NOT_SOCKET; 8221 #endif 8222 #ifdef EDESTADDRREQ 8223 case EDESTADDRREQ: return DRFLAC_NO_ADDRESS; 8224 #endif 8225 #ifdef EMSGSIZE 8226 case EMSGSIZE: return DRFLAC_TOO_BIG; 8227 #endif 8228 #ifdef EPROTOTYPE 8229 case EPROTOTYPE: return DRFLAC_BAD_PROTOCOL; 8230 #endif 8231 #ifdef ENOPROTOOPT 8232 case ENOPROTOOPT: return DRFLAC_PROTOCOL_UNAVAILABLE; 8233 #endif 8234 #ifdef EPROTONOSUPPORT 8235 case EPROTONOSUPPORT: return DRFLAC_PROTOCOL_NOT_SUPPORTED; 8236 #endif 8237 #ifdef ESOCKTNOSUPPORT 8238 case ESOCKTNOSUPPORT: return DRFLAC_SOCKET_NOT_SUPPORTED; 8239 #endif 8240 #ifdef EOPNOTSUPP 8241 case EOPNOTSUPP: return DRFLAC_INVALID_OPERATION; 8242 #endif 8243 #ifdef EPFNOSUPPORT 8244 case EPFNOSUPPORT: return DRFLAC_PROTOCOL_FAMILY_NOT_SUPPORTED; 8245 #endif 8246 #ifdef EAFNOSUPPORT 8247 case EAFNOSUPPORT: return DRFLAC_ADDRESS_FAMILY_NOT_SUPPORTED; 8248 #endif 8249 #ifdef EADDRINUSE 8250 case EADDRINUSE: return DRFLAC_ALREADY_IN_USE; 8251 #endif 8252 #ifdef EADDRNOTAVAIL 8253 case EADDRNOTAVAIL: return DRFLAC_ERROR; 8254 #endif 8255 #ifdef ENETDOWN 8256 case ENETDOWN: return DRFLAC_NO_NETWORK; 8257 #endif 8258 #ifdef ENETUNREACH 8259 case ENETUNREACH: return DRFLAC_NO_NETWORK; 8260 #endif 8261 #ifdef ENETRESET 8262 case ENETRESET: return DRFLAC_NO_NETWORK; 8263 #endif 8264 #ifdef ECONNABORTED 8265 case ECONNABORTED: return DRFLAC_NO_NETWORK; 8266 #endif 8267 #ifdef ECONNRESET 8268 case ECONNRESET: return DRFLAC_CONNECTION_RESET; 8269 #endif 8270 #ifdef ENOBUFS 8271 case ENOBUFS: return DRFLAC_NO_SPACE; 8272 #endif 8273 #ifdef EISCONN 8274 case EISCONN: return DRFLAC_ALREADY_CONNECTED; 8275 #endif 8276 #ifdef ENOTCONN 8277 case ENOTCONN: return DRFLAC_NOT_CONNECTED; 8278 #endif 8279 #ifdef ESHUTDOWN 8280 case ESHUTDOWN: return DRFLAC_ERROR; 8281 #endif 8282 #ifdef ETOOMANYREFS 8283 case ETOOMANYREFS: return DRFLAC_ERROR; 8284 #endif 8285 #ifdef ETIMEDOUT 8286 case ETIMEDOUT: return DRFLAC_TIMEOUT; 8287 #endif 8288 #ifdef ECONNREFUSED 8289 case ECONNREFUSED: return DRFLAC_CONNECTION_REFUSED; 8290 #endif 8291 #ifdef EHOSTDOWN 8292 case EHOSTDOWN: return DRFLAC_NO_HOST; 8293 #endif 8294 #ifdef EHOSTUNREACH 8295 case EHOSTUNREACH: return DRFLAC_NO_HOST; 8296 #endif 8297 #ifdef EALREADY 8298 case EALREADY: return DRFLAC_IN_PROGRESS; 8299 #endif 8300 #ifdef EINPROGRESS 8301 case EINPROGRESS: return DRFLAC_IN_PROGRESS; 8302 #endif 8303 #ifdef ESTALE 8304 case ESTALE: return DRFLAC_INVALID_FILE; 8305 #endif 8306 #ifdef EUCLEAN 8307 case EUCLEAN: return DRFLAC_ERROR; 8308 #endif 8309 #ifdef ENOTNAM 8310 case ENOTNAM: return DRFLAC_ERROR; 8311 #endif 8312 #ifdef ENAVAIL 8313 case ENAVAIL: return DRFLAC_ERROR; 8314 #endif 8315 #ifdef EISNAM 8316 case EISNAM: return DRFLAC_ERROR; 8317 #endif 8318 #ifdef EREMOTEIO 8319 case EREMOTEIO: return DRFLAC_IO_ERROR; 8320 #endif 8321 #ifdef EDQUOT 8322 case EDQUOT: return DRFLAC_NO_SPACE; 8323 #endif 8324 #ifdef ENOMEDIUM 8325 case ENOMEDIUM: return DRFLAC_DOES_NOT_EXIST; 8326 #endif 8327 #ifdef EMEDIUMTYPE 8328 case EMEDIUMTYPE: return DRFLAC_ERROR; 8329 #endif 8330 #ifdef ECANCELED 8331 case ECANCELED: return DRFLAC_CANCELLED; 8332 #endif 8333 #ifdef ENOKEY 8334 case ENOKEY: return DRFLAC_ERROR; 8335 #endif 8336 #ifdef EKEYEXPIRED 8337 case EKEYEXPIRED: return DRFLAC_ERROR; 8338 #endif 8339 #ifdef EKEYREVOKED 8340 case EKEYREVOKED: return DRFLAC_ERROR; 8341 #endif 8342 #ifdef EKEYREJECTED 8343 case EKEYREJECTED: return DRFLAC_ERROR; 8344 #endif 8345 #ifdef EOWNERDEAD 8346 case EOWNERDEAD: return DRFLAC_ERROR; 8347 #endif 8348 #ifdef ENOTRECOVERABLE 8349 case ENOTRECOVERABLE: return DRFLAC_ERROR; 8350 #endif 8351 #ifdef ERFKILL 8352 case ERFKILL: return DRFLAC_ERROR; 8353 #endif 8354 #ifdef EHWPOISON 8355 case EHWPOISON: return DRFLAC_ERROR; 8356 #endif 8357 default: return DRFLAC_ERROR; 8358 } 8359 } 8360 8361 static drflac_result drflac_fopen(FILE** ppFile, const char* pFilePath, const char* pOpenMode) 8362 { 8363 #if defined(_MSC_VER) && _MSC_VER >= 1400 8364 errno_t err; 8365 #endif 8366 8367 if (ppFile != NULL) { 8368 *ppFile = NULL; /* Safety. */ 8369 } 8370 8371 if (pFilePath == NULL || pOpenMode == NULL || ppFile == NULL) { 8372 return DRFLAC_INVALID_ARGS; 8373 } 8374 8375 #if defined(_MSC_VER) && _MSC_VER >= 1400 8376 err = fopen_s(ppFile, pFilePath, pOpenMode); 8377 if (err != 0) { 8378 return drflac_result_from_errno(err); 8379 } 8380 #else 8381 #if defined(_WIN32) || defined(__APPLE__) 8382 *ppFile = fopen(pFilePath, pOpenMode); 8383 #else 8384 #if defined(_FILE_OFFSET_BITS) && _FILE_OFFSET_BITS == 64 && defined(_LARGEFILE64_SOURCE) 8385 *ppFile = fopen64(pFilePath, pOpenMode); 8386 #else 8387 *ppFile = fopen(pFilePath, pOpenMode); 8388 #endif 8389 #endif 8390 if (*ppFile == NULL) { 8391 drflac_result result = drflac_result_from_errno(errno); 8392 if (result == DRFLAC_SUCCESS) { 8393 result = DRFLAC_ERROR; /* Just a safety check to make sure we never ever return success when pFile == NULL. */ 8394 } 8395 8396 return result; 8397 } 8398 #endif 8399 8400 return DRFLAC_SUCCESS; 8401 } 8402 8403 /* 8404 _wfopen() isn't always available in all compilation environments. 8405 8406 * Windows only. 8407 * MSVC seems to support it universally as far back as VC6 from what I can tell (haven't checked further back). 8408 * MinGW-64 (both 32- and 64-bit) seems to support it. 8409 * MinGW wraps it in !defined(__STRICT_ANSI__). 8410 * OpenWatcom wraps it in !defined(_NO_EXT_KEYS). 8411 8412 This can be reviewed as compatibility issues arise. The preference is to use _wfopen_s() and _wfopen() as opposed to the wcsrtombs() 8413 fallback, so if you notice your compiler not detecting this properly I'm happy to look at adding support. 8414 */ 8415 #if defined(_WIN32) 8416 #if defined(_MSC_VER) || defined(__MINGW64__) || (!defined(__STRICT_ANSI__) && !defined(_NO_EXT_KEYS)) 8417 #define DRFLAC_HAS_WFOPEN 8418 #endif 8419 #endif 8420 8421 static drflac_result drflac_wfopen(FILE** ppFile, const wchar_t* pFilePath, const wchar_t* pOpenMode, const drflac_allocation_callbacks* pAllocationCallbacks) 8422 { 8423 if (ppFile != NULL) { 8424 *ppFile = NULL; /* Safety. */ 8425 } 8426 8427 if (pFilePath == NULL || pOpenMode == NULL || ppFile == NULL) { 8428 return DRFLAC_INVALID_ARGS; 8429 } 8430 8431 #if defined(DRFLAC_HAS_WFOPEN) 8432 { 8433 /* Use _wfopen() on Windows. */ 8434 #if defined(_MSC_VER) && _MSC_VER >= 1400 8435 errno_t err = _wfopen_s(ppFile, pFilePath, pOpenMode); 8436 if (err != 0) { 8437 return drflac_result_from_errno(err); 8438 } 8439 #else 8440 *ppFile = _wfopen(pFilePath, pOpenMode); 8441 if (*ppFile == NULL) { 8442 return drflac_result_from_errno(errno); 8443 } 8444 #endif 8445 (void)pAllocationCallbacks; 8446 } 8447 #else 8448 /* 8449 Use fopen() on anything other than Windows. Requires a conversion. This is annoying because fopen() is locale specific. The only real way I can 8450 think of to do this is with wcsrtombs(). Note that wcstombs() is apparently not thread-safe because it uses a static global mbstate_t object for 8451 maintaining state. I've checked this with -std=c89 and it works, but if somebody get's a compiler error I'll look into improving compatibility. 8452 */ 8453 { 8454 mbstate_t mbs; 8455 size_t lenMB; 8456 const wchar_t* pFilePathTemp = pFilePath; 8457 char* pFilePathMB = NULL; 8458 char pOpenModeMB[32] = {0}; 8459 8460 /* Get the length first. */ 8461 DRFLAC_ZERO_OBJECT(&mbs); 8462 lenMB = wcsrtombs(NULL, &pFilePathTemp, 0, &mbs); 8463 if (lenMB == (size_t)-1) { 8464 return drflac_result_from_errno(errno); 8465 } 8466 8467 pFilePathMB = (char*)drflac__malloc_from_callbacks(lenMB + 1, pAllocationCallbacks); 8468 if (pFilePathMB == NULL) { 8469 return DRFLAC_OUT_OF_MEMORY; 8470 } 8471 8472 pFilePathTemp = pFilePath; 8473 DRFLAC_ZERO_OBJECT(&mbs); 8474 wcsrtombs(pFilePathMB, &pFilePathTemp, lenMB + 1, &mbs); 8475 8476 /* The open mode should always consist of ASCII characters so we should be able to do a trivial conversion. */ 8477 { 8478 size_t i = 0; 8479 for (;;) { 8480 if (pOpenMode[i] == 0) { 8481 pOpenModeMB[i] = '\0'; 8482 break; 8483 } 8484 8485 pOpenModeMB[i] = (char)pOpenMode[i]; 8486 i += 1; 8487 } 8488 } 8489 8490 *ppFile = fopen(pFilePathMB, pOpenModeMB); 8491 8492 drflac__free_from_callbacks(pFilePathMB, pAllocationCallbacks); 8493 } 8494 8495 if (*ppFile == NULL) { 8496 return DRFLAC_ERROR; 8497 } 8498 #endif 8499 8500 return DRFLAC_SUCCESS; 8501 } 8502 8503 static size_t drflac__on_read_stdio(void* pUserData, void* bufferOut, size_t bytesToRead) 8504 { 8505 return fread(bufferOut, 1, bytesToRead, (FILE*)pUserData); 8506 } 8507 8508 static drflac_bool32 drflac__on_seek_stdio(void* pUserData, int offset, drflac_seek_origin origin) 8509 { 8510 DRFLAC_ASSERT(offset >= 0); /* <-- Never seek backwards. */ 8511 8512 return fseek((FILE*)pUserData, offset, (origin == drflac_seek_origin_current) ? SEEK_CUR : SEEK_SET) == 0; 8513 } 8514 8515 8516 DRFLAC_API drflac* drflac_open_file(const char* pFileName, const drflac_allocation_callbacks* pAllocationCallbacks) 8517 { 8518 drflac* pFlac; 8519 FILE* pFile; 8520 8521 if (drflac_fopen(&pFile, pFileName, "rb") != DRFLAC_SUCCESS) { 8522 return NULL; 8523 } 8524 8525 pFlac = drflac_open(drflac__on_read_stdio, drflac__on_seek_stdio, (void*)pFile, pAllocationCallbacks); 8526 if (pFlac == NULL) { 8527 fclose(pFile); 8528 return NULL; 8529 } 8530 8531 return pFlac; 8532 } 8533 8534 DRFLAC_API drflac* drflac_open_file_w(const wchar_t* pFileName, const drflac_allocation_callbacks* pAllocationCallbacks) 8535 { 8536 drflac* pFlac; 8537 FILE* pFile; 8538 8539 if (drflac_wfopen(&pFile, pFileName, L"rb", pAllocationCallbacks) != DRFLAC_SUCCESS) { 8540 return NULL; 8541 } 8542 8543 pFlac = drflac_open(drflac__on_read_stdio, drflac__on_seek_stdio, (void*)pFile, pAllocationCallbacks); 8544 if (pFlac == NULL) { 8545 fclose(pFile); 8546 return NULL; 8547 } 8548 8549 return pFlac; 8550 } 8551 8552 DRFLAC_API drflac* drflac_open_file_with_metadata(const char* pFileName, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks) 8553 { 8554 drflac* pFlac; 8555 FILE* pFile; 8556 8557 if (drflac_fopen(&pFile, pFileName, "rb") != DRFLAC_SUCCESS) { 8558 return NULL; 8559 } 8560 8561 pFlac = drflac_open_with_metadata_private(drflac__on_read_stdio, drflac__on_seek_stdio, onMeta, drflac_container_unknown, (void*)pFile, pUserData, pAllocationCallbacks); 8562 if (pFlac == NULL) { 8563 fclose(pFile); 8564 return pFlac; 8565 } 8566 8567 return pFlac; 8568 } 8569 8570 DRFLAC_API drflac* drflac_open_file_with_metadata_w(const wchar_t* pFileName, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks) 8571 { 8572 drflac* pFlac; 8573 FILE* pFile; 8574 8575 if (drflac_wfopen(&pFile, pFileName, L"rb", pAllocationCallbacks) != DRFLAC_SUCCESS) { 8576 return NULL; 8577 } 8578 8579 pFlac = drflac_open_with_metadata_private(drflac__on_read_stdio, drflac__on_seek_stdio, onMeta, drflac_container_unknown, (void*)pFile, pUserData, pAllocationCallbacks); 8580 if (pFlac == NULL) { 8581 fclose(pFile); 8582 return pFlac; 8583 } 8584 8585 return pFlac; 8586 } 8587 #endif /* DR_FLAC_NO_STDIO */ 8588 8589 static size_t drflac__on_read_memory(void* pUserData, void* bufferOut, size_t bytesToRead) 8590 { 8591 drflac__memory_stream* memoryStream = (drflac__memory_stream*)pUserData; 8592 size_t bytesRemaining; 8593 8594 DRFLAC_ASSERT(memoryStream != NULL); 8595 DRFLAC_ASSERT(memoryStream->dataSize >= memoryStream->currentReadPos); 8596 8597 bytesRemaining = memoryStream->dataSize - memoryStream->currentReadPos; 8598 if (bytesToRead > bytesRemaining) { 8599 bytesToRead = bytesRemaining; 8600 } 8601 8602 if (bytesToRead > 0) { 8603 DRFLAC_COPY_MEMORY(bufferOut, memoryStream->data + memoryStream->currentReadPos, bytesToRead); 8604 memoryStream->currentReadPos += bytesToRead; 8605 } 8606 8607 return bytesToRead; 8608 } 8609 8610 static drflac_bool32 drflac__on_seek_memory(void* pUserData, int offset, drflac_seek_origin origin) 8611 { 8612 drflac__memory_stream* memoryStream = (drflac__memory_stream*)pUserData; 8613 8614 DRFLAC_ASSERT(memoryStream != NULL); 8615 DRFLAC_ASSERT(offset >= 0); /* <-- Never seek backwards. */ 8616 8617 if (offset > (drflac_int64)memoryStream->dataSize) { 8618 return DRFLAC_FALSE; 8619 } 8620 8621 if (origin == drflac_seek_origin_current) { 8622 if (memoryStream->currentReadPos + offset <= memoryStream->dataSize) { 8623 memoryStream->currentReadPos += offset; 8624 } else { 8625 return DRFLAC_FALSE; /* Trying to seek too far forward. */ 8626 } 8627 } else { 8628 if ((drflac_uint32)offset <= memoryStream->dataSize) { 8629 memoryStream->currentReadPos = offset; 8630 } else { 8631 return DRFLAC_FALSE; /* Trying to seek too far forward. */ 8632 } 8633 } 8634 8635 return DRFLAC_TRUE; 8636 } 8637 8638 DRFLAC_API drflac* drflac_open_memory(const void* pData, size_t dataSize, const drflac_allocation_callbacks* pAllocationCallbacks) 8639 { 8640 drflac__memory_stream memoryStream; 8641 drflac* pFlac; 8642 8643 memoryStream.data = (const drflac_uint8*)pData; 8644 memoryStream.dataSize = dataSize; 8645 memoryStream.currentReadPos = 0; 8646 pFlac = drflac_open(drflac__on_read_memory, drflac__on_seek_memory, &memoryStream, pAllocationCallbacks); 8647 if (pFlac == NULL) { 8648 return NULL; 8649 } 8650 8651 pFlac->memoryStream = memoryStream; 8652 8653 /* This is an awful hack... */ 8654 #ifndef DR_FLAC_NO_OGG 8655 if (pFlac->container == drflac_container_ogg) 8656 { 8657 drflac_oggbs* oggbs = (drflac_oggbs*)pFlac->_oggbs; 8658 oggbs->pUserData = &pFlac->memoryStream; 8659 } 8660 else 8661 #endif 8662 { 8663 pFlac->bs.pUserData = &pFlac->memoryStream; 8664 } 8665 8666 return pFlac; 8667 } 8668 8669 DRFLAC_API drflac* drflac_open_memory_with_metadata(const void* pData, size_t dataSize, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks) 8670 { 8671 drflac__memory_stream memoryStream; 8672 drflac* pFlac; 8673 8674 memoryStream.data = (const drflac_uint8*)pData; 8675 memoryStream.dataSize = dataSize; 8676 memoryStream.currentReadPos = 0; 8677 pFlac = drflac_open_with_metadata_private(drflac__on_read_memory, drflac__on_seek_memory, onMeta, drflac_container_unknown, &memoryStream, pUserData, pAllocationCallbacks); 8678 if (pFlac == NULL) { 8679 return NULL; 8680 } 8681 8682 pFlac->memoryStream = memoryStream; 8683 8684 /* This is an awful hack... */ 8685 #ifndef DR_FLAC_NO_OGG 8686 if (pFlac->container == drflac_container_ogg) 8687 { 8688 drflac_oggbs* oggbs = (drflac_oggbs*)pFlac->_oggbs; 8689 oggbs->pUserData = &pFlac->memoryStream; 8690 } 8691 else 8692 #endif 8693 { 8694 pFlac->bs.pUserData = &pFlac->memoryStream; 8695 } 8696 8697 return pFlac; 8698 } 8699 8700 8701 8702 DRFLAC_API drflac* drflac_open(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks) 8703 { 8704 return drflac_open_with_metadata_private(onRead, onSeek, NULL, drflac_container_unknown, pUserData, pUserData, pAllocationCallbacks); 8705 } 8706 DRFLAC_API drflac* drflac_open_relaxed(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_container container, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks) 8707 { 8708 return drflac_open_with_metadata_private(onRead, onSeek, NULL, container, pUserData, pUserData, pAllocationCallbacks); 8709 } 8710 8711 DRFLAC_API drflac* drflac_open_with_metadata(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks) 8712 { 8713 return drflac_open_with_metadata_private(onRead, onSeek, onMeta, drflac_container_unknown, pUserData, pUserData, pAllocationCallbacks); 8714 } 8715 DRFLAC_API drflac* drflac_open_with_metadata_relaxed(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, drflac_container container, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks) 8716 { 8717 return drflac_open_with_metadata_private(onRead, onSeek, onMeta, container, pUserData, pUserData, pAllocationCallbacks); 8718 } 8719 8720 DRFLAC_API void drflac_close(drflac* pFlac) 8721 { 8722 if (pFlac == NULL) { 8723 return; 8724 } 8725 8726 #ifndef DR_FLAC_NO_STDIO 8727 /* 8728 If we opened the file with drflac_open_file() we will want to close the file handle. We can know whether or not drflac_open_file() 8729 was used by looking at the callbacks. 8730 */ 8731 if (pFlac->bs.onRead == drflac__on_read_stdio) { 8732 fclose((FILE*)pFlac->bs.pUserData); 8733 } 8734 8735 #ifndef DR_FLAC_NO_OGG 8736 /* Need to clean up Ogg streams a bit differently due to the way the bit streaming is chained. */ 8737 if (pFlac->container == drflac_container_ogg) { 8738 drflac_oggbs* oggbs = (drflac_oggbs*)pFlac->_oggbs; 8739 DRFLAC_ASSERT(pFlac->bs.onRead == drflac__on_read_ogg); 8740 8741 if (oggbs->onRead == drflac__on_read_stdio) { 8742 fclose((FILE*)oggbs->pUserData); 8743 } 8744 } 8745 #endif 8746 #endif 8747 8748 drflac__free_from_callbacks(pFlac, &pFlac->allocationCallbacks); 8749 } 8750 8751 8752 #if 0 8753 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_left_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 8754 { 8755 drflac_uint64 i; 8756 for (i = 0; i < frameCount; ++i) { 8757 drflac_uint32 left = (drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 8758 drflac_uint32 side = (drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 8759 drflac_uint32 right = left - side; 8760 8761 pOutputSamples[i*2+0] = (drflac_int32)left; 8762 pOutputSamples[i*2+1] = (drflac_int32)right; 8763 } 8764 } 8765 #endif 8766 8767 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_left_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 8768 { 8769 drflac_uint64 i; 8770 drflac_uint64 frameCount4 = frameCount >> 2; 8771 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 8772 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 8773 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 8774 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 8775 8776 for (i = 0; i < frameCount4; ++i) { 8777 drflac_uint32 left0 = pInputSamples0U32[i*4+0] << shift0; 8778 drflac_uint32 left1 = pInputSamples0U32[i*4+1] << shift0; 8779 drflac_uint32 left2 = pInputSamples0U32[i*4+2] << shift0; 8780 drflac_uint32 left3 = pInputSamples0U32[i*4+3] << shift0; 8781 8782 drflac_uint32 side0 = pInputSamples1U32[i*4+0] << shift1; 8783 drflac_uint32 side1 = pInputSamples1U32[i*4+1] << shift1; 8784 drflac_uint32 side2 = pInputSamples1U32[i*4+2] << shift1; 8785 drflac_uint32 side3 = pInputSamples1U32[i*4+3] << shift1; 8786 8787 drflac_uint32 right0 = left0 - side0; 8788 drflac_uint32 right1 = left1 - side1; 8789 drflac_uint32 right2 = left2 - side2; 8790 drflac_uint32 right3 = left3 - side3; 8791 8792 pOutputSamples[i*8+0] = (drflac_int32)left0; 8793 pOutputSamples[i*8+1] = (drflac_int32)right0; 8794 pOutputSamples[i*8+2] = (drflac_int32)left1; 8795 pOutputSamples[i*8+3] = (drflac_int32)right1; 8796 pOutputSamples[i*8+4] = (drflac_int32)left2; 8797 pOutputSamples[i*8+5] = (drflac_int32)right2; 8798 pOutputSamples[i*8+6] = (drflac_int32)left3; 8799 pOutputSamples[i*8+7] = (drflac_int32)right3; 8800 } 8801 8802 for (i = (frameCount4 << 2); i < frameCount; ++i) { 8803 drflac_uint32 left = pInputSamples0U32[i] << shift0; 8804 drflac_uint32 side = pInputSamples1U32[i] << shift1; 8805 drflac_uint32 right = left - side; 8806 8807 pOutputSamples[i*2+0] = (drflac_int32)left; 8808 pOutputSamples[i*2+1] = (drflac_int32)right; 8809 } 8810 } 8811 8812 #if defined(DRFLAC_SUPPORT_SSE2) 8813 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_left_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 8814 { 8815 drflac_uint64 i; 8816 drflac_uint64 frameCount4 = frameCount >> 2; 8817 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 8818 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 8819 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 8820 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 8821 8822 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 8823 8824 for (i = 0; i < frameCount4; ++i) { 8825 __m128i left = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0); 8826 __m128i side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1); 8827 __m128i right = _mm_sub_epi32(left, side); 8828 8829 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 0), _mm_unpacklo_epi32(left, right)); 8830 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 4), _mm_unpackhi_epi32(left, right)); 8831 } 8832 8833 for (i = (frameCount4 << 2); i < frameCount; ++i) { 8834 drflac_uint32 left = pInputSamples0U32[i] << shift0; 8835 drflac_uint32 side = pInputSamples1U32[i] << shift1; 8836 drflac_uint32 right = left - side; 8837 8838 pOutputSamples[i*2+0] = (drflac_int32)left; 8839 pOutputSamples[i*2+1] = (drflac_int32)right; 8840 } 8841 } 8842 #endif 8843 8844 #if defined(DRFLAC_SUPPORT_NEON) 8845 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_left_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 8846 { 8847 drflac_uint64 i; 8848 drflac_uint64 frameCount4 = frameCount >> 2; 8849 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 8850 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 8851 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 8852 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 8853 int32x4_t shift0_4; 8854 int32x4_t shift1_4; 8855 8856 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 8857 8858 shift0_4 = vdupq_n_s32(shift0); 8859 shift1_4 = vdupq_n_s32(shift1); 8860 8861 for (i = 0; i < frameCount4; ++i) { 8862 uint32x4_t left; 8863 uint32x4_t side; 8864 uint32x4_t right; 8865 8866 left = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4); 8867 side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4); 8868 right = vsubq_u32(left, side); 8869 8870 drflac__vst2q_u32((drflac_uint32*)pOutputSamples + i*8, vzipq_u32(left, right)); 8871 } 8872 8873 for (i = (frameCount4 << 2); i < frameCount; ++i) { 8874 drflac_uint32 left = pInputSamples0U32[i] << shift0; 8875 drflac_uint32 side = pInputSamples1U32[i] << shift1; 8876 drflac_uint32 right = left - side; 8877 8878 pOutputSamples[i*2+0] = (drflac_int32)left; 8879 pOutputSamples[i*2+1] = (drflac_int32)right; 8880 } 8881 } 8882 #endif 8883 8884 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_left_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 8885 { 8886 #if defined(DRFLAC_SUPPORT_SSE2) 8887 if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) { 8888 drflac_read_pcm_frames_s32__decode_left_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 8889 } else 8890 #elif defined(DRFLAC_SUPPORT_NEON) 8891 if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) { 8892 drflac_read_pcm_frames_s32__decode_left_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 8893 } else 8894 #endif 8895 { 8896 /* Scalar fallback. */ 8897 #if 0 8898 drflac_read_pcm_frames_s32__decode_left_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 8899 #else 8900 drflac_read_pcm_frames_s32__decode_left_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 8901 #endif 8902 } 8903 } 8904 8905 8906 #if 0 8907 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_right_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 8908 { 8909 drflac_uint64 i; 8910 for (i = 0; i < frameCount; ++i) { 8911 drflac_uint32 side = (drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 8912 drflac_uint32 right = (drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 8913 drflac_uint32 left = right + side; 8914 8915 pOutputSamples[i*2+0] = (drflac_int32)left; 8916 pOutputSamples[i*2+1] = (drflac_int32)right; 8917 } 8918 } 8919 #endif 8920 8921 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_right_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 8922 { 8923 drflac_uint64 i; 8924 drflac_uint64 frameCount4 = frameCount >> 2; 8925 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 8926 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 8927 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 8928 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 8929 8930 for (i = 0; i < frameCount4; ++i) { 8931 drflac_uint32 side0 = pInputSamples0U32[i*4+0] << shift0; 8932 drflac_uint32 side1 = pInputSamples0U32[i*4+1] << shift0; 8933 drflac_uint32 side2 = pInputSamples0U32[i*4+2] << shift0; 8934 drflac_uint32 side3 = pInputSamples0U32[i*4+3] << shift0; 8935 8936 drflac_uint32 right0 = pInputSamples1U32[i*4+0] << shift1; 8937 drflac_uint32 right1 = pInputSamples1U32[i*4+1] << shift1; 8938 drflac_uint32 right2 = pInputSamples1U32[i*4+2] << shift1; 8939 drflac_uint32 right3 = pInputSamples1U32[i*4+3] << shift1; 8940 8941 drflac_uint32 left0 = right0 + side0; 8942 drflac_uint32 left1 = right1 + side1; 8943 drflac_uint32 left2 = right2 + side2; 8944 drflac_uint32 left3 = right3 + side3; 8945 8946 pOutputSamples[i*8+0] = (drflac_int32)left0; 8947 pOutputSamples[i*8+1] = (drflac_int32)right0; 8948 pOutputSamples[i*8+2] = (drflac_int32)left1; 8949 pOutputSamples[i*8+3] = (drflac_int32)right1; 8950 pOutputSamples[i*8+4] = (drflac_int32)left2; 8951 pOutputSamples[i*8+5] = (drflac_int32)right2; 8952 pOutputSamples[i*8+6] = (drflac_int32)left3; 8953 pOutputSamples[i*8+7] = (drflac_int32)right3; 8954 } 8955 8956 for (i = (frameCount4 << 2); i < frameCount; ++i) { 8957 drflac_uint32 side = pInputSamples0U32[i] << shift0; 8958 drflac_uint32 right = pInputSamples1U32[i] << shift1; 8959 drflac_uint32 left = right + side; 8960 8961 pOutputSamples[i*2+0] = (drflac_int32)left; 8962 pOutputSamples[i*2+1] = (drflac_int32)right; 8963 } 8964 } 8965 8966 #if defined(DRFLAC_SUPPORT_SSE2) 8967 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_right_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 8968 { 8969 drflac_uint64 i; 8970 drflac_uint64 frameCount4 = frameCount >> 2; 8971 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 8972 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 8973 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 8974 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 8975 8976 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 8977 8978 for (i = 0; i < frameCount4; ++i) { 8979 __m128i side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0); 8980 __m128i right = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1); 8981 __m128i left = _mm_add_epi32(right, side); 8982 8983 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 0), _mm_unpacklo_epi32(left, right)); 8984 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 4), _mm_unpackhi_epi32(left, right)); 8985 } 8986 8987 for (i = (frameCount4 << 2); i < frameCount; ++i) { 8988 drflac_uint32 side = pInputSamples0U32[i] << shift0; 8989 drflac_uint32 right = pInputSamples1U32[i] << shift1; 8990 drflac_uint32 left = right + side; 8991 8992 pOutputSamples[i*2+0] = (drflac_int32)left; 8993 pOutputSamples[i*2+1] = (drflac_int32)right; 8994 } 8995 } 8996 #endif 8997 8998 #if defined(DRFLAC_SUPPORT_NEON) 8999 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_right_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 9000 { 9001 drflac_uint64 i; 9002 drflac_uint64 frameCount4 = frameCount >> 2; 9003 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 9004 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 9005 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9006 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9007 int32x4_t shift0_4; 9008 int32x4_t shift1_4; 9009 9010 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 9011 9012 shift0_4 = vdupq_n_s32(shift0); 9013 shift1_4 = vdupq_n_s32(shift1); 9014 9015 for (i = 0; i < frameCount4; ++i) { 9016 uint32x4_t side; 9017 uint32x4_t right; 9018 uint32x4_t left; 9019 9020 side = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4); 9021 right = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4); 9022 left = vaddq_u32(right, side); 9023 9024 drflac__vst2q_u32((drflac_uint32*)pOutputSamples + i*8, vzipq_u32(left, right)); 9025 } 9026 9027 for (i = (frameCount4 << 2); i < frameCount; ++i) { 9028 drflac_uint32 side = pInputSamples0U32[i] << shift0; 9029 drflac_uint32 right = pInputSamples1U32[i] << shift1; 9030 drflac_uint32 left = right + side; 9031 9032 pOutputSamples[i*2+0] = (drflac_int32)left; 9033 pOutputSamples[i*2+1] = (drflac_int32)right; 9034 } 9035 } 9036 #endif 9037 9038 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_right_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 9039 { 9040 #if defined(DRFLAC_SUPPORT_SSE2) 9041 if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) { 9042 drflac_read_pcm_frames_s32__decode_right_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 9043 } else 9044 #elif defined(DRFLAC_SUPPORT_NEON) 9045 if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) { 9046 drflac_read_pcm_frames_s32__decode_right_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 9047 } else 9048 #endif 9049 { 9050 /* Scalar fallback. */ 9051 #if 0 9052 drflac_read_pcm_frames_s32__decode_right_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 9053 #else 9054 drflac_read_pcm_frames_s32__decode_right_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 9055 #endif 9056 } 9057 } 9058 9059 9060 #if 0 9061 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_mid_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 9062 { 9063 for (drflac_uint64 i = 0; i < frameCount; ++i) { 9064 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9065 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9066 9067 mid = (mid << 1) | (side & 0x01); 9068 9069 pOutputSamples[i*2+0] = (drflac_int32)((drflac_uint32)((drflac_int32)(mid + side) >> 1) << unusedBitsPerSample); 9070 pOutputSamples[i*2+1] = (drflac_int32)((drflac_uint32)((drflac_int32)(mid - side) >> 1) << unusedBitsPerSample); 9071 } 9072 } 9073 #endif 9074 9075 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_mid_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 9076 { 9077 drflac_uint64 i; 9078 drflac_uint64 frameCount4 = frameCount >> 2; 9079 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 9080 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 9081 drflac_int32 shift = unusedBitsPerSample; 9082 9083 if (shift > 0) { 9084 shift -= 1; 9085 for (i = 0; i < frameCount4; ++i) { 9086 drflac_uint32 temp0L; 9087 drflac_uint32 temp1L; 9088 drflac_uint32 temp2L; 9089 drflac_uint32 temp3L; 9090 drflac_uint32 temp0R; 9091 drflac_uint32 temp1R; 9092 drflac_uint32 temp2R; 9093 drflac_uint32 temp3R; 9094 9095 drflac_uint32 mid0 = pInputSamples0U32[i*4+0] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9096 drflac_uint32 mid1 = pInputSamples0U32[i*4+1] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9097 drflac_uint32 mid2 = pInputSamples0U32[i*4+2] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9098 drflac_uint32 mid3 = pInputSamples0U32[i*4+3] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9099 9100 drflac_uint32 side0 = pInputSamples1U32[i*4+0] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9101 drflac_uint32 side1 = pInputSamples1U32[i*4+1] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9102 drflac_uint32 side2 = pInputSamples1U32[i*4+2] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9103 drflac_uint32 side3 = pInputSamples1U32[i*4+3] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9104 9105 mid0 = (mid0 << 1) | (side0 & 0x01); 9106 mid1 = (mid1 << 1) | (side1 & 0x01); 9107 mid2 = (mid2 << 1) | (side2 & 0x01); 9108 mid3 = (mid3 << 1) | (side3 & 0x01); 9109 9110 temp0L = (mid0 + side0) << shift; 9111 temp1L = (mid1 + side1) << shift; 9112 temp2L = (mid2 + side2) << shift; 9113 temp3L = (mid3 + side3) << shift; 9114 9115 temp0R = (mid0 - side0) << shift; 9116 temp1R = (mid1 - side1) << shift; 9117 temp2R = (mid2 - side2) << shift; 9118 temp3R = (mid3 - side3) << shift; 9119 9120 pOutputSamples[i*8+0] = (drflac_int32)temp0L; 9121 pOutputSamples[i*8+1] = (drflac_int32)temp0R; 9122 pOutputSamples[i*8+2] = (drflac_int32)temp1L; 9123 pOutputSamples[i*8+3] = (drflac_int32)temp1R; 9124 pOutputSamples[i*8+4] = (drflac_int32)temp2L; 9125 pOutputSamples[i*8+5] = (drflac_int32)temp2R; 9126 pOutputSamples[i*8+6] = (drflac_int32)temp3L; 9127 pOutputSamples[i*8+7] = (drflac_int32)temp3R; 9128 } 9129 } else { 9130 for (i = 0; i < frameCount4; ++i) { 9131 drflac_uint32 temp0L; 9132 drflac_uint32 temp1L; 9133 drflac_uint32 temp2L; 9134 drflac_uint32 temp3L; 9135 drflac_uint32 temp0R; 9136 drflac_uint32 temp1R; 9137 drflac_uint32 temp2R; 9138 drflac_uint32 temp3R; 9139 9140 drflac_uint32 mid0 = pInputSamples0U32[i*4+0] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9141 drflac_uint32 mid1 = pInputSamples0U32[i*4+1] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9142 drflac_uint32 mid2 = pInputSamples0U32[i*4+2] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9143 drflac_uint32 mid3 = pInputSamples0U32[i*4+3] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9144 9145 drflac_uint32 side0 = pInputSamples1U32[i*4+0] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9146 drflac_uint32 side1 = pInputSamples1U32[i*4+1] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9147 drflac_uint32 side2 = pInputSamples1U32[i*4+2] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9148 drflac_uint32 side3 = pInputSamples1U32[i*4+3] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9149 9150 mid0 = (mid0 << 1) | (side0 & 0x01); 9151 mid1 = (mid1 << 1) | (side1 & 0x01); 9152 mid2 = (mid2 << 1) | (side2 & 0x01); 9153 mid3 = (mid3 << 1) | (side3 & 0x01); 9154 9155 temp0L = (drflac_uint32)((drflac_int32)(mid0 + side0) >> 1); 9156 temp1L = (drflac_uint32)((drflac_int32)(mid1 + side1) >> 1); 9157 temp2L = (drflac_uint32)((drflac_int32)(mid2 + side2) >> 1); 9158 temp3L = (drflac_uint32)((drflac_int32)(mid3 + side3) >> 1); 9159 9160 temp0R = (drflac_uint32)((drflac_int32)(mid0 - side0) >> 1); 9161 temp1R = (drflac_uint32)((drflac_int32)(mid1 - side1) >> 1); 9162 temp2R = (drflac_uint32)((drflac_int32)(mid2 - side2) >> 1); 9163 temp3R = (drflac_uint32)((drflac_int32)(mid3 - side3) >> 1); 9164 9165 pOutputSamples[i*8+0] = (drflac_int32)temp0L; 9166 pOutputSamples[i*8+1] = (drflac_int32)temp0R; 9167 pOutputSamples[i*8+2] = (drflac_int32)temp1L; 9168 pOutputSamples[i*8+3] = (drflac_int32)temp1R; 9169 pOutputSamples[i*8+4] = (drflac_int32)temp2L; 9170 pOutputSamples[i*8+5] = (drflac_int32)temp2R; 9171 pOutputSamples[i*8+6] = (drflac_int32)temp3L; 9172 pOutputSamples[i*8+7] = (drflac_int32)temp3R; 9173 } 9174 } 9175 9176 for (i = (frameCount4 << 2); i < frameCount; ++i) { 9177 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9178 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9179 9180 mid = (mid << 1) | (side & 0x01); 9181 9182 pOutputSamples[i*2+0] = (drflac_int32)((drflac_uint32)((drflac_int32)(mid + side) >> 1) << unusedBitsPerSample); 9183 pOutputSamples[i*2+1] = (drflac_int32)((drflac_uint32)((drflac_int32)(mid - side) >> 1) << unusedBitsPerSample); 9184 } 9185 } 9186 9187 #if defined(DRFLAC_SUPPORT_SSE2) 9188 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_mid_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 9189 { 9190 drflac_uint64 i; 9191 drflac_uint64 frameCount4 = frameCount >> 2; 9192 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 9193 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 9194 drflac_int32 shift = unusedBitsPerSample; 9195 9196 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 9197 9198 if (shift == 0) { 9199 for (i = 0; i < frameCount4; ++i) { 9200 __m128i mid; 9201 __m128i side; 9202 __m128i left; 9203 __m128i right; 9204 9205 mid = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 9206 side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 9207 9208 mid = _mm_or_si128(_mm_slli_epi32(mid, 1), _mm_and_si128(side, _mm_set1_epi32(0x01))); 9209 9210 left = _mm_srai_epi32(_mm_add_epi32(mid, side), 1); 9211 right = _mm_srai_epi32(_mm_sub_epi32(mid, side), 1); 9212 9213 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 0), _mm_unpacklo_epi32(left, right)); 9214 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 4), _mm_unpackhi_epi32(left, right)); 9215 } 9216 9217 for (i = (frameCount4 << 2); i < frameCount; ++i) { 9218 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9219 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9220 9221 mid = (mid << 1) | (side & 0x01); 9222 9223 pOutputSamples[i*2+0] = (drflac_int32)(mid + side) >> 1; 9224 pOutputSamples[i*2+1] = (drflac_int32)(mid - side) >> 1; 9225 } 9226 } else { 9227 shift -= 1; 9228 for (i = 0; i < frameCount4; ++i) { 9229 __m128i mid; 9230 __m128i side; 9231 __m128i left; 9232 __m128i right; 9233 9234 mid = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 9235 side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 9236 9237 mid = _mm_or_si128(_mm_slli_epi32(mid, 1), _mm_and_si128(side, _mm_set1_epi32(0x01))); 9238 9239 left = _mm_slli_epi32(_mm_add_epi32(mid, side), shift); 9240 right = _mm_slli_epi32(_mm_sub_epi32(mid, side), shift); 9241 9242 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 0), _mm_unpacklo_epi32(left, right)); 9243 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 4), _mm_unpackhi_epi32(left, right)); 9244 } 9245 9246 for (i = (frameCount4 << 2); i < frameCount; ++i) { 9247 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9248 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9249 9250 mid = (mid << 1) | (side & 0x01); 9251 9252 pOutputSamples[i*2+0] = (drflac_int32)((mid + side) << shift); 9253 pOutputSamples[i*2+1] = (drflac_int32)((mid - side) << shift); 9254 } 9255 } 9256 } 9257 #endif 9258 9259 #if defined(DRFLAC_SUPPORT_NEON) 9260 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_mid_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 9261 { 9262 drflac_uint64 i; 9263 drflac_uint64 frameCount4 = frameCount >> 2; 9264 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 9265 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 9266 drflac_int32 shift = unusedBitsPerSample; 9267 int32x4_t wbpsShift0_4; /* wbps = Wasted Bits Per Sample */ 9268 int32x4_t wbpsShift1_4; /* wbps = Wasted Bits Per Sample */ 9269 uint32x4_t one4; 9270 9271 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 9272 9273 wbpsShift0_4 = vdupq_n_s32(pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 9274 wbpsShift1_4 = vdupq_n_s32(pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 9275 one4 = vdupq_n_u32(1); 9276 9277 if (shift == 0) { 9278 for (i = 0; i < frameCount4; ++i) { 9279 uint32x4_t mid; 9280 uint32x4_t side; 9281 int32x4_t left; 9282 int32x4_t right; 9283 9284 mid = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), wbpsShift0_4); 9285 side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), wbpsShift1_4); 9286 9287 mid = vorrq_u32(vshlq_n_u32(mid, 1), vandq_u32(side, one4)); 9288 9289 left = vshrq_n_s32(vreinterpretq_s32_u32(vaddq_u32(mid, side)), 1); 9290 right = vshrq_n_s32(vreinterpretq_s32_u32(vsubq_u32(mid, side)), 1); 9291 9292 drflac__vst2q_s32(pOutputSamples + i*8, vzipq_s32(left, right)); 9293 } 9294 9295 for (i = (frameCount4 << 2); i < frameCount; ++i) { 9296 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9297 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9298 9299 mid = (mid << 1) | (side & 0x01); 9300 9301 pOutputSamples[i*2+0] = (drflac_int32)(mid + side) >> 1; 9302 pOutputSamples[i*2+1] = (drflac_int32)(mid - side) >> 1; 9303 } 9304 } else { 9305 int32x4_t shift4; 9306 9307 shift -= 1; 9308 shift4 = vdupq_n_s32(shift); 9309 9310 for (i = 0; i < frameCount4; ++i) { 9311 uint32x4_t mid; 9312 uint32x4_t side; 9313 int32x4_t left; 9314 int32x4_t right; 9315 9316 mid = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), wbpsShift0_4); 9317 side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), wbpsShift1_4); 9318 9319 mid = vorrq_u32(vshlq_n_u32(mid, 1), vandq_u32(side, one4)); 9320 9321 left = vreinterpretq_s32_u32(vshlq_u32(vaddq_u32(mid, side), shift4)); 9322 right = vreinterpretq_s32_u32(vshlq_u32(vsubq_u32(mid, side), shift4)); 9323 9324 drflac__vst2q_s32(pOutputSamples + i*8, vzipq_s32(left, right)); 9325 } 9326 9327 for (i = (frameCount4 << 2); i < frameCount; ++i) { 9328 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9329 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9330 9331 mid = (mid << 1) | (side & 0x01); 9332 9333 pOutputSamples[i*2+0] = (drflac_int32)((mid + side) << shift); 9334 pOutputSamples[i*2+1] = (drflac_int32)((mid - side) << shift); 9335 } 9336 } 9337 } 9338 #endif 9339 9340 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_mid_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 9341 { 9342 #if defined(DRFLAC_SUPPORT_SSE2) 9343 if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) { 9344 drflac_read_pcm_frames_s32__decode_mid_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 9345 } else 9346 #elif defined(DRFLAC_SUPPORT_NEON) 9347 if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) { 9348 drflac_read_pcm_frames_s32__decode_mid_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 9349 } else 9350 #endif 9351 { 9352 /* Scalar fallback. */ 9353 #if 0 9354 drflac_read_pcm_frames_s32__decode_mid_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 9355 #else 9356 drflac_read_pcm_frames_s32__decode_mid_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 9357 #endif 9358 } 9359 } 9360 9361 9362 #if 0 9363 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_independent_stereo__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 9364 { 9365 for (drflac_uint64 i = 0; i < frameCount; ++i) { 9366 pOutputSamples[i*2+0] = (drflac_int32)((drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample)); 9367 pOutputSamples[i*2+1] = (drflac_int32)((drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample)); 9368 } 9369 } 9370 #endif 9371 9372 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_independent_stereo__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 9373 { 9374 drflac_uint64 i; 9375 drflac_uint64 frameCount4 = frameCount >> 2; 9376 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 9377 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 9378 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9379 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9380 9381 for (i = 0; i < frameCount4; ++i) { 9382 drflac_uint32 tempL0 = pInputSamples0U32[i*4+0] << shift0; 9383 drflac_uint32 tempL1 = pInputSamples0U32[i*4+1] << shift0; 9384 drflac_uint32 tempL2 = pInputSamples0U32[i*4+2] << shift0; 9385 drflac_uint32 tempL3 = pInputSamples0U32[i*4+3] << shift0; 9386 9387 drflac_uint32 tempR0 = pInputSamples1U32[i*4+0] << shift1; 9388 drflac_uint32 tempR1 = pInputSamples1U32[i*4+1] << shift1; 9389 drflac_uint32 tempR2 = pInputSamples1U32[i*4+2] << shift1; 9390 drflac_uint32 tempR3 = pInputSamples1U32[i*4+3] << shift1; 9391 9392 pOutputSamples[i*8+0] = (drflac_int32)tempL0; 9393 pOutputSamples[i*8+1] = (drflac_int32)tempR0; 9394 pOutputSamples[i*8+2] = (drflac_int32)tempL1; 9395 pOutputSamples[i*8+3] = (drflac_int32)tempR1; 9396 pOutputSamples[i*8+4] = (drflac_int32)tempL2; 9397 pOutputSamples[i*8+5] = (drflac_int32)tempR2; 9398 pOutputSamples[i*8+6] = (drflac_int32)tempL3; 9399 pOutputSamples[i*8+7] = (drflac_int32)tempR3; 9400 } 9401 9402 for (i = (frameCount4 << 2); i < frameCount; ++i) { 9403 pOutputSamples[i*2+0] = (drflac_int32)(pInputSamples0U32[i] << shift0); 9404 pOutputSamples[i*2+1] = (drflac_int32)(pInputSamples1U32[i] << shift1); 9405 } 9406 } 9407 9408 #if defined(DRFLAC_SUPPORT_SSE2) 9409 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_independent_stereo__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 9410 { 9411 drflac_uint64 i; 9412 drflac_uint64 frameCount4 = frameCount >> 2; 9413 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 9414 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 9415 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9416 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9417 9418 for (i = 0; i < frameCount4; ++i) { 9419 __m128i left = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0); 9420 __m128i right = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1); 9421 9422 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 0), _mm_unpacklo_epi32(left, right)); 9423 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 4), _mm_unpackhi_epi32(left, right)); 9424 } 9425 9426 for (i = (frameCount4 << 2); i < frameCount; ++i) { 9427 pOutputSamples[i*2+0] = (drflac_int32)(pInputSamples0U32[i] << shift0); 9428 pOutputSamples[i*2+1] = (drflac_int32)(pInputSamples1U32[i] << shift1); 9429 } 9430 } 9431 #endif 9432 9433 #if defined(DRFLAC_SUPPORT_NEON) 9434 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_independent_stereo__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 9435 { 9436 drflac_uint64 i; 9437 drflac_uint64 frameCount4 = frameCount >> 2; 9438 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 9439 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 9440 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9441 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9442 9443 int32x4_t shift4_0 = vdupq_n_s32(shift0); 9444 int32x4_t shift4_1 = vdupq_n_s32(shift1); 9445 9446 for (i = 0; i < frameCount4; ++i) { 9447 int32x4_t left; 9448 int32x4_t right; 9449 9450 left = vreinterpretq_s32_u32(vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift4_0)); 9451 right = vreinterpretq_s32_u32(vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift4_1)); 9452 9453 drflac__vst2q_s32(pOutputSamples + i*8, vzipq_s32(left, right)); 9454 } 9455 9456 for (i = (frameCount4 << 2); i < frameCount; ++i) { 9457 pOutputSamples[i*2+0] = (drflac_int32)(pInputSamples0U32[i] << shift0); 9458 pOutputSamples[i*2+1] = (drflac_int32)(pInputSamples1U32[i] << shift1); 9459 } 9460 } 9461 #endif 9462 9463 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_independent_stereo(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 9464 { 9465 #if defined(DRFLAC_SUPPORT_SSE2) 9466 if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) { 9467 drflac_read_pcm_frames_s32__decode_independent_stereo__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 9468 } else 9469 #elif defined(DRFLAC_SUPPORT_NEON) 9470 if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) { 9471 drflac_read_pcm_frames_s32__decode_independent_stereo__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 9472 } else 9473 #endif 9474 { 9475 /* Scalar fallback. */ 9476 #if 0 9477 drflac_read_pcm_frames_s32__decode_independent_stereo__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 9478 #else 9479 drflac_read_pcm_frames_s32__decode_independent_stereo__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 9480 #endif 9481 } 9482 } 9483 9484 9485 DRFLAC_API drflac_uint64 drflac_read_pcm_frames_s32(drflac* pFlac, drflac_uint64 framesToRead, drflac_int32* pBufferOut) 9486 { 9487 drflac_uint64 framesRead; 9488 drflac_uint32 unusedBitsPerSample; 9489 9490 if (pFlac == NULL || framesToRead == 0) { 9491 return 0; 9492 } 9493 9494 if (pBufferOut == NULL) { 9495 return drflac__seek_forward_by_pcm_frames(pFlac, framesToRead); 9496 } 9497 9498 DRFLAC_ASSERT(pFlac->bitsPerSample <= 32); 9499 unusedBitsPerSample = 32 - pFlac->bitsPerSample; 9500 9501 framesRead = 0; 9502 while (framesToRead > 0) { 9503 /* If we've run out of samples in this frame, go to the next. */ 9504 if (pFlac->currentFLACFrame.pcmFramesRemaining == 0) { 9505 if (!drflac__read_and_decode_next_flac_frame(pFlac)) { 9506 break; /* Couldn't read the next frame, so just break from the loop and return. */ 9507 } 9508 } else { 9509 unsigned int channelCount = drflac__get_channel_count_from_channel_assignment(pFlac->currentFLACFrame.header.channelAssignment); 9510 drflac_uint64 iFirstPCMFrame = pFlac->currentFLACFrame.header.blockSizeInPCMFrames - pFlac->currentFLACFrame.pcmFramesRemaining; 9511 drflac_uint64 frameCountThisIteration = framesToRead; 9512 9513 if (frameCountThisIteration > pFlac->currentFLACFrame.pcmFramesRemaining) { 9514 frameCountThisIteration = pFlac->currentFLACFrame.pcmFramesRemaining; 9515 } 9516 9517 if (channelCount == 2) { 9518 const drflac_int32* pDecodedSamples0 = pFlac->currentFLACFrame.subframes[0].pSamplesS32 + iFirstPCMFrame; 9519 const drflac_int32* pDecodedSamples1 = pFlac->currentFLACFrame.subframes[1].pSamplesS32 + iFirstPCMFrame; 9520 9521 switch (pFlac->currentFLACFrame.header.channelAssignment) 9522 { 9523 case DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE: 9524 { 9525 drflac_read_pcm_frames_s32__decode_left_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut); 9526 } break; 9527 9528 case DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE: 9529 { 9530 drflac_read_pcm_frames_s32__decode_right_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut); 9531 } break; 9532 9533 case DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE: 9534 { 9535 drflac_read_pcm_frames_s32__decode_mid_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut); 9536 } break; 9537 9538 case DRFLAC_CHANNEL_ASSIGNMENT_INDEPENDENT: 9539 default: 9540 { 9541 drflac_read_pcm_frames_s32__decode_independent_stereo(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut); 9542 } break; 9543 } 9544 } else { 9545 /* Generic interleaving. */ 9546 drflac_uint64 i; 9547 for (i = 0; i < frameCountThisIteration; ++i) { 9548 unsigned int j; 9549 for (j = 0; j < channelCount; ++j) { 9550 pBufferOut[(i*channelCount)+j] = (drflac_int32)((drflac_uint32)(pFlac->currentFLACFrame.subframes[j].pSamplesS32[iFirstPCMFrame + i]) << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[j].wastedBitsPerSample)); 9551 } 9552 } 9553 } 9554 9555 framesRead += frameCountThisIteration; 9556 pBufferOut += frameCountThisIteration * channelCount; 9557 framesToRead -= frameCountThisIteration; 9558 pFlac->currentPCMFrame += frameCountThisIteration; 9559 pFlac->currentFLACFrame.pcmFramesRemaining -= (drflac_uint32)frameCountThisIteration; 9560 } 9561 } 9562 9563 return framesRead; 9564 } 9565 9566 9567 #if 0 9568 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_left_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 9569 { 9570 drflac_uint64 i; 9571 for (i = 0; i < frameCount; ++i) { 9572 drflac_uint32 left = (drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 9573 drflac_uint32 side = (drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 9574 drflac_uint32 right = left - side; 9575 9576 left >>= 16; 9577 right >>= 16; 9578 9579 pOutputSamples[i*2+0] = (drflac_int16)left; 9580 pOutputSamples[i*2+1] = (drflac_int16)right; 9581 } 9582 } 9583 #endif 9584 9585 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_left_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 9586 { 9587 drflac_uint64 i; 9588 drflac_uint64 frameCount4 = frameCount >> 2; 9589 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 9590 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 9591 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9592 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9593 9594 for (i = 0; i < frameCount4; ++i) { 9595 drflac_uint32 left0 = pInputSamples0U32[i*4+0] << shift0; 9596 drflac_uint32 left1 = pInputSamples0U32[i*4+1] << shift0; 9597 drflac_uint32 left2 = pInputSamples0U32[i*4+2] << shift0; 9598 drflac_uint32 left3 = pInputSamples0U32[i*4+3] << shift0; 9599 9600 drflac_uint32 side0 = pInputSamples1U32[i*4+0] << shift1; 9601 drflac_uint32 side1 = pInputSamples1U32[i*4+1] << shift1; 9602 drflac_uint32 side2 = pInputSamples1U32[i*4+2] << shift1; 9603 drflac_uint32 side3 = pInputSamples1U32[i*4+3] << shift1; 9604 9605 drflac_uint32 right0 = left0 - side0; 9606 drflac_uint32 right1 = left1 - side1; 9607 drflac_uint32 right2 = left2 - side2; 9608 drflac_uint32 right3 = left3 - side3; 9609 9610 left0 >>= 16; 9611 left1 >>= 16; 9612 left2 >>= 16; 9613 left3 >>= 16; 9614 9615 right0 >>= 16; 9616 right1 >>= 16; 9617 right2 >>= 16; 9618 right3 >>= 16; 9619 9620 pOutputSamples[i*8+0] = (drflac_int16)left0; 9621 pOutputSamples[i*8+1] = (drflac_int16)right0; 9622 pOutputSamples[i*8+2] = (drflac_int16)left1; 9623 pOutputSamples[i*8+3] = (drflac_int16)right1; 9624 pOutputSamples[i*8+4] = (drflac_int16)left2; 9625 pOutputSamples[i*8+5] = (drflac_int16)right2; 9626 pOutputSamples[i*8+6] = (drflac_int16)left3; 9627 pOutputSamples[i*8+7] = (drflac_int16)right3; 9628 } 9629 9630 for (i = (frameCount4 << 2); i < frameCount; ++i) { 9631 drflac_uint32 left = pInputSamples0U32[i] << shift0; 9632 drflac_uint32 side = pInputSamples1U32[i] << shift1; 9633 drflac_uint32 right = left - side; 9634 9635 left >>= 16; 9636 right >>= 16; 9637 9638 pOutputSamples[i*2+0] = (drflac_int16)left; 9639 pOutputSamples[i*2+1] = (drflac_int16)right; 9640 } 9641 } 9642 9643 #if defined(DRFLAC_SUPPORT_SSE2) 9644 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_left_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 9645 { 9646 drflac_uint64 i; 9647 drflac_uint64 frameCount4 = frameCount >> 2; 9648 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 9649 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 9650 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9651 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9652 9653 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 9654 9655 for (i = 0; i < frameCount4; ++i) { 9656 __m128i left = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0); 9657 __m128i side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1); 9658 __m128i right = _mm_sub_epi32(left, side); 9659 9660 left = _mm_srai_epi32(left, 16); 9661 right = _mm_srai_epi32(right, 16); 9662 9663 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8), drflac__mm_packs_interleaved_epi32(left, right)); 9664 } 9665 9666 for (i = (frameCount4 << 2); i < frameCount; ++i) { 9667 drflac_uint32 left = pInputSamples0U32[i] << shift0; 9668 drflac_uint32 side = pInputSamples1U32[i] << shift1; 9669 drflac_uint32 right = left - side; 9670 9671 left >>= 16; 9672 right >>= 16; 9673 9674 pOutputSamples[i*2+0] = (drflac_int16)left; 9675 pOutputSamples[i*2+1] = (drflac_int16)right; 9676 } 9677 } 9678 #endif 9679 9680 #if defined(DRFLAC_SUPPORT_NEON) 9681 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_left_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 9682 { 9683 drflac_uint64 i; 9684 drflac_uint64 frameCount4 = frameCount >> 2; 9685 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 9686 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 9687 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9688 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9689 int32x4_t shift0_4; 9690 int32x4_t shift1_4; 9691 9692 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 9693 9694 shift0_4 = vdupq_n_s32(shift0); 9695 shift1_4 = vdupq_n_s32(shift1); 9696 9697 for (i = 0; i < frameCount4; ++i) { 9698 uint32x4_t left; 9699 uint32x4_t side; 9700 uint32x4_t right; 9701 9702 left = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4); 9703 side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4); 9704 right = vsubq_u32(left, side); 9705 9706 left = vshrq_n_u32(left, 16); 9707 right = vshrq_n_u32(right, 16); 9708 9709 drflac__vst2q_u16((drflac_uint16*)pOutputSamples + i*8, vzip_u16(vmovn_u32(left), vmovn_u32(right))); 9710 } 9711 9712 for (i = (frameCount4 << 2); i < frameCount; ++i) { 9713 drflac_uint32 left = pInputSamples0U32[i] << shift0; 9714 drflac_uint32 side = pInputSamples1U32[i] << shift1; 9715 drflac_uint32 right = left - side; 9716 9717 left >>= 16; 9718 right >>= 16; 9719 9720 pOutputSamples[i*2+0] = (drflac_int16)left; 9721 pOutputSamples[i*2+1] = (drflac_int16)right; 9722 } 9723 } 9724 #endif 9725 9726 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_left_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 9727 { 9728 #if defined(DRFLAC_SUPPORT_SSE2) 9729 if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) { 9730 drflac_read_pcm_frames_s16__decode_left_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 9731 } else 9732 #elif defined(DRFLAC_SUPPORT_NEON) 9733 if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) { 9734 drflac_read_pcm_frames_s16__decode_left_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 9735 } else 9736 #endif 9737 { 9738 /* Scalar fallback. */ 9739 #if 0 9740 drflac_read_pcm_frames_s16__decode_left_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 9741 #else 9742 drflac_read_pcm_frames_s16__decode_left_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 9743 #endif 9744 } 9745 } 9746 9747 9748 #if 0 9749 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_right_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 9750 { 9751 drflac_uint64 i; 9752 for (i = 0; i < frameCount; ++i) { 9753 drflac_uint32 side = (drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 9754 drflac_uint32 right = (drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 9755 drflac_uint32 left = right + side; 9756 9757 left >>= 16; 9758 right >>= 16; 9759 9760 pOutputSamples[i*2+0] = (drflac_int16)left; 9761 pOutputSamples[i*2+1] = (drflac_int16)right; 9762 } 9763 } 9764 #endif 9765 9766 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_right_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 9767 { 9768 drflac_uint64 i; 9769 drflac_uint64 frameCount4 = frameCount >> 2; 9770 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 9771 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 9772 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9773 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9774 9775 for (i = 0; i < frameCount4; ++i) { 9776 drflac_uint32 side0 = pInputSamples0U32[i*4+0] << shift0; 9777 drflac_uint32 side1 = pInputSamples0U32[i*4+1] << shift0; 9778 drflac_uint32 side2 = pInputSamples0U32[i*4+2] << shift0; 9779 drflac_uint32 side3 = pInputSamples0U32[i*4+3] << shift0; 9780 9781 drflac_uint32 right0 = pInputSamples1U32[i*4+0] << shift1; 9782 drflac_uint32 right1 = pInputSamples1U32[i*4+1] << shift1; 9783 drflac_uint32 right2 = pInputSamples1U32[i*4+2] << shift1; 9784 drflac_uint32 right3 = pInputSamples1U32[i*4+3] << shift1; 9785 9786 drflac_uint32 left0 = right0 + side0; 9787 drflac_uint32 left1 = right1 + side1; 9788 drflac_uint32 left2 = right2 + side2; 9789 drflac_uint32 left3 = right3 + side3; 9790 9791 left0 >>= 16; 9792 left1 >>= 16; 9793 left2 >>= 16; 9794 left3 >>= 16; 9795 9796 right0 >>= 16; 9797 right1 >>= 16; 9798 right2 >>= 16; 9799 right3 >>= 16; 9800 9801 pOutputSamples[i*8+0] = (drflac_int16)left0; 9802 pOutputSamples[i*8+1] = (drflac_int16)right0; 9803 pOutputSamples[i*8+2] = (drflac_int16)left1; 9804 pOutputSamples[i*8+3] = (drflac_int16)right1; 9805 pOutputSamples[i*8+4] = (drflac_int16)left2; 9806 pOutputSamples[i*8+5] = (drflac_int16)right2; 9807 pOutputSamples[i*8+6] = (drflac_int16)left3; 9808 pOutputSamples[i*8+7] = (drflac_int16)right3; 9809 } 9810 9811 for (i = (frameCount4 << 2); i < frameCount; ++i) { 9812 drflac_uint32 side = pInputSamples0U32[i] << shift0; 9813 drflac_uint32 right = pInputSamples1U32[i] << shift1; 9814 drflac_uint32 left = right + side; 9815 9816 left >>= 16; 9817 right >>= 16; 9818 9819 pOutputSamples[i*2+0] = (drflac_int16)left; 9820 pOutputSamples[i*2+1] = (drflac_int16)right; 9821 } 9822 } 9823 9824 #if defined(DRFLAC_SUPPORT_SSE2) 9825 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_right_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 9826 { 9827 drflac_uint64 i; 9828 drflac_uint64 frameCount4 = frameCount >> 2; 9829 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 9830 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 9831 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9832 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9833 9834 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 9835 9836 for (i = 0; i < frameCount4; ++i) { 9837 __m128i side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0); 9838 __m128i right = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1); 9839 __m128i left = _mm_add_epi32(right, side); 9840 9841 left = _mm_srai_epi32(left, 16); 9842 right = _mm_srai_epi32(right, 16); 9843 9844 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8), drflac__mm_packs_interleaved_epi32(left, right)); 9845 } 9846 9847 for (i = (frameCount4 << 2); i < frameCount; ++i) { 9848 drflac_uint32 side = pInputSamples0U32[i] << shift0; 9849 drflac_uint32 right = pInputSamples1U32[i] << shift1; 9850 drflac_uint32 left = right + side; 9851 9852 left >>= 16; 9853 right >>= 16; 9854 9855 pOutputSamples[i*2+0] = (drflac_int16)left; 9856 pOutputSamples[i*2+1] = (drflac_int16)right; 9857 } 9858 } 9859 #endif 9860 9861 #if defined(DRFLAC_SUPPORT_NEON) 9862 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_right_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 9863 { 9864 drflac_uint64 i; 9865 drflac_uint64 frameCount4 = frameCount >> 2; 9866 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 9867 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 9868 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9869 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9870 int32x4_t shift0_4; 9871 int32x4_t shift1_4; 9872 9873 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 9874 9875 shift0_4 = vdupq_n_s32(shift0); 9876 shift1_4 = vdupq_n_s32(shift1); 9877 9878 for (i = 0; i < frameCount4; ++i) { 9879 uint32x4_t side; 9880 uint32x4_t right; 9881 uint32x4_t left; 9882 9883 side = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4); 9884 right = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4); 9885 left = vaddq_u32(right, side); 9886 9887 left = vshrq_n_u32(left, 16); 9888 right = vshrq_n_u32(right, 16); 9889 9890 drflac__vst2q_u16((drflac_uint16*)pOutputSamples + i*8, vzip_u16(vmovn_u32(left), vmovn_u32(right))); 9891 } 9892 9893 for (i = (frameCount4 << 2); i < frameCount; ++i) { 9894 drflac_uint32 side = pInputSamples0U32[i] << shift0; 9895 drflac_uint32 right = pInputSamples1U32[i] << shift1; 9896 drflac_uint32 left = right + side; 9897 9898 left >>= 16; 9899 right >>= 16; 9900 9901 pOutputSamples[i*2+0] = (drflac_int16)left; 9902 pOutputSamples[i*2+1] = (drflac_int16)right; 9903 } 9904 } 9905 #endif 9906 9907 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_right_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 9908 { 9909 #if defined(DRFLAC_SUPPORT_SSE2) 9910 if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) { 9911 drflac_read_pcm_frames_s16__decode_right_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 9912 } else 9913 #elif defined(DRFLAC_SUPPORT_NEON) 9914 if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) { 9915 drflac_read_pcm_frames_s16__decode_right_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 9916 } else 9917 #endif 9918 { 9919 /* Scalar fallback. */ 9920 #if 0 9921 drflac_read_pcm_frames_s16__decode_right_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 9922 #else 9923 drflac_read_pcm_frames_s16__decode_right_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 9924 #endif 9925 } 9926 } 9927 9928 9929 #if 0 9930 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_mid_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 9931 { 9932 for (drflac_uint64 i = 0; i < frameCount; ++i) { 9933 drflac_uint32 mid = (drflac_uint32)pInputSamples0[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9934 drflac_uint32 side = (drflac_uint32)pInputSamples1[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9935 9936 mid = (mid << 1) | (side & 0x01); 9937 9938 pOutputSamples[i*2+0] = (drflac_int16)(((drflac_uint32)((drflac_int32)(mid + side) >> 1) << unusedBitsPerSample) >> 16); 9939 pOutputSamples[i*2+1] = (drflac_int16)(((drflac_uint32)((drflac_int32)(mid - side) >> 1) << unusedBitsPerSample) >> 16); 9940 } 9941 } 9942 #endif 9943 9944 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_mid_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 9945 { 9946 drflac_uint64 i; 9947 drflac_uint64 frameCount4 = frameCount >> 2; 9948 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 9949 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 9950 drflac_uint32 shift = unusedBitsPerSample; 9951 9952 if (shift > 0) { 9953 shift -= 1; 9954 for (i = 0; i < frameCount4; ++i) { 9955 drflac_uint32 temp0L; 9956 drflac_uint32 temp1L; 9957 drflac_uint32 temp2L; 9958 drflac_uint32 temp3L; 9959 drflac_uint32 temp0R; 9960 drflac_uint32 temp1R; 9961 drflac_uint32 temp2R; 9962 drflac_uint32 temp3R; 9963 9964 drflac_uint32 mid0 = pInputSamples0U32[i*4+0] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9965 drflac_uint32 mid1 = pInputSamples0U32[i*4+1] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9966 drflac_uint32 mid2 = pInputSamples0U32[i*4+2] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9967 drflac_uint32 mid3 = pInputSamples0U32[i*4+3] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 9968 9969 drflac_uint32 side0 = pInputSamples1U32[i*4+0] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9970 drflac_uint32 side1 = pInputSamples1U32[i*4+1] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9971 drflac_uint32 side2 = pInputSamples1U32[i*4+2] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9972 drflac_uint32 side3 = pInputSamples1U32[i*4+3] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 9973 9974 mid0 = (mid0 << 1) | (side0 & 0x01); 9975 mid1 = (mid1 << 1) | (side1 & 0x01); 9976 mid2 = (mid2 << 1) | (side2 & 0x01); 9977 mid3 = (mid3 << 1) | (side3 & 0x01); 9978 9979 temp0L = (mid0 + side0) << shift; 9980 temp1L = (mid1 + side1) << shift; 9981 temp2L = (mid2 + side2) << shift; 9982 temp3L = (mid3 + side3) << shift; 9983 9984 temp0R = (mid0 - side0) << shift; 9985 temp1R = (mid1 - side1) << shift; 9986 temp2R = (mid2 - side2) << shift; 9987 temp3R = (mid3 - side3) << shift; 9988 9989 temp0L >>= 16; 9990 temp1L >>= 16; 9991 temp2L >>= 16; 9992 temp3L >>= 16; 9993 9994 temp0R >>= 16; 9995 temp1R >>= 16; 9996 temp2R >>= 16; 9997 temp3R >>= 16; 9998 9999 pOutputSamples[i*8+0] = (drflac_int16)temp0L; 10000 pOutputSamples[i*8+1] = (drflac_int16)temp0R; 10001 pOutputSamples[i*8+2] = (drflac_int16)temp1L; 10002 pOutputSamples[i*8+3] = (drflac_int16)temp1R; 10003 pOutputSamples[i*8+4] = (drflac_int16)temp2L; 10004 pOutputSamples[i*8+5] = (drflac_int16)temp2R; 10005 pOutputSamples[i*8+6] = (drflac_int16)temp3L; 10006 pOutputSamples[i*8+7] = (drflac_int16)temp3R; 10007 } 10008 } else { 10009 for (i = 0; i < frameCount4; ++i) { 10010 drflac_uint32 temp0L; 10011 drflac_uint32 temp1L; 10012 drflac_uint32 temp2L; 10013 drflac_uint32 temp3L; 10014 drflac_uint32 temp0R; 10015 drflac_uint32 temp1R; 10016 drflac_uint32 temp2R; 10017 drflac_uint32 temp3R; 10018 10019 drflac_uint32 mid0 = pInputSamples0U32[i*4+0] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10020 drflac_uint32 mid1 = pInputSamples0U32[i*4+1] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10021 drflac_uint32 mid2 = pInputSamples0U32[i*4+2] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10022 drflac_uint32 mid3 = pInputSamples0U32[i*4+3] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10023 10024 drflac_uint32 side0 = pInputSamples1U32[i*4+0] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10025 drflac_uint32 side1 = pInputSamples1U32[i*4+1] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10026 drflac_uint32 side2 = pInputSamples1U32[i*4+2] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10027 drflac_uint32 side3 = pInputSamples1U32[i*4+3] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10028 10029 mid0 = (mid0 << 1) | (side0 & 0x01); 10030 mid1 = (mid1 << 1) | (side1 & 0x01); 10031 mid2 = (mid2 << 1) | (side2 & 0x01); 10032 mid3 = (mid3 << 1) | (side3 & 0x01); 10033 10034 temp0L = ((drflac_int32)(mid0 + side0) >> 1); 10035 temp1L = ((drflac_int32)(mid1 + side1) >> 1); 10036 temp2L = ((drflac_int32)(mid2 + side2) >> 1); 10037 temp3L = ((drflac_int32)(mid3 + side3) >> 1); 10038 10039 temp0R = ((drflac_int32)(mid0 - side0) >> 1); 10040 temp1R = ((drflac_int32)(mid1 - side1) >> 1); 10041 temp2R = ((drflac_int32)(mid2 - side2) >> 1); 10042 temp3R = ((drflac_int32)(mid3 - side3) >> 1); 10043 10044 temp0L >>= 16; 10045 temp1L >>= 16; 10046 temp2L >>= 16; 10047 temp3L >>= 16; 10048 10049 temp0R >>= 16; 10050 temp1R >>= 16; 10051 temp2R >>= 16; 10052 temp3R >>= 16; 10053 10054 pOutputSamples[i*8+0] = (drflac_int16)temp0L; 10055 pOutputSamples[i*8+1] = (drflac_int16)temp0R; 10056 pOutputSamples[i*8+2] = (drflac_int16)temp1L; 10057 pOutputSamples[i*8+3] = (drflac_int16)temp1R; 10058 pOutputSamples[i*8+4] = (drflac_int16)temp2L; 10059 pOutputSamples[i*8+5] = (drflac_int16)temp2R; 10060 pOutputSamples[i*8+6] = (drflac_int16)temp3L; 10061 pOutputSamples[i*8+7] = (drflac_int16)temp3R; 10062 } 10063 } 10064 10065 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10066 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10067 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10068 10069 mid = (mid << 1) | (side & 0x01); 10070 10071 pOutputSamples[i*2+0] = (drflac_int16)(((drflac_uint32)((drflac_int32)(mid + side) >> 1) << unusedBitsPerSample) >> 16); 10072 pOutputSamples[i*2+1] = (drflac_int16)(((drflac_uint32)((drflac_int32)(mid - side) >> 1) << unusedBitsPerSample) >> 16); 10073 } 10074 } 10075 10076 #if defined(DRFLAC_SUPPORT_SSE2) 10077 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_mid_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 10078 { 10079 drflac_uint64 i; 10080 drflac_uint64 frameCount4 = frameCount >> 2; 10081 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10082 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10083 drflac_uint32 shift = unusedBitsPerSample; 10084 10085 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 10086 10087 if (shift == 0) { 10088 for (i = 0; i < frameCount4; ++i) { 10089 __m128i mid; 10090 __m128i side; 10091 __m128i left; 10092 __m128i right; 10093 10094 mid = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 10095 side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 10096 10097 mid = _mm_or_si128(_mm_slli_epi32(mid, 1), _mm_and_si128(side, _mm_set1_epi32(0x01))); 10098 10099 left = _mm_srai_epi32(_mm_add_epi32(mid, side), 1); 10100 right = _mm_srai_epi32(_mm_sub_epi32(mid, side), 1); 10101 10102 left = _mm_srai_epi32(left, 16); 10103 right = _mm_srai_epi32(right, 16); 10104 10105 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8), drflac__mm_packs_interleaved_epi32(left, right)); 10106 } 10107 10108 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10109 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10110 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10111 10112 mid = (mid << 1) | (side & 0x01); 10113 10114 pOutputSamples[i*2+0] = (drflac_int16)(((drflac_int32)(mid + side) >> 1) >> 16); 10115 pOutputSamples[i*2+1] = (drflac_int16)(((drflac_int32)(mid - side) >> 1) >> 16); 10116 } 10117 } else { 10118 shift -= 1; 10119 for (i = 0; i < frameCount4; ++i) { 10120 __m128i mid; 10121 __m128i side; 10122 __m128i left; 10123 __m128i right; 10124 10125 mid = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 10126 side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 10127 10128 mid = _mm_or_si128(_mm_slli_epi32(mid, 1), _mm_and_si128(side, _mm_set1_epi32(0x01))); 10129 10130 left = _mm_slli_epi32(_mm_add_epi32(mid, side), shift); 10131 right = _mm_slli_epi32(_mm_sub_epi32(mid, side), shift); 10132 10133 left = _mm_srai_epi32(left, 16); 10134 right = _mm_srai_epi32(right, 16); 10135 10136 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8), drflac__mm_packs_interleaved_epi32(left, right)); 10137 } 10138 10139 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10140 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10141 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10142 10143 mid = (mid << 1) | (side & 0x01); 10144 10145 pOutputSamples[i*2+0] = (drflac_int16)(((mid + side) << shift) >> 16); 10146 pOutputSamples[i*2+1] = (drflac_int16)(((mid - side) << shift) >> 16); 10147 } 10148 } 10149 } 10150 #endif 10151 10152 #if defined(DRFLAC_SUPPORT_NEON) 10153 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_mid_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 10154 { 10155 drflac_uint64 i; 10156 drflac_uint64 frameCount4 = frameCount >> 2; 10157 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10158 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10159 drflac_uint32 shift = unusedBitsPerSample; 10160 int32x4_t wbpsShift0_4; /* wbps = Wasted Bits Per Sample */ 10161 int32x4_t wbpsShift1_4; /* wbps = Wasted Bits Per Sample */ 10162 10163 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 10164 10165 wbpsShift0_4 = vdupq_n_s32(pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 10166 wbpsShift1_4 = vdupq_n_s32(pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 10167 10168 if (shift == 0) { 10169 for (i = 0; i < frameCount4; ++i) { 10170 uint32x4_t mid; 10171 uint32x4_t side; 10172 int32x4_t left; 10173 int32x4_t right; 10174 10175 mid = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), wbpsShift0_4); 10176 side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), wbpsShift1_4); 10177 10178 mid = vorrq_u32(vshlq_n_u32(mid, 1), vandq_u32(side, vdupq_n_u32(1))); 10179 10180 left = vshrq_n_s32(vreinterpretq_s32_u32(vaddq_u32(mid, side)), 1); 10181 right = vshrq_n_s32(vreinterpretq_s32_u32(vsubq_u32(mid, side)), 1); 10182 10183 left = vshrq_n_s32(left, 16); 10184 right = vshrq_n_s32(right, 16); 10185 10186 drflac__vst2q_s16(pOutputSamples + i*8, vzip_s16(vmovn_s32(left), vmovn_s32(right))); 10187 } 10188 10189 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10190 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10191 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10192 10193 mid = (mid << 1) | (side & 0x01); 10194 10195 pOutputSamples[i*2+0] = (drflac_int16)(((drflac_int32)(mid + side) >> 1) >> 16); 10196 pOutputSamples[i*2+1] = (drflac_int16)(((drflac_int32)(mid - side) >> 1) >> 16); 10197 } 10198 } else { 10199 int32x4_t shift4; 10200 10201 shift -= 1; 10202 shift4 = vdupq_n_s32(shift); 10203 10204 for (i = 0; i < frameCount4; ++i) { 10205 uint32x4_t mid; 10206 uint32x4_t side; 10207 int32x4_t left; 10208 int32x4_t right; 10209 10210 mid = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), wbpsShift0_4); 10211 side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), wbpsShift1_4); 10212 10213 mid = vorrq_u32(vshlq_n_u32(mid, 1), vandq_u32(side, vdupq_n_u32(1))); 10214 10215 left = vreinterpretq_s32_u32(vshlq_u32(vaddq_u32(mid, side), shift4)); 10216 right = vreinterpretq_s32_u32(vshlq_u32(vsubq_u32(mid, side), shift4)); 10217 10218 left = vshrq_n_s32(left, 16); 10219 right = vshrq_n_s32(right, 16); 10220 10221 drflac__vst2q_s16(pOutputSamples + i*8, vzip_s16(vmovn_s32(left), vmovn_s32(right))); 10222 } 10223 10224 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10225 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10226 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10227 10228 mid = (mid << 1) | (side & 0x01); 10229 10230 pOutputSamples[i*2+0] = (drflac_int16)(((mid + side) << shift) >> 16); 10231 pOutputSamples[i*2+1] = (drflac_int16)(((mid - side) << shift) >> 16); 10232 } 10233 } 10234 } 10235 #endif 10236 10237 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_mid_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 10238 { 10239 #if defined(DRFLAC_SUPPORT_SSE2) 10240 if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) { 10241 drflac_read_pcm_frames_s16__decode_mid_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10242 } else 10243 #elif defined(DRFLAC_SUPPORT_NEON) 10244 if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) { 10245 drflac_read_pcm_frames_s16__decode_mid_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10246 } else 10247 #endif 10248 { 10249 /* Scalar fallback. */ 10250 #if 0 10251 drflac_read_pcm_frames_s16__decode_mid_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10252 #else 10253 drflac_read_pcm_frames_s16__decode_mid_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10254 #endif 10255 } 10256 } 10257 10258 10259 #if 0 10260 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_independent_stereo__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 10261 { 10262 for (drflac_uint64 i = 0; i < frameCount; ++i) { 10263 pOutputSamples[i*2+0] = (drflac_int16)((drflac_int32)((drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample)) >> 16); 10264 pOutputSamples[i*2+1] = (drflac_int16)((drflac_int32)((drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample)) >> 16); 10265 } 10266 } 10267 #endif 10268 10269 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_independent_stereo__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 10270 { 10271 drflac_uint64 i; 10272 drflac_uint64 frameCount4 = frameCount >> 2; 10273 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10274 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10275 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10276 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10277 10278 for (i = 0; i < frameCount4; ++i) { 10279 drflac_uint32 tempL0 = pInputSamples0U32[i*4+0] << shift0; 10280 drflac_uint32 tempL1 = pInputSamples0U32[i*4+1] << shift0; 10281 drflac_uint32 tempL2 = pInputSamples0U32[i*4+2] << shift0; 10282 drflac_uint32 tempL3 = pInputSamples0U32[i*4+3] << shift0; 10283 10284 drflac_uint32 tempR0 = pInputSamples1U32[i*4+0] << shift1; 10285 drflac_uint32 tempR1 = pInputSamples1U32[i*4+1] << shift1; 10286 drflac_uint32 tempR2 = pInputSamples1U32[i*4+2] << shift1; 10287 drflac_uint32 tempR3 = pInputSamples1U32[i*4+3] << shift1; 10288 10289 tempL0 >>= 16; 10290 tempL1 >>= 16; 10291 tempL2 >>= 16; 10292 tempL3 >>= 16; 10293 10294 tempR0 >>= 16; 10295 tempR1 >>= 16; 10296 tempR2 >>= 16; 10297 tempR3 >>= 16; 10298 10299 pOutputSamples[i*8+0] = (drflac_int16)tempL0; 10300 pOutputSamples[i*8+1] = (drflac_int16)tempR0; 10301 pOutputSamples[i*8+2] = (drflac_int16)tempL1; 10302 pOutputSamples[i*8+3] = (drflac_int16)tempR1; 10303 pOutputSamples[i*8+4] = (drflac_int16)tempL2; 10304 pOutputSamples[i*8+5] = (drflac_int16)tempR2; 10305 pOutputSamples[i*8+6] = (drflac_int16)tempL3; 10306 pOutputSamples[i*8+7] = (drflac_int16)tempR3; 10307 } 10308 10309 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10310 pOutputSamples[i*2+0] = (drflac_int16)((pInputSamples0U32[i] << shift0) >> 16); 10311 pOutputSamples[i*2+1] = (drflac_int16)((pInputSamples1U32[i] << shift1) >> 16); 10312 } 10313 } 10314 10315 #if defined(DRFLAC_SUPPORT_SSE2) 10316 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_independent_stereo__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 10317 { 10318 drflac_uint64 i; 10319 drflac_uint64 frameCount4 = frameCount >> 2; 10320 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10321 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10322 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10323 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10324 10325 for (i = 0; i < frameCount4; ++i) { 10326 __m128i left = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0); 10327 __m128i right = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1); 10328 10329 left = _mm_srai_epi32(left, 16); 10330 right = _mm_srai_epi32(right, 16); 10331 10332 /* At this point we have results. We can now pack and interleave these into a single __m128i object and then store the in the output buffer. */ 10333 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8), drflac__mm_packs_interleaved_epi32(left, right)); 10334 } 10335 10336 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10337 pOutputSamples[i*2+0] = (drflac_int16)((pInputSamples0U32[i] << shift0) >> 16); 10338 pOutputSamples[i*2+1] = (drflac_int16)((pInputSamples1U32[i] << shift1) >> 16); 10339 } 10340 } 10341 #endif 10342 10343 #if defined(DRFLAC_SUPPORT_NEON) 10344 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_independent_stereo__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 10345 { 10346 drflac_uint64 i; 10347 drflac_uint64 frameCount4 = frameCount >> 2; 10348 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10349 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10350 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10351 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10352 10353 int32x4_t shift0_4 = vdupq_n_s32(shift0); 10354 int32x4_t shift1_4 = vdupq_n_s32(shift1); 10355 10356 for (i = 0; i < frameCount4; ++i) { 10357 int32x4_t left; 10358 int32x4_t right; 10359 10360 left = vreinterpretq_s32_u32(vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4)); 10361 right = vreinterpretq_s32_u32(vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4)); 10362 10363 left = vshrq_n_s32(left, 16); 10364 right = vshrq_n_s32(right, 16); 10365 10366 drflac__vst2q_s16(pOutputSamples + i*8, vzip_s16(vmovn_s32(left), vmovn_s32(right))); 10367 } 10368 10369 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10370 pOutputSamples[i*2+0] = (drflac_int16)((pInputSamples0U32[i] << shift0) >> 16); 10371 pOutputSamples[i*2+1] = (drflac_int16)((pInputSamples1U32[i] << shift1) >> 16); 10372 } 10373 } 10374 #endif 10375 10376 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_independent_stereo(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 10377 { 10378 #if defined(DRFLAC_SUPPORT_SSE2) 10379 if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) { 10380 drflac_read_pcm_frames_s16__decode_independent_stereo__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10381 } else 10382 #elif defined(DRFLAC_SUPPORT_NEON) 10383 if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) { 10384 drflac_read_pcm_frames_s16__decode_independent_stereo__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10385 } else 10386 #endif 10387 { 10388 /* Scalar fallback. */ 10389 #if 0 10390 drflac_read_pcm_frames_s16__decode_independent_stereo__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10391 #else 10392 drflac_read_pcm_frames_s16__decode_independent_stereo__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10393 #endif 10394 } 10395 } 10396 10397 DRFLAC_API drflac_uint64 drflac_read_pcm_frames_s16(drflac* pFlac, drflac_uint64 framesToRead, drflac_int16* pBufferOut) 10398 { 10399 drflac_uint64 framesRead; 10400 drflac_uint32 unusedBitsPerSample; 10401 10402 if (pFlac == NULL || framesToRead == 0) { 10403 return 0; 10404 } 10405 10406 if (pBufferOut == NULL) { 10407 return drflac__seek_forward_by_pcm_frames(pFlac, framesToRead); 10408 } 10409 10410 DRFLAC_ASSERT(pFlac->bitsPerSample <= 32); 10411 unusedBitsPerSample = 32 - pFlac->bitsPerSample; 10412 10413 framesRead = 0; 10414 while (framesToRead > 0) { 10415 /* If we've run out of samples in this frame, go to the next. */ 10416 if (pFlac->currentFLACFrame.pcmFramesRemaining == 0) { 10417 if (!drflac__read_and_decode_next_flac_frame(pFlac)) { 10418 break; /* Couldn't read the next frame, so just break from the loop and return. */ 10419 } 10420 } else { 10421 unsigned int channelCount = drflac__get_channel_count_from_channel_assignment(pFlac->currentFLACFrame.header.channelAssignment); 10422 drflac_uint64 iFirstPCMFrame = pFlac->currentFLACFrame.header.blockSizeInPCMFrames - pFlac->currentFLACFrame.pcmFramesRemaining; 10423 drflac_uint64 frameCountThisIteration = framesToRead; 10424 10425 if (frameCountThisIteration > pFlac->currentFLACFrame.pcmFramesRemaining) { 10426 frameCountThisIteration = pFlac->currentFLACFrame.pcmFramesRemaining; 10427 } 10428 10429 if (channelCount == 2) { 10430 const drflac_int32* pDecodedSamples0 = pFlac->currentFLACFrame.subframes[0].pSamplesS32 + iFirstPCMFrame; 10431 const drflac_int32* pDecodedSamples1 = pFlac->currentFLACFrame.subframes[1].pSamplesS32 + iFirstPCMFrame; 10432 10433 switch (pFlac->currentFLACFrame.header.channelAssignment) 10434 { 10435 case DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE: 10436 { 10437 drflac_read_pcm_frames_s16__decode_left_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut); 10438 } break; 10439 10440 case DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE: 10441 { 10442 drflac_read_pcm_frames_s16__decode_right_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut); 10443 } break; 10444 10445 case DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE: 10446 { 10447 drflac_read_pcm_frames_s16__decode_mid_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut); 10448 } break; 10449 10450 case DRFLAC_CHANNEL_ASSIGNMENT_INDEPENDENT: 10451 default: 10452 { 10453 drflac_read_pcm_frames_s16__decode_independent_stereo(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut); 10454 } break; 10455 } 10456 } else { 10457 /* Generic interleaving. */ 10458 drflac_uint64 i; 10459 for (i = 0; i < frameCountThisIteration; ++i) { 10460 unsigned int j; 10461 for (j = 0; j < channelCount; ++j) { 10462 drflac_int32 sampleS32 = (drflac_int32)((drflac_uint32)(pFlac->currentFLACFrame.subframes[j].pSamplesS32[iFirstPCMFrame + i]) << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[j].wastedBitsPerSample)); 10463 pBufferOut[(i*channelCount)+j] = (drflac_int16)(sampleS32 >> 16); 10464 } 10465 } 10466 } 10467 10468 framesRead += frameCountThisIteration; 10469 pBufferOut += frameCountThisIteration * channelCount; 10470 framesToRead -= frameCountThisIteration; 10471 pFlac->currentPCMFrame += frameCountThisIteration; 10472 pFlac->currentFLACFrame.pcmFramesRemaining -= (drflac_uint32)frameCountThisIteration; 10473 } 10474 } 10475 10476 return framesRead; 10477 } 10478 10479 10480 #if 0 10481 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_left_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10482 { 10483 drflac_uint64 i; 10484 for (i = 0; i < frameCount; ++i) { 10485 drflac_uint32 left = (drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 10486 drflac_uint32 side = (drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 10487 drflac_uint32 right = left - side; 10488 10489 pOutputSamples[i*2+0] = (float)((drflac_int32)left / 2147483648.0); 10490 pOutputSamples[i*2+1] = (float)((drflac_int32)right / 2147483648.0); 10491 } 10492 } 10493 #endif 10494 10495 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_left_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10496 { 10497 drflac_uint64 i; 10498 drflac_uint64 frameCount4 = frameCount >> 2; 10499 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10500 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10501 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10502 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10503 10504 float factor = 1 / 2147483648.0; 10505 10506 for (i = 0; i < frameCount4; ++i) { 10507 drflac_uint32 left0 = pInputSamples0U32[i*4+0] << shift0; 10508 drflac_uint32 left1 = pInputSamples0U32[i*4+1] << shift0; 10509 drflac_uint32 left2 = pInputSamples0U32[i*4+2] << shift0; 10510 drflac_uint32 left3 = pInputSamples0U32[i*4+3] << shift0; 10511 10512 drflac_uint32 side0 = pInputSamples1U32[i*4+0] << shift1; 10513 drflac_uint32 side1 = pInputSamples1U32[i*4+1] << shift1; 10514 drflac_uint32 side2 = pInputSamples1U32[i*4+2] << shift1; 10515 drflac_uint32 side3 = pInputSamples1U32[i*4+3] << shift1; 10516 10517 drflac_uint32 right0 = left0 - side0; 10518 drflac_uint32 right1 = left1 - side1; 10519 drflac_uint32 right2 = left2 - side2; 10520 drflac_uint32 right3 = left3 - side3; 10521 10522 pOutputSamples[i*8+0] = (drflac_int32)left0 * factor; 10523 pOutputSamples[i*8+1] = (drflac_int32)right0 * factor; 10524 pOutputSamples[i*8+2] = (drflac_int32)left1 * factor; 10525 pOutputSamples[i*8+3] = (drflac_int32)right1 * factor; 10526 pOutputSamples[i*8+4] = (drflac_int32)left2 * factor; 10527 pOutputSamples[i*8+5] = (drflac_int32)right2 * factor; 10528 pOutputSamples[i*8+6] = (drflac_int32)left3 * factor; 10529 pOutputSamples[i*8+7] = (drflac_int32)right3 * factor; 10530 } 10531 10532 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10533 drflac_uint32 left = pInputSamples0U32[i] << shift0; 10534 drflac_uint32 side = pInputSamples1U32[i] << shift1; 10535 drflac_uint32 right = left - side; 10536 10537 pOutputSamples[i*2+0] = (drflac_int32)left * factor; 10538 pOutputSamples[i*2+1] = (drflac_int32)right * factor; 10539 } 10540 } 10541 10542 #if defined(DRFLAC_SUPPORT_SSE2) 10543 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_left_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10544 { 10545 drflac_uint64 i; 10546 drflac_uint64 frameCount4 = frameCount >> 2; 10547 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10548 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10549 drflac_uint32 shift0 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample) - 8; 10550 drflac_uint32 shift1 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample) - 8; 10551 __m128 factor; 10552 10553 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 10554 10555 factor = _mm_set1_ps(1.0f / 8388608.0f); 10556 10557 for (i = 0; i < frameCount4; ++i) { 10558 __m128i left = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0); 10559 __m128i side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1); 10560 __m128i right = _mm_sub_epi32(left, side); 10561 __m128 leftf = _mm_mul_ps(_mm_cvtepi32_ps(left), factor); 10562 __m128 rightf = _mm_mul_ps(_mm_cvtepi32_ps(right), factor); 10563 10564 _mm_storeu_ps(pOutputSamples + i*8 + 0, _mm_unpacklo_ps(leftf, rightf)); 10565 _mm_storeu_ps(pOutputSamples + i*8 + 4, _mm_unpackhi_ps(leftf, rightf)); 10566 } 10567 10568 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10569 drflac_uint32 left = pInputSamples0U32[i] << shift0; 10570 drflac_uint32 side = pInputSamples1U32[i] << shift1; 10571 drflac_uint32 right = left - side; 10572 10573 pOutputSamples[i*2+0] = (drflac_int32)left / 8388608.0f; 10574 pOutputSamples[i*2+1] = (drflac_int32)right / 8388608.0f; 10575 } 10576 } 10577 #endif 10578 10579 #if defined(DRFLAC_SUPPORT_NEON) 10580 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_left_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10581 { 10582 drflac_uint64 i; 10583 drflac_uint64 frameCount4 = frameCount >> 2; 10584 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10585 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10586 drflac_uint32 shift0 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample) - 8; 10587 drflac_uint32 shift1 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample) - 8; 10588 float32x4_t factor4; 10589 int32x4_t shift0_4; 10590 int32x4_t shift1_4; 10591 10592 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 10593 10594 factor4 = vdupq_n_f32(1.0f / 8388608.0f); 10595 shift0_4 = vdupq_n_s32(shift0); 10596 shift1_4 = vdupq_n_s32(shift1); 10597 10598 for (i = 0; i < frameCount4; ++i) { 10599 uint32x4_t left; 10600 uint32x4_t side; 10601 uint32x4_t right; 10602 float32x4_t leftf; 10603 float32x4_t rightf; 10604 10605 left = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4); 10606 side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4); 10607 right = vsubq_u32(left, side); 10608 leftf = vmulq_f32(vcvtq_f32_s32(vreinterpretq_s32_u32(left)), factor4); 10609 rightf = vmulq_f32(vcvtq_f32_s32(vreinterpretq_s32_u32(right)), factor4); 10610 10611 drflac__vst2q_f32(pOutputSamples + i*8, vzipq_f32(leftf, rightf)); 10612 } 10613 10614 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10615 drflac_uint32 left = pInputSamples0U32[i] << shift0; 10616 drflac_uint32 side = pInputSamples1U32[i] << shift1; 10617 drflac_uint32 right = left - side; 10618 10619 pOutputSamples[i*2+0] = (drflac_int32)left / 8388608.0f; 10620 pOutputSamples[i*2+1] = (drflac_int32)right / 8388608.0f; 10621 } 10622 } 10623 #endif 10624 10625 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_left_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10626 { 10627 #if defined(DRFLAC_SUPPORT_SSE2) 10628 if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) { 10629 drflac_read_pcm_frames_f32__decode_left_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10630 } else 10631 #elif defined(DRFLAC_SUPPORT_NEON) 10632 if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) { 10633 drflac_read_pcm_frames_f32__decode_left_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10634 } else 10635 #endif 10636 { 10637 /* Scalar fallback. */ 10638 #if 0 10639 drflac_read_pcm_frames_f32__decode_left_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10640 #else 10641 drflac_read_pcm_frames_f32__decode_left_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10642 #endif 10643 } 10644 } 10645 10646 10647 #if 0 10648 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_right_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10649 { 10650 drflac_uint64 i; 10651 for (i = 0; i < frameCount; ++i) { 10652 drflac_uint32 side = (drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 10653 drflac_uint32 right = (drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 10654 drflac_uint32 left = right + side; 10655 10656 pOutputSamples[i*2+0] = (float)((drflac_int32)left / 2147483648.0); 10657 pOutputSamples[i*2+1] = (float)((drflac_int32)right / 2147483648.0); 10658 } 10659 } 10660 #endif 10661 10662 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_right_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10663 { 10664 drflac_uint64 i; 10665 drflac_uint64 frameCount4 = frameCount >> 2; 10666 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10667 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10668 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10669 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10670 float factor = 1 / 2147483648.0; 10671 10672 for (i = 0; i < frameCount4; ++i) { 10673 drflac_uint32 side0 = pInputSamples0U32[i*4+0] << shift0; 10674 drflac_uint32 side1 = pInputSamples0U32[i*4+1] << shift0; 10675 drflac_uint32 side2 = pInputSamples0U32[i*4+2] << shift0; 10676 drflac_uint32 side3 = pInputSamples0U32[i*4+3] << shift0; 10677 10678 drflac_uint32 right0 = pInputSamples1U32[i*4+0] << shift1; 10679 drflac_uint32 right1 = pInputSamples1U32[i*4+1] << shift1; 10680 drflac_uint32 right2 = pInputSamples1U32[i*4+2] << shift1; 10681 drflac_uint32 right3 = pInputSamples1U32[i*4+3] << shift1; 10682 10683 drflac_uint32 left0 = right0 + side0; 10684 drflac_uint32 left1 = right1 + side1; 10685 drflac_uint32 left2 = right2 + side2; 10686 drflac_uint32 left3 = right3 + side3; 10687 10688 pOutputSamples[i*8+0] = (drflac_int32)left0 * factor; 10689 pOutputSamples[i*8+1] = (drflac_int32)right0 * factor; 10690 pOutputSamples[i*8+2] = (drflac_int32)left1 * factor; 10691 pOutputSamples[i*8+3] = (drflac_int32)right1 * factor; 10692 pOutputSamples[i*8+4] = (drflac_int32)left2 * factor; 10693 pOutputSamples[i*8+5] = (drflac_int32)right2 * factor; 10694 pOutputSamples[i*8+6] = (drflac_int32)left3 * factor; 10695 pOutputSamples[i*8+7] = (drflac_int32)right3 * factor; 10696 } 10697 10698 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10699 drflac_uint32 side = pInputSamples0U32[i] << shift0; 10700 drflac_uint32 right = pInputSamples1U32[i] << shift1; 10701 drflac_uint32 left = right + side; 10702 10703 pOutputSamples[i*2+0] = (drflac_int32)left * factor; 10704 pOutputSamples[i*2+1] = (drflac_int32)right * factor; 10705 } 10706 } 10707 10708 #if defined(DRFLAC_SUPPORT_SSE2) 10709 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_right_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10710 { 10711 drflac_uint64 i; 10712 drflac_uint64 frameCount4 = frameCount >> 2; 10713 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10714 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10715 drflac_uint32 shift0 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample) - 8; 10716 drflac_uint32 shift1 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample) - 8; 10717 __m128 factor; 10718 10719 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 10720 10721 factor = _mm_set1_ps(1.0f / 8388608.0f); 10722 10723 for (i = 0; i < frameCount4; ++i) { 10724 __m128i side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0); 10725 __m128i right = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1); 10726 __m128i left = _mm_add_epi32(right, side); 10727 __m128 leftf = _mm_mul_ps(_mm_cvtepi32_ps(left), factor); 10728 __m128 rightf = _mm_mul_ps(_mm_cvtepi32_ps(right), factor); 10729 10730 _mm_storeu_ps(pOutputSamples + i*8 + 0, _mm_unpacklo_ps(leftf, rightf)); 10731 _mm_storeu_ps(pOutputSamples + i*8 + 4, _mm_unpackhi_ps(leftf, rightf)); 10732 } 10733 10734 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10735 drflac_uint32 side = pInputSamples0U32[i] << shift0; 10736 drflac_uint32 right = pInputSamples1U32[i] << shift1; 10737 drflac_uint32 left = right + side; 10738 10739 pOutputSamples[i*2+0] = (drflac_int32)left / 8388608.0f; 10740 pOutputSamples[i*2+1] = (drflac_int32)right / 8388608.0f; 10741 } 10742 } 10743 #endif 10744 10745 #if defined(DRFLAC_SUPPORT_NEON) 10746 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_right_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10747 { 10748 drflac_uint64 i; 10749 drflac_uint64 frameCount4 = frameCount >> 2; 10750 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10751 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10752 drflac_uint32 shift0 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample) - 8; 10753 drflac_uint32 shift1 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample) - 8; 10754 float32x4_t factor4; 10755 int32x4_t shift0_4; 10756 int32x4_t shift1_4; 10757 10758 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 10759 10760 factor4 = vdupq_n_f32(1.0f / 8388608.0f); 10761 shift0_4 = vdupq_n_s32(shift0); 10762 shift1_4 = vdupq_n_s32(shift1); 10763 10764 for (i = 0; i < frameCount4; ++i) { 10765 uint32x4_t side; 10766 uint32x4_t right; 10767 uint32x4_t left; 10768 float32x4_t leftf; 10769 float32x4_t rightf; 10770 10771 side = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4); 10772 right = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4); 10773 left = vaddq_u32(right, side); 10774 leftf = vmulq_f32(vcvtq_f32_s32(vreinterpretq_s32_u32(left)), factor4); 10775 rightf = vmulq_f32(vcvtq_f32_s32(vreinterpretq_s32_u32(right)), factor4); 10776 10777 drflac__vst2q_f32(pOutputSamples + i*8, vzipq_f32(leftf, rightf)); 10778 } 10779 10780 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10781 drflac_uint32 side = pInputSamples0U32[i] << shift0; 10782 drflac_uint32 right = pInputSamples1U32[i] << shift1; 10783 drflac_uint32 left = right + side; 10784 10785 pOutputSamples[i*2+0] = (drflac_int32)left / 8388608.0f; 10786 pOutputSamples[i*2+1] = (drflac_int32)right / 8388608.0f; 10787 } 10788 } 10789 #endif 10790 10791 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_right_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10792 { 10793 #if defined(DRFLAC_SUPPORT_SSE2) 10794 if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) { 10795 drflac_read_pcm_frames_f32__decode_right_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10796 } else 10797 #elif defined(DRFLAC_SUPPORT_NEON) 10798 if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) { 10799 drflac_read_pcm_frames_f32__decode_right_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10800 } else 10801 #endif 10802 { 10803 /* Scalar fallback. */ 10804 #if 0 10805 drflac_read_pcm_frames_f32__decode_right_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10806 #else 10807 drflac_read_pcm_frames_f32__decode_right_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10808 #endif 10809 } 10810 } 10811 10812 10813 #if 0 10814 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_mid_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10815 { 10816 for (drflac_uint64 i = 0; i < frameCount; ++i) { 10817 drflac_uint32 mid = (drflac_uint32)pInputSamples0[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10818 drflac_uint32 side = (drflac_uint32)pInputSamples1[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10819 10820 mid = (mid << 1) | (side & 0x01); 10821 10822 pOutputSamples[i*2+0] = (float)((((drflac_int32)(mid + side) >> 1) << (unusedBitsPerSample)) / 2147483648.0); 10823 pOutputSamples[i*2+1] = (float)((((drflac_int32)(mid - side) >> 1) << (unusedBitsPerSample)) / 2147483648.0); 10824 } 10825 } 10826 #endif 10827 10828 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_mid_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10829 { 10830 drflac_uint64 i; 10831 drflac_uint64 frameCount4 = frameCount >> 2; 10832 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10833 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10834 drflac_uint32 shift = unusedBitsPerSample; 10835 float factor = 1 / 2147483648.0; 10836 10837 if (shift > 0) { 10838 shift -= 1; 10839 for (i = 0; i < frameCount4; ++i) { 10840 drflac_uint32 temp0L; 10841 drflac_uint32 temp1L; 10842 drflac_uint32 temp2L; 10843 drflac_uint32 temp3L; 10844 drflac_uint32 temp0R; 10845 drflac_uint32 temp1R; 10846 drflac_uint32 temp2R; 10847 drflac_uint32 temp3R; 10848 10849 drflac_uint32 mid0 = pInputSamples0U32[i*4+0] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10850 drflac_uint32 mid1 = pInputSamples0U32[i*4+1] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10851 drflac_uint32 mid2 = pInputSamples0U32[i*4+2] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10852 drflac_uint32 mid3 = pInputSamples0U32[i*4+3] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10853 10854 drflac_uint32 side0 = pInputSamples1U32[i*4+0] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10855 drflac_uint32 side1 = pInputSamples1U32[i*4+1] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10856 drflac_uint32 side2 = pInputSamples1U32[i*4+2] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10857 drflac_uint32 side3 = pInputSamples1U32[i*4+3] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10858 10859 mid0 = (mid0 << 1) | (side0 & 0x01); 10860 mid1 = (mid1 << 1) | (side1 & 0x01); 10861 mid2 = (mid2 << 1) | (side2 & 0x01); 10862 mid3 = (mid3 << 1) | (side3 & 0x01); 10863 10864 temp0L = (mid0 + side0) << shift; 10865 temp1L = (mid1 + side1) << shift; 10866 temp2L = (mid2 + side2) << shift; 10867 temp3L = (mid3 + side3) << shift; 10868 10869 temp0R = (mid0 - side0) << shift; 10870 temp1R = (mid1 - side1) << shift; 10871 temp2R = (mid2 - side2) << shift; 10872 temp3R = (mid3 - side3) << shift; 10873 10874 pOutputSamples[i*8+0] = (drflac_int32)temp0L * factor; 10875 pOutputSamples[i*8+1] = (drflac_int32)temp0R * factor; 10876 pOutputSamples[i*8+2] = (drflac_int32)temp1L * factor; 10877 pOutputSamples[i*8+3] = (drflac_int32)temp1R * factor; 10878 pOutputSamples[i*8+4] = (drflac_int32)temp2L * factor; 10879 pOutputSamples[i*8+5] = (drflac_int32)temp2R * factor; 10880 pOutputSamples[i*8+6] = (drflac_int32)temp3L * factor; 10881 pOutputSamples[i*8+7] = (drflac_int32)temp3R * factor; 10882 } 10883 } else { 10884 for (i = 0; i < frameCount4; ++i) { 10885 drflac_uint32 temp0L; 10886 drflac_uint32 temp1L; 10887 drflac_uint32 temp2L; 10888 drflac_uint32 temp3L; 10889 drflac_uint32 temp0R; 10890 drflac_uint32 temp1R; 10891 drflac_uint32 temp2R; 10892 drflac_uint32 temp3R; 10893 10894 drflac_uint32 mid0 = pInputSamples0U32[i*4+0] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10895 drflac_uint32 mid1 = pInputSamples0U32[i*4+1] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10896 drflac_uint32 mid2 = pInputSamples0U32[i*4+2] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10897 drflac_uint32 mid3 = pInputSamples0U32[i*4+3] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10898 10899 drflac_uint32 side0 = pInputSamples1U32[i*4+0] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10900 drflac_uint32 side1 = pInputSamples1U32[i*4+1] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10901 drflac_uint32 side2 = pInputSamples1U32[i*4+2] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10902 drflac_uint32 side3 = pInputSamples1U32[i*4+3] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10903 10904 mid0 = (mid0 << 1) | (side0 & 0x01); 10905 mid1 = (mid1 << 1) | (side1 & 0x01); 10906 mid2 = (mid2 << 1) | (side2 & 0x01); 10907 mid3 = (mid3 << 1) | (side3 & 0x01); 10908 10909 temp0L = (drflac_uint32)((drflac_int32)(mid0 + side0) >> 1); 10910 temp1L = (drflac_uint32)((drflac_int32)(mid1 + side1) >> 1); 10911 temp2L = (drflac_uint32)((drflac_int32)(mid2 + side2) >> 1); 10912 temp3L = (drflac_uint32)((drflac_int32)(mid3 + side3) >> 1); 10913 10914 temp0R = (drflac_uint32)((drflac_int32)(mid0 - side0) >> 1); 10915 temp1R = (drflac_uint32)((drflac_int32)(mid1 - side1) >> 1); 10916 temp2R = (drflac_uint32)((drflac_int32)(mid2 - side2) >> 1); 10917 temp3R = (drflac_uint32)((drflac_int32)(mid3 - side3) >> 1); 10918 10919 pOutputSamples[i*8+0] = (drflac_int32)temp0L * factor; 10920 pOutputSamples[i*8+1] = (drflac_int32)temp0R * factor; 10921 pOutputSamples[i*8+2] = (drflac_int32)temp1L * factor; 10922 pOutputSamples[i*8+3] = (drflac_int32)temp1R * factor; 10923 pOutputSamples[i*8+4] = (drflac_int32)temp2L * factor; 10924 pOutputSamples[i*8+5] = (drflac_int32)temp2R * factor; 10925 pOutputSamples[i*8+6] = (drflac_int32)temp3L * factor; 10926 pOutputSamples[i*8+7] = (drflac_int32)temp3R * factor; 10927 } 10928 } 10929 10930 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10931 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10932 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10933 10934 mid = (mid << 1) | (side & 0x01); 10935 10936 pOutputSamples[i*2+0] = (drflac_int32)((drflac_uint32)((drflac_int32)(mid + side) >> 1) << unusedBitsPerSample) * factor; 10937 pOutputSamples[i*2+1] = (drflac_int32)((drflac_uint32)((drflac_int32)(mid - side) >> 1) << unusedBitsPerSample) * factor; 10938 } 10939 } 10940 10941 #if defined(DRFLAC_SUPPORT_SSE2) 10942 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_mid_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10943 { 10944 drflac_uint64 i; 10945 drflac_uint64 frameCount4 = frameCount >> 2; 10946 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10947 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10948 drflac_uint32 shift = unusedBitsPerSample - 8; 10949 float factor; 10950 __m128 factor128; 10951 10952 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 10953 10954 factor = 1.0f / 8388608.0f; 10955 factor128 = _mm_set1_ps(factor); 10956 10957 if (shift == 0) { 10958 for (i = 0; i < frameCount4; ++i) { 10959 __m128i mid; 10960 __m128i side; 10961 __m128i tempL; 10962 __m128i tempR; 10963 __m128 leftf; 10964 __m128 rightf; 10965 10966 mid = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 10967 side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 10968 10969 mid = _mm_or_si128(_mm_slli_epi32(mid, 1), _mm_and_si128(side, _mm_set1_epi32(0x01))); 10970 10971 tempL = _mm_srai_epi32(_mm_add_epi32(mid, side), 1); 10972 tempR = _mm_srai_epi32(_mm_sub_epi32(mid, side), 1); 10973 10974 leftf = _mm_mul_ps(_mm_cvtepi32_ps(tempL), factor128); 10975 rightf = _mm_mul_ps(_mm_cvtepi32_ps(tempR), factor128); 10976 10977 _mm_storeu_ps(pOutputSamples + i*8 + 0, _mm_unpacklo_ps(leftf, rightf)); 10978 _mm_storeu_ps(pOutputSamples + i*8 + 4, _mm_unpackhi_ps(leftf, rightf)); 10979 } 10980 10981 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10982 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10983 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10984 10985 mid = (mid << 1) | (side & 0x01); 10986 10987 pOutputSamples[i*2+0] = ((drflac_int32)(mid + side) >> 1) * factor; 10988 pOutputSamples[i*2+1] = ((drflac_int32)(mid - side) >> 1) * factor; 10989 } 10990 } else { 10991 shift -= 1; 10992 for (i = 0; i < frameCount4; ++i) { 10993 __m128i mid; 10994 __m128i side; 10995 __m128i tempL; 10996 __m128i tempR; 10997 __m128 leftf; 10998 __m128 rightf; 10999 11000 mid = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 11001 side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 11002 11003 mid = _mm_or_si128(_mm_slli_epi32(mid, 1), _mm_and_si128(side, _mm_set1_epi32(0x01))); 11004 11005 tempL = _mm_slli_epi32(_mm_add_epi32(mid, side), shift); 11006 tempR = _mm_slli_epi32(_mm_sub_epi32(mid, side), shift); 11007 11008 leftf = _mm_mul_ps(_mm_cvtepi32_ps(tempL), factor128); 11009 rightf = _mm_mul_ps(_mm_cvtepi32_ps(tempR), factor128); 11010 11011 _mm_storeu_ps(pOutputSamples + i*8 + 0, _mm_unpacklo_ps(leftf, rightf)); 11012 _mm_storeu_ps(pOutputSamples + i*8 + 4, _mm_unpackhi_ps(leftf, rightf)); 11013 } 11014 11015 for (i = (frameCount4 << 2); i < frameCount; ++i) { 11016 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 11017 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 11018 11019 mid = (mid << 1) | (side & 0x01); 11020 11021 pOutputSamples[i*2+0] = (drflac_int32)((mid + side) << shift) * factor; 11022 pOutputSamples[i*2+1] = (drflac_int32)((mid - side) << shift) * factor; 11023 } 11024 } 11025 } 11026 #endif 11027 11028 #if defined(DRFLAC_SUPPORT_NEON) 11029 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_mid_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 11030 { 11031 drflac_uint64 i; 11032 drflac_uint64 frameCount4 = frameCount >> 2; 11033 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 11034 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 11035 drflac_uint32 shift = unusedBitsPerSample - 8; 11036 float factor; 11037 float32x4_t factor4; 11038 int32x4_t shift4; 11039 int32x4_t wbps0_4; /* Wasted Bits Per Sample */ 11040 int32x4_t wbps1_4; /* Wasted Bits Per Sample */ 11041 11042 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 11043 11044 factor = 1.0f / 8388608.0f; 11045 factor4 = vdupq_n_f32(factor); 11046 wbps0_4 = vdupq_n_s32(pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 11047 wbps1_4 = vdupq_n_s32(pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 11048 11049 if (shift == 0) { 11050 for (i = 0; i < frameCount4; ++i) { 11051 int32x4_t lefti; 11052 int32x4_t righti; 11053 float32x4_t leftf; 11054 float32x4_t rightf; 11055 11056 uint32x4_t mid = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), wbps0_4); 11057 uint32x4_t side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), wbps1_4); 11058 11059 mid = vorrq_u32(vshlq_n_u32(mid, 1), vandq_u32(side, vdupq_n_u32(1))); 11060 11061 lefti = vshrq_n_s32(vreinterpretq_s32_u32(vaddq_u32(mid, side)), 1); 11062 righti = vshrq_n_s32(vreinterpretq_s32_u32(vsubq_u32(mid, side)), 1); 11063 11064 leftf = vmulq_f32(vcvtq_f32_s32(lefti), factor4); 11065 rightf = vmulq_f32(vcvtq_f32_s32(righti), factor4); 11066 11067 drflac__vst2q_f32(pOutputSamples + i*8, vzipq_f32(leftf, rightf)); 11068 } 11069 11070 for (i = (frameCount4 << 2); i < frameCount; ++i) { 11071 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 11072 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 11073 11074 mid = (mid << 1) | (side & 0x01); 11075 11076 pOutputSamples[i*2+0] = ((drflac_int32)(mid + side) >> 1) * factor; 11077 pOutputSamples[i*2+1] = ((drflac_int32)(mid - side) >> 1) * factor; 11078 } 11079 } else { 11080 shift -= 1; 11081 shift4 = vdupq_n_s32(shift); 11082 for (i = 0; i < frameCount4; ++i) { 11083 uint32x4_t mid; 11084 uint32x4_t side; 11085 int32x4_t lefti; 11086 int32x4_t righti; 11087 float32x4_t leftf; 11088 float32x4_t rightf; 11089 11090 mid = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), wbps0_4); 11091 side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), wbps1_4); 11092 11093 mid = vorrq_u32(vshlq_n_u32(mid, 1), vandq_u32(side, vdupq_n_u32(1))); 11094 11095 lefti = vreinterpretq_s32_u32(vshlq_u32(vaddq_u32(mid, side), shift4)); 11096 righti = vreinterpretq_s32_u32(vshlq_u32(vsubq_u32(mid, side), shift4)); 11097 11098 leftf = vmulq_f32(vcvtq_f32_s32(lefti), factor4); 11099 rightf = vmulq_f32(vcvtq_f32_s32(righti), factor4); 11100 11101 drflac__vst2q_f32(pOutputSamples + i*8, vzipq_f32(leftf, rightf)); 11102 } 11103 11104 for (i = (frameCount4 << 2); i < frameCount; ++i) { 11105 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 11106 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 11107 11108 mid = (mid << 1) | (side & 0x01); 11109 11110 pOutputSamples[i*2+0] = (drflac_int32)((mid + side) << shift) * factor; 11111 pOutputSamples[i*2+1] = (drflac_int32)((mid - side) << shift) * factor; 11112 } 11113 } 11114 } 11115 #endif 11116 11117 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_mid_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 11118 { 11119 #if defined(DRFLAC_SUPPORT_SSE2) 11120 if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) { 11121 drflac_read_pcm_frames_f32__decode_mid_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 11122 } else 11123 #elif defined(DRFLAC_SUPPORT_NEON) 11124 if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) { 11125 drflac_read_pcm_frames_f32__decode_mid_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 11126 } else 11127 #endif 11128 { 11129 /* Scalar fallback. */ 11130 #if 0 11131 drflac_read_pcm_frames_f32__decode_mid_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 11132 #else 11133 drflac_read_pcm_frames_f32__decode_mid_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 11134 #endif 11135 } 11136 } 11137 11138 #if 0 11139 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_independent_stereo__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 11140 { 11141 for (drflac_uint64 i = 0; i < frameCount; ++i) { 11142 pOutputSamples[i*2+0] = (float)((drflac_int32)((drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample)) / 2147483648.0); 11143 pOutputSamples[i*2+1] = (float)((drflac_int32)((drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample)) / 2147483648.0); 11144 } 11145 } 11146 #endif 11147 11148 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_independent_stereo__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 11149 { 11150 drflac_uint64 i; 11151 drflac_uint64 frameCount4 = frameCount >> 2; 11152 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 11153 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 11154 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 11155 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 11156 float factor = 1 / 2147483648.0; 11157 11158 for (i = 0; i < frameCount4; ++i) { 11159 drflac_uint32 tempL0 = pInputSamples0U32[i*4+0] << shift0; 11160 drflac_uint32 tempL1 = pInputSamples0U32[i*4+1] << shift0; 11161 drflac_uint32 tempL2 = pInputSamples0U32[i*4+2] << shift0; 11162 drflac_uint32 tempL3 = pInputSamples0U32[i*4+3] << shift0; 11163 11164 drflac_uint32 tempR0 = pInputSamples1U32[i*4+0] << shift1; 11165 drflac_uint32 tempR1 = pInputSamples1U32[i*4+1] << shift1; 11166 drflac_uint32 tempR2 = pInputSamples1U32[i*4+2] << shift1; 11167 drflac_uint32 tempR3 = pInputSamples1U32[i*4+3] << shift1; 11168 11169 pOutputSamples[i*8+0] = (drflac_int32)tempL0 * factor; 11170 pOutputSamples[i*8+1] = (drflac_int32)tempR0 * factor; 11171 pOutputSamples[i*8+2] = (drflac_int32)tempL1 * factor; 11172 pOutputSamples[i*8+3] = (drflac_int32)tempR1 * factor; 11173 pOutputSamples[i*8+4] = (drflac_int32)tempL2 * factor; 11174 pOutputSamples[i*8+5] = (drflac_int32)tempR2 * factor; 11175 pOutputSamples[i*8+6] = (drflac_int32)tempL3 * factor; 11176 pOutputSamples[i*8+7] = (drflac_int32)tempR3 * factor; 11177 } 11178 11179 for (i = (frameCount4 << 2); i < frameCount; ++i) { 11180 pOutputSamples[i*2+0] = (drflac_int32)(pInputSamples0U32[i] << shift0) * factor; 11181 pOutputSamples[i*2+1] = (drflac_int32)(pInputSamples1U32[i] << shift1) * factor; 11182 } 11183 } 11184 11185 #if defined(DRFLAC_SUPPORT_SSE2) 11186 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_independent_stereo__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 11187 { 11188 drflac_uint64 i; 11189 drflac_uint64 frameCount4 = frameCount >> 2; 11190 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 11191 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 11192 drflac_uint32 shift0 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample) - 8; 11193 drflac_uint32 shift1 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample) - 8; 11194 11195 float factor = 1.0f / 8388608.0f; 11196 __m128 factor128 = _mm_set1_ps(factor); 11197 11198 for (i = 0; i < frameCount4; ++i) { 11199 __m128i lefti; 11200 __m128i righti; 11201 __m128 leftf; 11202 __m128 rightf; 11203 11204 lefti = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0); 11205 righti = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1); 11206 11207 leftf = _mm_mul_ps(_mm_cvtepi32_ps(lefti), factor128); 11208 rightf = _mm_mul_ps(_mm_cvtepi32_ps(righti), factor128); 11209 11210 _mm_storeu_ps(pOutputSamples + i*8 + 0, _mm_unpacklo_ps(leftf, rightf)); 11211 _mm_storeu_ps(pOutputSamples + i*8 + 4, _mm_unpackhi_ps(leftf, rightf)); 11212 } 11213 11214 for (i = (frameCount4 << 2); i < frameCount; ++i) { 11215 pOutputSamples[i*2+0] = (drflac_int32)(pInputSamples0U32[i] << shift0) * factor; 11216 pOutputSamples[i*2+1] = (drflac_int32)(pInputSamples1U32[i] << shift1) * factor; 11217 } 11218 } 11219 #endif 11220 11221 #if defined(DRFLAC_SUPPORT_NEON) 11222 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_independent_stereo__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 11223 { 11224 drflac_uint64 i; 11225 drflac_uint64 frameCount4 = frameCount >> 2; 11226 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 11227 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 11228 drflac_uint32 shift0 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample) - 8; 11229 drflac_uint32 shift1 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample) - 8; 11230 11231 float factor = 1.0f / 8388608.0f; 11232 float32x4_t factor4 = vdupq_n_f32(factor); 11233 int32x4_t shift0_4 = vdupq_n_s32(shift0); 11234 int32x4_t shift1_4 = vdupq_n_s32(shift1); 11235 11236 for (i = 0; i < frameCount4; ++i) { 11237 int32x4_t lefti; 11238 int32x4_t righti; 11239 float32x4_t leftf; 11240 float32x4_t rightf; 11241 11242 lefti = vreinterpretq_s32_u32(vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4)); 11243 righti = vreinterpretq_s32_u32(vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4)); 11244 11245 leftf = vmulq_f32(vcvtq_f32_s32(lefti), factor4); 11246 rightf = vmulq_f32(vcvtq_f32_s32(righti), factor4); 11247 11248 drflac__vst2q_f32(pOutputSamples + i*8, vzipq_f32(leftf, rightf)); 11249 } 11250 11251 for (i = (frameCount4 << 2); i < frameCount; ++i) { 11252 pOutputSamples[i*2+0] = (drflac_int32)(pInputSamples0U32[i] << shift0) * factor; 11253 pOutputSamples[i*2+1] = (drflac_int32)(pInputSamples1U32[i] << shift1) * factor; 11254 } 11255 } 11256 #endif 11257 11258 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_independent_stereo(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 11259 { 11260 #if defined(DRFLAC_SUPPORT_SSE2) 11261 if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) { 11262 drflac_read_pcm_frames_f32__decode_independent_stereo__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 11263 } else 11264 #elif defined(DRFLAC_SUPPORT_NEON) 11265 if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) { 11266 drflac_read_pcm_frames_f32__decode_independent_stereo__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 11267 } else 11268 #endif 11269 { 11270 /* Scalar fallback. */ 11271 #if 0 11272 drflac_read_pcm_frames_f32__decode_independent_stereo__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 11273 #else 11274 drflac_read_pcm_frames_f32__decode_independent_stereo__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 11275 #endif 11276 } 11277 } 11278 11279 DRFLAC_API drflac_uint64 drflac_read_pcm_frames_f32(drflac* pFlac, drflac_uint64 framesToRead, float* pBufferOut) 11280 { 11281 drflac_uint64 framesRead; 11282 drflac_uint32 unusedBitsPerSample; 11283 11284 if (pFlac == NULL || framesToRead == 0) { 11285 return 0; 11286 } 11287 11288 if (pBufferOut == NULL) { 11289 return drflac__seek_forward_by_pcm_frames(pFlac, framesToRead); 11290 } 11291 11292 DRFLAC_ASSERT(pFlac->bitsPerSample <= 32); 11293 unusedBitsPerSample = 32 - pFlac->bitsPerSample; 11294 11295 framesRead = 0; 11296 while (framesToRead > 0) { 11297 /* If we've run out of samples in this frame, go to the next. */ 11298 if (pFlac->currentFLACFrame.pcmFramesRemaining == 0) { 11299 if (!drflac__read_and_decode_next_flac_frame(pFlac)) { 11300 break; /* Couldn't read the next frame, so just break from the loop and return. */ 11301 } 11302 } else { 11303 unsigned int channelCount = drflac__get_channel_count_from_channel_assignment(pFlac->currentFLACFrame.header.channelAssignment); 11304 drflac_uint64 iFirstPCMFrame = pFlac->currentFLACFrame.header.blockSizeInPCMFrames - pFlac->currentFLACFrame.pcmFramesRemaining; 11305 drflac_uint64 frameCountThisIteration = framesToRead; 11306 11307 if (frameCountThisIteration > pFlac->currentFLACFrame.pcmFramesRemaining) { 11308 frameCountThisIteration = pFlac->currentFLACFrame.pcmFramesRemaining; 11309 } 11310 11311 if (channelCount == 2) { 11312 const drflac_int32* pDecodedSamples0 = pFlac->currentFLACFrame.subframes[0].pSamplesS32 + iFirstPCMFrame; 11313 const drflac_int32* pDecodedSamples1 = pFlac->currentFLACFrame.subframes[1].pSamplesS32 + iFirstPCMFrame; 11314 11315 switch (pFlac->currentFLACFrame.header.channelAssignment) 11316 { 11317 case DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE: 11318 { 11319 drflac_read_pcm_frames_f32__decode_left_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut); 11320 } break; 11321 11322 case DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE: 11323 { 11324 drflac_read_pcm_frames_f32__decode_right_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut); 11325 } break; 11326 11327 case DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE: 11328 { 11329 drflac_read_pcm_frames_f32__decode_mid_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut); 11330 } break; 11331 11332 case DRFLAC_CHANNEL_ASSIGNMENT_INDEPENDENT: 11333 default: 11334 { 11335 drflac_read_pcm_frames_f32__decode_independent_stereo(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut); 11336 } break; 11337 } 11338 } else { 11339 /* Generic interleaving. */ 11340 drflac_uint64 i; 11341 for (i = 0; i < frameCountThisIteration; ++i) { 11342 unsigned int j; 11343 for (j = 0; j < channelCount; ++j) { 11344 drflac_int32 sampleS32 = (drflac_int32)((drflac_uint32)(pFlac->currentFLACFrame.subframes[j].pSamplesS32[iFirstPCMFrame + i]) << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[j].wastedBitsPerSample)); 11345 pBufferOut[(i*channelCount)+j] = (float)(sampleS32 / 2147483648.0); 11346 } 11347 } 11348 } 11349 11350 framesRead += frameCountThisIteration; 11351 pBufferOut += frameCountThisIteration * channelCount; 11352 framesToRead -= frameCountThisIteration; 11353 pFlac->currentPCMFrame += frameCountThisIteration; 11354 pFlac->currentFLACFrame.pcmFramesRemaining -= (unsigned int)frameCountThisIteration; 11355 } 11356 } 11357 11358 return framesRead; 11359 } 11360 11361 11362 DRFLAC_API drflac_bool32 drflac_seek_to_pcm_frame(drflac* pFlac, drflac_uint64 pcmFrameIndex) 11363 { 11364 if (pFlac == NULL) { 11365 return DRFLAC_FALSE; 11366 } 11367 11368 /* Don't do anything if we're already on the seek point. */ 11369 if (pFlac->currentPCMFrame == pcmFrameIndex) { 11370 return DRFLAC_TRUE; 11371 } 11372 11373 /* 11374 If we don't know where the first frame begins then we can't seek. This will happen when the STREAMINFO block was not present 11375 when the decoder was opened. 11376 */ 11377 if (pFlac->firstFLACFramePosInBytes == 0) { 11378 return DRFLAC_FALSE; 11379 } 11380 11381 if (pcmFrameIndex == 0) { 11382 pFlac->currentPCMFrame = 0; 11383 return drflac__seek_to_first_frame(pFlac); 11384 } else { 11385 drflac_bool32 wasSuccessful = DRFLAC_FALSE; 11386 drflac_uint64 originalPCMFrame = pFlac->currentPCMFrame; 11387 11388 /* Clamp the sample to the end. */ 11389 if (pcmFrameIndex > pFlac->totalPCMFrameCount) { 11390 pcmFrameIndex = pFlac->totalPCMFrameCount; 11391 } 11392 11393 /* If the target sample and the current sample are in the same frame we just move the position forward. */ 11394 if (pcmFrameIndex > pFlac->currentPCMFrame) { 11395 /* Forward. */ 11396 drflac_uint32 offset = (drflac_uint32)(pcmFrameIndex - pFlac->currentPCMFrame); 11397 if (pFlac->currentFLACFrame.pcmFramesRemaining > offset) { 11398 pFlac->currentFLACFrame.pcmFramesRemaining -= offset; 11399 pFlac->currentPCMFrame = pcmFrameIndex; 11400 return DRFLAC_TRUE; 11401 } 11402 } else { 11403 /* Backward. */ 11404 drflac_uint32 offsetAbs = (drflac_uint32)(pFlac->currentPCMFrame - pcmFrameIndex); 11405 drflac_uint32 currentFLACFramePCMFrameCount = pFlac->currentFLACFrame.header.blockSizeInPCMFrames; 11406 drflac_uint32 currentFLACFramePCMFramesConsumed = currentFLACFramePCMFrameCount - pFlac->currentFLACFrame.pcmFramesRemaining; 11407 if (currentFLACFramePCMFramesConsumed > offsetAbs) { 11408 pFlac->currentFLACFrame.pcmFramesRemaining += offsetAbs; 11409 pFlac->currentPCMFrame = pcmFrameIndex; 11410 return DRFLAC_TRUE; 11411 } 11412 } 11413 11414 /* 11415 Different techniques depending on encapsulation. Using the native FLAC seektable with Ogg encapsulation is a bit awkward so 11416 we'll instead use Ogg's natural seeking facility. 11417 */ 11418 #ifndef DR_FLAC_NO_OGG 11419 if (pFlac->container == drflac_container_ogg) 11420 { 11421 wasSuccessful = drflac_ogg__seek_to_pcm_frame(pFlac, pcmFrameIndex); 11422 } 11423 else 11424 #endif 11425 { 11426 /* First try seeking via the seek table. If this fails, fall back to a brute force seek which is much slower. */ 11427 if (/*!wasSuccessful && */!pFlac->_noSeekTableSeek) { 11428 wasSuccessful = drflac__seek_to_pcm_frame__seek_table(pFlac, pcmFrameIndex); 11429 } 11430 11431 #if !defined(DR_FLAC_NO_CRC) 11432 /* Fall back to binary search if seek table seeking fails. This requires the length of the stream to be known. */ 11433 if (!wasSuccessful && !pFlac->_noBinarySearchSeek && pFlac->totalPCMFrameCount > 0) { 11434 wasSuccessful = drflac__seek_to_pcm_frame__binary_search(pFlac, pcmFrameIndex); 11435 } 11436 #endif 11437 11438 /* Fall back to brute force if all else fails. */ 11439 if (!wasSuccessful && !pFlac->_noBruteForceSeek) { 11440 wasSuccessful = drflac__seek_to_pcm_frame__brute_force(pFlac, pcmFrameIndex); 11441 } 11442 } 11443 11444 if (wasSuccessful) { 11445 pFlac->currentPCMFrame = pcmFrameIndex; 11446 } else { 11447 /* Seek failed. Try putting the decoder back to it's original state. */ 11448 if (drflac_seek_to_pcm_frame(pFlac, originalPCMFrame) == DRFLAC_FALSE) { 11449 /* Failed to seek back to the original PCM frame. Fall back to 0. */ 11450 drflac_seek_to_pcm_frame(pFlac, 0); 11451 } 11452 } 11453 11454 return wasSuccessful; 11455 } 11456 } 11457 11458 11459 11460 /* High Level APIs */ 11461 11462 #if defined(SIZE_MAX) 11463 #define DRFLAC_SIZE_MAX SIZE_MAX 11464 #else 11465 #if defined(DRFLAC_64BIT) 11466 #define DRFLAC_SIZE_MAX ((drflac_uint64)0xFFFFFFFFFFFFFFFF) 11467 #else 11468 #define DRFLAC_SIZE_MAX 0xFFFFFFFF 11469 #endif 11470 #endif 11471 11472 11473 /* Using a macro as the definition of the drflac__full_decode_and_close_*() API family. Sue me. */ 11474 #define DRFLAC_DEFINE_FULL_READ_AND_CLOSE(extension, type) \ 11475 static type* drflac__full_read_and_close_ ## extension (drflac* pFlac, unsigned int* channelsOut, unsigned int* sampleRateOut, drflac_uint64* totalPCMFrameCountOut)\ 11476 { \ 11477 type* pSampleData = NULL; \ 11478 drflac_uint64 totalPCMFrameCount; \ 11479 \ 11480 DRFLAC_ASSERT(pFlac != NULL); \ 11481 \ 11482 totalPCMFrameCount = pFlac->totalPCMFrameCount; \ 11483 \ 11484 if (totalPCMFrameCount == 0) { \ 11485 type buffer[4096]; \ 11486 drflac_uint64 pcmFramesRead; \ 11487 size_t sampleDataBufferSize = sizeof(buffer); \ 11488 \ 11489 pSampleData = (type*)drflac__malloc_from_callbacks(sampleDataBufferSize, &pFlac->allocationCallbacks); \ 11490 if (pSampleData == NULL) { \ 11491 goto on_error; \ 11492 } \ 11493 \ 11494 while ((pcmFramesRead = (drflac_uint64)drflac_read_pcm_frames_##extension(pFlac, sizeof(buffer)/sizeof(buffer[0])/pFlac->channels, buffer)) > 0) { \ 11495 if (((totalPCMFrameCount + pcmFramesRead) * pFlac->channels * sizeof(type)) > sampleDataBufferSize) { \ 11496 type* pNewSampleData; \ 11497 size_t newSampleDataBufferSize; \ 11498 \ 11499 newSampleDataBufferSize = sampleDataBufferSize * 2; \ 11500 pNewSampleData = (type*)drflac__realloc_from_callbacks(pSampleData, newSampleDataBufferSize, sampleDataBufferSize, &pFlac->allocationCallbacks); \ 11501 if (pNewSampleData == NULL) { \ 11502 drflac__free_from_callbacks(pSampleData, &pFlac->allocationCallbacks); \ 11503 goto on_error; \ 11504 } \ 11505 \ 11506 sampleDataBufferSize = newSampleDataBufferSize; \ 11507 pSampleData = pNewSampleData; \ 11508 } \ 11509 \ 11510 DRFLAC_COPY_MEMORY(pSampleData + (totalPCMFrameCount*pFlac->channels), buffer, (size_t)(pcmFramesRead*pFlac->channels*sizeof(type))); \ 11511 totalPCMFrameCount += pcmFramesRead; \ 11512 } \ 11513 \ 11514 /* At this point everything should be decoded, but we just want to fill the unused part buffer with silence - need to \ 11515 protect those ears from random noise! */ \ 11516 DRFLAC_ZERO_MEMORY(pSampleData + (totalPCMFrameCount*pFlac->channels), (size_t)(sampleDataBufferSize - totalPCMFrameCount*pFlac->channels*sizeof(type))); \ 11517 } else { \ 11518 drflac_uint64 dataSize = totalPCMFrameCount*pFlac->channels*sizeof(type); \ 11519 if (dataSize > (drflac_uint64)DRFLAC_SIZE_MAX) { \ 11520 goto on_error; /* The decoded data is too big. */ \ 11521 } \ 11522 \ 11523 pSampleData = (type*)drflac__malloc_from_callbacks((size_t)dataSize, &pFlac->allocationCallbacks); /* <-- Safe cast as per the check above. */ \ 11524 if (pSampleData == NULL) { \ 11525 goto on_error; \ 11526 } \ 11527 \ 11528 totalPCMFrameCount = drflac_read_pcm_frames_##extension(pFlac, pFlac->totalPCMFrameCount, pSampleData); \ 11529 } \ 11530 \ 11531 if (sampleRateOut) *sampleRateOut = pFlac->sampleRate; \ 11532 if (channelsOut) *channelsOut = pFlac->channels; \ 11533 if (totalPCMFrameCountOut) *totalPCMFrameCountOut = totalPCMFrameCount; \ 11534 \ 11535 drflac_close(pFlac); \ 11536 return pSampleData; \ 11537 \ 11538 on_error: \ 11539 drflac_close(pFlac); \ 11540 return NULL; \ 11541 } 11542 11543 DRFLAC_DEFINE_FULL_READ_AND_CLOSE(s32, drflac_int32) 11544 DRFLAC_DEFINE_FULL_READ_AND_CLOSE(s16, drflac_int16) 11545 DRFLAC_DEFINE_FULL_READ_AND_CLOSE(f32, float) 11546 11547 DRFLAC_API drflac_int32* drflac_open_and_read_pcm_frames_s32(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, unsigned int* channelsOut, unsigned int* sampleRateOut, drflac_uint64* totalPCMFrameCountOut, const drflac_allocation_callbacks* pAllocationCallbacks) 11548 { 11549 drflac* pFlac; 11550 11551 if (channelsOut) { 11552 *channelsOut = 0; 11553 } 11554 if (sampleRateOut) { 11555 *sampleRateOut = 0; 11556 } 11557 if (totalPCMFrameCountOut) { 11558 *totalPCMFrameCountOut = 0; 11559 } 11560 11561 pFlac = drflac_open(onRead, onSeek, pUserData, pAllocationCallbacks); 11562 if (pFlac == NULL) { 11563 return NULL; 11564 } 11565 11566 return drflac__full_read_and_close_s32(pFlac, channelsOut, sampleRateOut, totalPCMFrameCountOut); 11567 } 11568 11569 DRFLAC_API drflac_int16* drflac_open_and_read_pcm_frames_s16(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, unsigned int* channelsOut, unsigned int* sampleRateOut, drflac_uint64* totalPCMFrameCountOut, const drflac_allocation_callbacks* pAllocationCallbacks) 11570 { 11571 drflac* pFlac; 11572 11573 if (channelsOut) { 11574 *channelsOut = 0; 11575 } 11576 if (sampleRateOut) { 11577 *sampleRateOut = 0; 11578 } 11579 if (totalPCMFrameCountOut) { 11580 *totalPCMFrameCountOut = 0; 11581 } 11582 11583 pFlac = drflac_open(onRead, onSeek, pUserData, pAllocationCallbacks); 11584 if (pFlac == NULL) { 11585 return NULL; 11586 } 11587 11588 return drflac__full_read_and_close_s16(pFlac, channelsOut, sampleRateOut, totalPCMFrameCountOut); 11589 } 11590 11591 DRFLAC_API float* drflac_open_and_read_pcm_frames_f32(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, unsigned int* channelsOut, unsigned int* sampleRateOut, drflac_uint64* totalPCMFrameCountOut, const drflac_allocation_callbacks* pAllocationCallbacks) 11592 { 11593 drflac* pFlac; 11594 11595 if (channelsOut) { 11596 *channelsOut = 0; 11597 } 11598 if (sampleRateOut) { 11599 *sampleRateOut = 0; 11600 } 11601 if (totalPCMFrameCountOut) { 11602 *totalPCMFrameCountOut = 0; 11603 } 11604 11605 pFlac = drflac_open(onRead, onSeek, pUserData, pAllocationCallbacks); 11606 if (pFlac == NULL) { 11607 return NULL; 11608 } 11609 11610 return drflac__full_read_and_close_f32(pFlac, channelsOut, sampleRateOut, totalPCMFrameCountOut); 11611 } 11612 11613 #ifndef DR_FLAC_NO_STDIO 11614 DRFLAC_API drflac_int32* drflac_open_file_and_read_pcm_frames_s32(const char* filename, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks) 11615 { 11616 drflac* pFlac; 11617 11618 if (sampleRate) { 11619 *sampleRate = 0; 11620 } 11621 if (channels) { 11622 *channels = 0; 11623 } 11624 if (totalPCMFrameCount) { 11625 *totalPCMFrameCount = 0; 11626 } 11627 11628 pFlac = drflac_open_file(filename, pAllocationCallbacks); 11629 if (pFlac == NULL) { 11630 return NULL; 11631 } 11632 11633 return drflac__full_read_and_close_s32(pFlac, channels, sampleRate, totalPCMFrameCount); 11634 } 11635 11636 DRFLAC_API drflac_int16* drflac_open_file_and_read_pcm_frames_s16(const char* filename, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks) 11637 { 11638 drflac* pFlac; 11639 11640 if (sampleRate) { 11641 *sampleRate = 0; 11642 } 11643 if (channels) { 11644 *channels = 0; 11645 } 11646 if (totalPCMFrameCount) { 11647 *totalPCMFrameCount = 0; 11648 } 11649 11650 pFlac = drflac_open_file(filename, pAllocationCallbacks); 11651 if (pFlac == NULL) { 11652 return NULL; 11653 } 11654 11655 return drflac__full_read_and_close_s16(pFlac, channels, sampleRate, totalPCMFrameCount); 11656 } 11657 11658 DRFLAC_API float* drflac_open_file_and_read_pcm_frames_f32(const char* filename, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks) 11659 { 11660 drflac* pFlac; 11661 11662 if (sampleRate) { 11663 *sampleRate = 0; 11664 } 11665 if (channels) { 11666 *channels = 0; 11667 } 11668 if (totalPCMFrameCount) { 11669 *totalPCMFrameCount = 0; 11670 } 11671 11672 pFlac = drflac_open_file(filename, pAllocationCallbacks); 11673 if (pFlac == NULL) { 11674 return NULL; 11675 } 11676 11677 return drflac__full_read_and_close_f32(pFlac, channels, sampleRate, totalPCMFrameCount); 11678 } 11679 #endif 11680 11681 DRFLAC_API drflac_int32* drflac_open_memory_and_read_pcm_frames_s32(const void* data, size_t dataSize, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks) 11682 { 11683 drflac* pFlac; 11684 11685 if (sampleRate) { 11686 *sampleRate = 0; 11687 } 11688 if (channels) { 11689 *channels = 0; 11690 } 11691 if (totalPCMFrameCount) { 11692 *totalPCMFrameCount = 0; 11693 } 11694 11695 pFlac = drflac_open_memory(data, dataSize, pAllocationCallbacks); 11696 if (pFlac == NULL) { 11697 return NULL; 11698 } 11699 11700 return drflac__full_read_and_close_s32(pFlac, channels, sampleRate, totalPCMFrameCount); 11701 } 11702 11703 DRFLAC_API drflac_int16* drflac_open_memory_and_read_pcm_frames_s16(const void* data, size_t dataSize, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks) 11704 { 11705 drflac* pFlac; 11706 11707 if (sampleRate) { 11708 *sampleRate = 0; 11709 } 11710 if (channels) { 11711 *channels = 0; 11712 } 11713 if (totalPCMFrameCount) { 11714 *totalPCMFrameCount = 0; 11715 } 11716 11717 pFlac = drflac_open_memory(data, dataSize, pAllocationCallbacks); 11718 if (pFlac == NULL) { 11719 return NULL; 11720 } 11721 11722 return drflac__full_read_and_close_s16(pFlac, channels, sampleRate, totalPCMFrameCount); 11723 } 11724 11725 DRFLAC_API float* drflac_open_memory_and_read_pcm_frames_f32(const void* data, size_t dataSize, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks) 11726 { 11727 drflac* pFlac; 11728 11729 if (sampleRate) { 11730 *sampleRate = 0; 11731 } 11732 if (channels) { 11733 *channels = 0; 11734 } 11735 if (totalPCMFrameCount) { 11736 *totalPCMFrameCount = 0; 11737 } 11738 11739 pFlac = drflac_open_memory(data, dataSize, pAllocationCallbacks); 11740 if (pFlac == NULL) { 11741 return NULL; 11742 } 11743 11744 return drflac__full_read_and_close_f32(pFlac, channels, sampleRate, totalPCMFrameCount); 11745 } 11746 11747 11748 DRFLAC_API void drflac_free(void* p, const drflac_allocation_callbacks* pAllocationCallbacks) 11749 { 11750 if (pAllocationCallbacks != NULL) { 11751 drflac__free_from_callbacks(p, pAllocationCallbacks); 11752 } else { 11753 drflac__free_default(p, NULL); 11754 } 11755 } 11756 11757 11758 11759 11760 DRFLAC_API void drflac_init_vorbis_comment_iterator(drflac_vorbis_comment_iterator* pIter, drflac_uint32 commentCount, const void* pComments) 11761 { 11762 if (pIter == NULL) { 11763 return; 11764 } 11765 11766 pIter->countRemaining = commentCount; 11767 pIter->pRunningData = (const char*)pComments; 11768 } 11769 11770 DRFLAC_API const char* drflac_next_vorbis_comment(drflac_vorbis_comment_iterator* pIter, drflac_uint32* pCommentLengthOut) 11771 { 11772 drflac_int32 length; 11773 const char* pComment; 11774 11775 /* Safety. */ 11776 if (pCommentLengthOut) { 11777 *pCommentLengthOut = 0; 11778 } 11779 11780 if (pIter == NULL || pIter->countRemaining == 0 || pIter->pRunningData == NULL) { 11781 return NULL; 11782 } 11783 11784 length = drflac__le2host_32(*(const drflac_uint32*)pIter->pRunningData); 11785 pIter->pRunningData += 4; 11786 11787 pComment = pIter->pRunningData; 11788 pIter->pRunningData += length; 11789 pIter->countRemaining -= 1; 11790 11791 if (pCommentLengthOut) { 11792 *pCommentLengthOut = length; 11793 } 11794 11795 return pComment; 11796 } 11797 11798 11799 11800 11801 DRFLAC_API void drflac_init_cuesheet_track_iterator(drflac_cuesheet_track_iterator* pIter, drflac_uint32 trackCount, const void* pTrackData) 11802 { 11803 if (pIter == NULL) { 11804 return; 11805 } 11806 11807 pIter->countRemaining = trackCount; 11808 pIter->pRunningData = (const char*)pTrackData; 11809 } 11810 11811 DRFLAC_API drflac_bool32 drflac_next_cuesheet_track(drflac_cuesheet_track_iterator* pIter, drflac_cuesheet_track* pCuesheetTrack) 11812 { 11813 drflac_cuesheet_track cuesheetTrack; 11814 const char* pRunningData; 11815 drflac_uint64 offsetHi; 11816 drflac_uint64 offsetLo; 11817 11818 if (pIter == NULL || pIter->countRemaining == 0 || pIter->pRunningData == NULL) { 11819 return DRFLAC_FALSE; 11820 } 11821 11822 pRunningData = pIter->pRunningData; 11823 11824 offsetHi = drflac__be2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 11825 offsetLo = drflac__be2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 11826 cuesheetTrack.offset = offsetLo | (offsetHi << 32); 11827 cuesheetTrack.trackNumber = pRunningData[0]; pRunningData += 1; 11828 DRFLAC_COPY_MEMORY(cuesheetTrack.ISRC, pRunningData, sizeof(cuesheetTrack.ISRC)); pRunningData += 12; 11829 cuesheetTrack.isAudio = (pRunningData[0] & 0x80) != 0; 11830 cuesheetTrack.preEmphasis = (pRunningData[0] & 0x40) != 0; pRunningData += 14; 11831 cuesheetTrack.indexCount = pRunningData[0]; pRunningData += 1; 11832 cuesheetTrack.pIndexPoints = (const drflac_cuesheet_track_index*)pRunningData; pRunningData += cuesheetTrack.indexCount * sizeof(drflac_cuesheet_track_index); 11833 11834 pIter->pRunningData = pRunningData; 11835 pIter->countRemaining -= 1; 11836 11837 if (pCuesheetTrack) { 11838 *pCuesheetTrack = cuesheetTrack; 11839 } 11840 11841 return DRFLAC_TRUE; 11842 } 11843 11844 #if defined(__clang__) || (defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6))) 11845 #pragma GCC diagnostic pop 11846 #endif 11847 #endif /* dr_flac_c */ 11848 #endif /* DR_FLAC_IMPLEMENTATION */ 11849 11850 11851 /* 11852 REVISION HISTORY 11853 ================ 11854 v0.12.31 - 2021-08-16 11855 - Silence some warnings. 11856 11857 v0.12.30 - 2021-07-31 11858 - Fix platform detection for ARM64. 11859 11860 v0.12.29 - 2021-04-02 11861 - Fix a bug where the running PCM frame index is set to an invalid value when over-seeking. 11862 - Fix a decoding error due to an incorrect validation check. 11863 11864 v0.12.28 - 2021-02-21 11865 - Fix a warning due to referencing _MSC_VER when it is undefined. 11866 11867 v0.12.27 - 2021-01-31 11868 - Fix a static analysis warning. 11869 11870 v0.12.26 - 2021-01-17 11871 - Fix a compilation warning due to _BSD_SOURCE being deprecated. 11872 11873 v0.12.25 - 2020-12-26 11874 - Update documentation. 11875 11876 v0.12.24 - 2020-11-29 11877 - Fix ARM64/NEON detection when compiling with MSVC. 11878 11879 v0.12.23 - 2020-11-21 11880 - Fix compilation with OpenWatcom. 11881 11882 v0.12.22 - 2020-11-01 11883 - Fix an error with the previous release. 11884 11885 v0.12.21 - 2020-11-01 11886 - Fix a possible deadlock when seeking. 11887 - Improve compiler support for older versions of GCC. 11888 11889 v0.12.20 - 2020-09-08 11890 - Fix a compilation error on older compilers. 11891 11892 v0.12.19 - 2020-08-30 11893 - Fix a bug due to an undefined 32-bit shift. 11894 11895 v0.12.18 - 2020-08-14 11896 - Fix a crash when compiling with clang-cl. 11897 11898 v0.12.17 - 2020-08-02 11899 - Simplify sized types. 11900 11901 v0.12.16 - 2020-07-25 11902 - Fix a compilation warning. 11903 11904 v0.12.15 - 2020-07-06 11905 - Check for negative LPC shifts and return an error. 11906 11907 v0.12.14 - 2020-06-23 11908 - Add include guard for the implementation section. 11909 11910 v0.12.13 - 2020-05-16 11911 - Add compile-time and run-time version querying. 11912 - DRFLAC_VERSION_MINOR 11913 - DRFLAC_VERSION_MAJOR 11914 - DRFLAC_VERSION_REVISION 11915 - DRFLAC_VERSION_STRING 11916 - drflac_version() 11917 - drflac_version_string() 11918 11919 v0.12.12 - 2020-04-30 11920 - Fix compilation errors with VC6. 11921 11922 v0.12.11 - 2020-04-19 11923 - Fix some pedantic warnings. 11924 - Fix some undefined behaviour warnings. 11925 11926 v0.12.10 - 2020-04-10 11927 - Fix some bugs when trying to seek with an invalid seek table. 11928 11929 v0.12.9 - 2020-04-05 11930 - Fix warnings. 11931 11932 v0.12.8 - 2020-04-04 11933 - Add drflac_open_file_w() and drflac_open_file_with_metadata_w(). 11934 - Fix some static analysis warnings. 11935 - Minor documentation updates. 11936 11937 v0.12.7 - 2020-03-14 11938 - Fix compilation errors with VC6. 11939 11940 v0.12.6 - 2020-03-07 11941 - Fix compilation error with Visual Studio .NET 2003. 11942 11943 v0.12.5 - 2020-01-30 11944 - Silence some static analysis warnings. 11945 11946 v0.12.4 - 2020-01-29 11947 - Silence some static analysis warnings. 11948 11949 v0.12.3 - 2019-12-02 11950 - Fix some warnings when compiling with GCC and the -Og flag. 11951 - Fix a crash in out-of-memory situations. 11952 - Fix potential integer overflow bug. 11953 - Fix some static analysis warnings. 11954 - Fix a possible crash when using custom memory allocators without a custom realloc() implementation. 11955 - Fix a bug with binary search seeking where the bits per sample is not a multiple of 8. 11956 11957 v0.12.2 - 2019-10-07 11958 - Internal code clean up. 11959 11960 v0.12.1 - 2019-09-29 11961 - Fix some Clang Static Analyzer warnings. 11962 - Fix an unused variable warning. 11963 11964 v0.12.0 - 2019-09-23 11965 - API CHANGE: Add support for user defined memory allocation routines. This system allows the program to specify their own memory allocation 11966 routines with a user data pointer for client-specific contextual data. This adds an extra parameter to the end of the following APIs: 11967 - drflac_open() 11968 - drflac_open_relaxed() 11969 - drflac_open_with_metadata() 11970 - drflac_open_with_metadata_relaxed() 11971 - drflac_open_file() 11972 - drflac_open_file_with_metadata() 11973 - drflac_open_memory() 11974 - drflac_open_memory_with_metadata() 11975 - drflac_open_and_read_pcm_frames_s32() 11976 - drflac_open_and_read_pcm_frames_s16() 11977 - drflac_open_and_read_pcm_frames_f32() 11978 - drflac_open_file_and_read_pcm_frames_s32() 11979 - drflac_open_file_and_read_pcm_frames_s16() 11980 - drflac_open_file_and_read_pcm_frames_f32() 11981 - drflac_open_memory_and_read_pcm_frames_s32() 11982 - drflac_open_memory_and_read_pcm_frames_s16() 11983 - drflac_open_memory_and_read_pcm_frames_f32() 11984 Set this extra parameter to NULL to use defaults which is the same as the previous behaviour. Setting this NULL will use 11985 DRFLAC_MALLOC, DRFLAC_REALLOC and DRFLAC_FREE. 11986 - Remove deprecated APIs: 11987 - drflac_read_s32() 11988 - drflac_read_s16() 11989 - drflac_read_f32() 11990 - drflac_seek_to_sample() 11991 - drflac_open_and_decode_s32() 11992 - drflac_open_and_decode_s16() 11993 - drflac_open_and_decode_f32() 11994 - drflac_open_and_decode_file_s32() 11995 - drflac_open_and_decode_file_s16() 11996 - drflac_open_and_decode_file_f32() 11997 - drflac_open_and_decode_memory_s32() 11998 - drflac_open_and_decode_memory_s16() 11999 - drflac_open_and_decode_memory_f32() 12000 - Remove drflac.totalSampleCount which is now replaced with drflac.totalPCMFrameCount. You can emulate drflac.totalSampleCount 12001 by doing pFlac->totalPCMFrameCount*pFlac->channels. 12002 - Rename drflac.currentFrame to drflac.currentFLACFrame to remove ambiguity with PCM frames. 12003 - Fix errors when seeking to the end of a stream. 12004 - Optimizations to seeking. 12005 - SSE improvements and optimizations. 12006 - ARM NEON optimizations. 12007 - Optimizations to drflac_read_pcm_frames_s16(). 12008 - Optimizations to drflac_read_pcm_frames_s32(). 12009 12010 v0.11.10 - 2019-06-26 12011 - Fix a compiler error. 12012 12013 v0.11.9 - 2019-06-16 12014 - Silence some ThreadSanitizer warnings. 12015 12016 v0.11.8 - 2019-05-21 12017 - Fix warnings. 12018 12019 v0.11.7 - 2019-05-06 12020 - C89 fixes. 12021 12022 v0.11.6 - 2019-05-05 12023 - Add support for C89. 12024 - Fix a compiler warning when CRC is disabled. 12025 - Change license to choice of public domain or MIT-0. 12026 12027 v0.11.5 - 2019-04-19 12028 - Fix a compiler error with GCC. 12029 12030 v0.11.4 - 2019-04-17 12031 - Fix some warnings with GCC when compiling with -std=c99. 12032 12033 v0.11.3 - 2019-04-07 12034 - Silence warnings with GCC. 12035 12036 v0.11.2 - 2019-03-10 12037 - Fix a warning. 12038 12039 v0.11.1 - 2019-02-17 12040 - Fix a potential bug with seeking. 12041 12042 v0.11.0 - 2018-12-16 12043 - API CHANGE: Deprecated drflac_read_s32(), drflac_read_s16() and drflac_read_f32() and replaced them with 12044 drflac_read_pcm_frames_s32(), drflac_read_pcm_frames_s16() and drflac_read_pcm_frames_f32(). The new APIs take 12045 and return PCM frame counts instead of sample counts. To upgrade you will need to change the input count by 12046 dividing it by the channel count, and then do the same with the return value. 12047 - API_CHANGE: Deprecated drflac_seek_to_sample() and replaced with drflac_seek_to_pcm_frame(). Same rules as 12048 the changes to drflac_read_*() apply. 12049 - API CHANGE: Deprecated drflac_open_and_decode_*() and replaced with drflac_open_*_and_read_*(). Same rules as 12050 the changes to drflac_read_*() apply. 12051 - Optimizations. 12052 12053 v0.10.0 - 2018-09-11 12054 - Remove the DR_FLAC_NO_WIN32_IO option and the Win32 file IO functionality. If you need to use Win32 file IO you 12055 need to do it yourself via the callback API. 12056 - Fix the clang build. 12057 - Fix undefined behavior. 12058 - Fix errors with CUESHEET metdata blocks. 12059 - Add an API for iterating over each cuesheet track in the CUESHEET metadata block. This works the same way as the 12060 Vorbis comment API. 12061 - Other miscellaneous bug fixes, mostly relating to invalid FLAC streams. 12062 - Minor optimizations. 12063 12064 v0.9.11 - 2018-08-29 12065 - Fix a bug with sample reconstruction. 12066 12067 v0.9.10 - 2018-08-07 12068 - Improve 64-bit detection. 12069 12070 v0.9.9 - 2018-08-05 12071 - Fix C++ build on older versions of GCC. 12072 12073 v0.9.8 - 2018-07-24 12074 - Fix compilation errors. 12075 12076 v0.9.7 - 2018-07-05 12077 - Fix a warning. 12078 12079 v0.9.6 - 2018-06-29 12080 - Fix some typos. 12081 12082 v0.9.5 - 2018-06-23 12083 - Fix some warnings. 12084 12085 v0.9.4 - 2018-06-14 12086 - Optimizations to seeking. 12087 - Clean up. 12088 12089 v0.9.3 - 2018-05-22 12090 - Bug fix. 12091 12092 v0.9.2 - 2018-05-12 12093 - Fix a compilation error due to a missing break statement. 12094 12095 v0.9.1 - 2018-04-29 12096 - Fix compilation error with Clang. 12097 12098 v0.9 - 2018-04-24 12099 - Fix Clang build. 12100 - Start using major.minor.revision versioning. 12101 12102 v0.8g - 2018-04-19 12103 - Fix build on non-x86/x64 architectures. 12104 12105 v0.8f - 2018-02-02 12106 - Stop pretending to support changing rate/channels mid stream. 12107 12108 v0.8e - 2018-02-01 12109 - Fix a crash when the block size of a frame is larger than the maximum block size defined by the FLAC stream. 12110 - Fix a crash the the Rice partition order is invalid. 12111 12112 v0.8d - 2017-09-22 12113 - Add support for decoding streams with ID3 tags. ID3 tags are just skipped. 12114 12115 v0.8c - 2017-09-07 12116 - Fix warning on non-x86/x64 architectures. 12117 12118 v0.8b - 2017-08-19 12119 - Fix build on non-x86/x64 architectures. 12120 12121 v0.8a - 2017-08-13 12122 - A small optimization for the Clang build. 12123 12124 v0.8 - 2017-08-12 12125 - API CHANGE: Rename dr_* types to drflac_*. 12126 - Optimizations. This brings dr_flac back to about the same class of efficiency as the reference implementation. 12127 - Add support for custom implementations of malloc(), realloc(), etc. 12128 - Add CRC checking to Ogg encapsulated streams. 12129 - Fix VC++ 6 build. This is only for the C++ compiler. The C compiler is not currently supported. 12130 - Bug fixes. 12131 12132 v0.7 - 2017-07-23 12133 - Add support for opening a stream without a header block. To do this, use drflac_open_relaxed() / drflac_open_with_metadata_relaxed(). 12134 12135 v0.6 - 2017-07-22 12136 - Add support for recovering from invalid frames. With this change, dr_flac will simply skip over invalid frames as if they 12137 never existed. Frames are checked against their sync code, the CRC-8 of the frame header and the CRC-16 of the whole frame. 12138 12139 v0.5 - 2017-07-16 12140 - Fix typos. 12141 - Change drflac_bool* types to unsigned. 12142 - Add CRC checking. This makes dr_flac slower, but can be disabled with #define DR_FLAC_NO_CRC. 12143 12144 v0.4f - 2017-03-10 12145 - Fix a couple of bugs with the bitstreaming code. 12146 12147 v0.4e - 2017-02-17 12148 - Fix some warnings. 12149 12150 v0.4d - 2016-12-26 12151 - Add support for 32-bit floating-point PCM decoding. 12152 - Use drflac_int* and drflac_uint* sized types to improve compiler support. 12153 - Minor improvements to documentation. 12154 12155 v0.4c - 2016-12-26 12156 - Add support for signed 16-bit integer PCM decoding. 12157 12158 v0.4b - 2016-10-23 12159 - A minor change to drflac_bool8 and drflac_bool32 types. 12160 12161 v0.4a - 2016-10-11 12162 - Rename drBool32 to drflac_bool32 for styling consistency. 12163 12164 v0.4 - 2016-09-29 12165 - API/ABI CHANGE: Use fixed size 32-bit booleans instead of the built-in bool type. 12166 - API CHANGE: Rename drflac_open_and_decode*() to drflac_open_and_decode*_s32(). 12167 - API CHANGE: Swap the order of "channels" and "sampleRate" parameters in drflac_open_and_decode*(). Rationale for this is to 12168 keep it consistent with drflac_audio. 12169 12170 v0.3f - 2016-09-21 12171 - Fix a warning with GCC. 12172 12173 v0.3e - 2016-09-18 12174 - Fixed a bug where GCC 4.3+ was not getting properly identified. 12175 - Fixed a few typos. 12176 - Changed date formats to ISO 8601 (YYYY-MM-DD). 12177 12178 v0.3d - 2016-06-11 12179 - Minor clean up. 12180 12181 v0.3c - 2016-05-28 12182 - Fixed compilation error. 12183 12184 v0.3b - 2016-05-16 12185 - Fixed Linux/GCC build. 12186 - Updated documentation. 12187 12188 v0.3a - 2016-05-15 12189 - Minor fixes to documentation. 12190 12191 v0.3 - 2016-05-11 12192 - Optimizations. Now at about parity with the reference implementation on 32-bit builds. 12193 - Lots of clean up. 12194 12195 v0.2b - 2016-05-10 12196 - Bug fixes. 12197 12198 v0.2a - 2016-05-10 12199 - Made drflac_open_and_decode() more robust. 12200 - Removed an unused debugging variable 12201 12202 v0.2 - 2016-05-09 12203 - Added support for Ogg encapsulation. 12204 - API CHANGE. Have the onSeek callback take a third argument which specifies whether or not the seek 12205 should be relative to the start or the current position. Also changes the seeking rules such that 12206 seeking offsets will never be negative. 12207 - Have drflac_open_and_decode() fail gracefully if the stream has an unknown total sample count. 12208 12209 v0.1b - 2016-05-07 12210 - Properly close the file handle in drflac_open_file() and family when the decoder fails to initialize. 12211 - Removed a stale comment. 12212 12213 v0.1a - 2016-05-05 12214 - Minor formatting changes. 12215 - Fixed a warning on the GCC build. 12216 12217 v0.1 - 2016-05-03 12218 - Initial versioned release. 12219 */ 12220 12221 /* 12222 This software is available as a choice of the following licenses. Choose 12223 whichever you prefer. 12224 12225 =============================================================================== 12226 ALTERNATIVE 1 - Public Domain (www.unlicense.org) 12227 =============================================================================== 12228 This is free and unencumbered software released into the public domain. 12229 12230 Anyone is free to copy, modify, publish, use, compile, sell, or distribute this 12231 software, either in source code form or as a compiled binary, for any purpose, 12232 commercial or non-commercial, and by any means. 12233 12234 In jurisdictions that recognize copyright laws, the author or authors of this 12235 software dedicate any and all copyright interest in the software to the public 12236 domain. We make this dedication for the benefit of the public at large and to 12237 the detriment of our heirs and successors. We intend this dedication to be an 12238 overt act of relinquishment in perpetuity of all present and future rights to 12239 this software under copyright law. 12240 12241 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 12242 IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 12243 FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 12244 AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN 12245 ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION 12246 WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 12247 12248 For more information, please refer to <http://unlicense.org/> 12249 12250 =============================================================================== 12251 ALTERNATIVE 2 - MIT No Attribution 12252 =============================================================================== 12253 Copyright 2020 David Reid 12254 12255 Permission is hereby granted, free of charge, to any person obtaining a copy of 12256 this software and associated documentation files (the "Software"), to deal in 12257 the Software without restriction, including without limitation the rights to 12258 use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 12259 of the Software, and to permit persons to whom the Software is furnished to do 12260 so. 12261 12262 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 12263 IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 12264 FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 12265 AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 12266 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 12267 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 12268 SOFTWARE. 12269 */