---
layout: post
title: CHECKSUM
permalink: /docs/checksum
redirect_from:
 - /checksum.md/
 - /docs/checksum.md/
---

## Supported Checksums and Brief Theory of Operations

1. `xxhash` is the system-default checksum.

2. `xxhash` can be overridden on a bucket level; the following [CLI](/docs/cli.md) example configures bucket `abc` with `sha256` and bucket `xyz` without any checksum protection whatsoever:

```console
$ ais bucket props ais://abc checksum.type <TAB-TAB>
crc32c  md5  none  sha256  sha512  xxhash

$ ais bucket props ais://abc checksum.type sha256
Bucket props successfully updated
"checksum.type" set to:"sha256" (was:"xxhash")

$ ais bucket props ais://xyz checksum.type none
Bucket props successfully updated
"checksum.type" set to:"none" (was:"xxhash")
```

> AIS-own metadata, both cluster-level and object metadata, is currently always protected with `xxhash`.

3. Unless checksumming is disabled (that is, `checksum.type` is set to `none`, as in the example above), every object stored in a bucket is protected with the checksum configured for that bucket.

4. Bucket (re)configuration can be done at any time. For instance, a bucket's checksum type can be changed from `xxhash` to `sha512`, later to `crc32c`, and then back to `xxhash` - any number of times, with no limitations.

5. An object with a bad checksum cannot be read from the bucket and cannot be replicated or migrated. Corrupted objects are eventually removed from the system.

6. GET and PUT operations support an option to validate checksums; validation is done against the checksum stored with the object (GET), or against a checksum provided by the user (PUT).

7. Checksum configuration supports a number of options that can be changed both globally (for the entire cluster) and on a bucket level - notice the defaults below:

```json
"checksum": {
        "type":                 "xxhash",
        "validate_cold_get":    true,   # validate cold GET from Cloud buckets
        "validate_warm_get":    false,  # validate warm GET
        "validate_obj_move":    false,  # validate object migration
        "enable_read_range":    false   # enable checksumming for ranges
},
```

8. In more detail:

* `checksum.type` (`string`): the checksum type; one of the types listed in the CLI example above, including `xxhash` (the current default);
* `checksum.validate_cold_get` (`bool`): indicates whether to perform checksum validation when cold GET-ing objects from Cloud buckets;
* `checksum.validate_warm_get` (`bool`): indicates whether to perform checksum validation when reading objects stored in the AIS cluster;
* `checksum.enable_read_range` (`bool`): indicates whether to generate checksums when executing GET(object, range), where `range` is the offset and length (in bytes) to read;
* `checksum.validate_obj_move` (`bool`): indicates whether to perform checksum validation upon object migration.

9. Object replication is always checksum-protected. If an object does not have a checksum (see #3 above), the checksum gets computed on the fly and stored with the object, so that subsequent replications/migrations can reuse it.

10. Finally, when two objects in the cluster have identical (bucket, object) names and identical checksums, they are considered to be full replicas of each other - a fact that allows optimizing PUT, replication, and object migration in a variety of use cases.
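
The validation flow described in items 5, 6, and 10 can be illustrated with a small, self-contained sketch. This is not AIS code and does not use any AIS API: `ToyBucket`, `ChecksumError`, `compute`, and `is_replica` are hypothetical names invented for this illustration, and the sketch assumes Python's standard `hashlib`, which covers only the `md5`, `sha256`, and `sha512` types from the list above (`xxhash` and `crc32c` would need third-party packages). Storing the checksum type next to each digest is likewise an assumption of the toy model, made so that changing the bucket's checksum type does not invalidate older objects.

```python
import hashlib
from typing import Optional, Tuple


class ChecksumError(Exception):
    """Raised when a computed checksum differs from the expected one."""


def compute(data: bytes, ctype: str) -> Optional[str]:
    """Return the hex digest of data, or None when checksumming is disabled."""
    if ctype == "none":
        return None
    return hashlib.new(ctype, data).hexdigest()


class ToyBucket:
    """Hypothetical in-memory bucket; illustrates validation only."""

    def __init__(self, ctype: str = "sha256", validate_warm_get: bool = False):
        self.ctype = ctype
        self.validate_warm_get = validate_warm_get
        # name -> (data, checksum type used at PUT time, stored digest or None)
        self._objects = {}

    def put(self, name: str, data: bytes, expected: Optional[str] = None) -> None:
        """Store data; if the caller supplies a checksum, validate against it."""
        actual = compute(data, self.ctype)
        if expected is not None and actual is not None and actual != expected:
            raise ChecksumError(f"PUT {name}: expected {expected}, got {actual}")
        self._objects[name] = (data, self.ctype, actual)

    def get(self, name: str) -> bytes:
        """Return data; optionally re-validate against the stored checksum."""
        data, ctype, stored = self._objects[name]
        if self.validate_warm_get and stored is not None:
            if compute(data, ctype) != stored:
                raise ChecksumError(f"GET {name}: stored checksum mismatch")
        return data

    def checksum(self, name: str) -> Tuple[str, Optional[str]]:
        """Return the (type, digest) pair stored with the object."""
        _, ctype, stored = self._objects[name]
        return ctype, stored


def is_replica(a: ToyBucket, b: ToyBucket, name: str) -> bool:
    """Same object name and same non-empty checksum: treat as full replicas."""
    ca, cb = a.checksum(name), b.checksum(name)
    return ca[1] is not None and ca == cb
```

In this sketch, a PUT with a mismatching user-provided digest is rejected before anything is stored, and a GET with `validate_warm_get=True` recomputes the digest and refuses to return data that no longer matches, mirroring the rule that an object with a bad checksum cannot be read. The `is_replica` helper shows why identical names plus identical checksums are enough to skip a redundant copy.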