---
layout: post
title: CHECKSUM
permalink: /docs/checksum
redirect_from:
 - /checksum.md/
 - /docs/checksum.md/
---

## Supported Checksums and Brief Theory of Operations

1. `xxhash` is the system-default checksum.

2. `xxhash` can be overridden at the bucket level. The following [CLI](/docs/cli.md) example configures bucket `abc` with `sha256` and bucket `xyz` with no checksum protection at all:

	```console
	$ ais bucket props ais://abc checksum.type  <TAB-TAB>
	crc32c   md5      none     sha256   sha512   xxhash

	$ ais bucket props ais://abc checksum.type sha256
	Bucket props successfully updated
	"checksum.type" set to:"sha256" (was:"xxhash")

	$ ais bucket props ais://xyz checksum.type none
	Bucket props successfully updated
	"checksum.type" set to:"none" (was:"xxhash")
	```

	> AIS's own metadata, both cluster-level and object metadata, is currently always protected with `xxhash`.

3. Unless checksumming is disabled, objects stored in a bucket are protected with that bucket's checksum; a user can override the system default at the bucket level, including by setting the checksum type to `none` (see the example above).

4. Bucket (re)configuration can be done at any time. For instance, a bucket's checksum type can be changed from `xxhash` to `sha512`, later to `crc32c`, and then back to `xxhash`, any number of times.

5. An object with a bad checksum cannot be read from the bucket and cannot be replicated or migrated. Corrupted objects are eventually removed from the system.

6. GET and PUT operations support an option to validate checksums; validation is done against the checksum stored with the object (GET) or the checksum provided by the user (PUT).
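
	The validation step described above can be sketched in Go as follows. This is a minimal illustration, not the AIS implementation: `newHash`, `checksumHex`, and `validate` are hypothetical helpers, and `xxhash` is left out because it is not part of the Go standard library.

	```go
	package main

	import (
		"crypto/md5"
		"crypto/sha256"
		"crypto/sha512"
		"encoding/hex"
		"fmt"
		"hash"
		"hash/crc32"
	)

	// newHash maps a checksum type name to a standard-library hasher.
	// Hypothetical helper; xxhash is omitted (not in the standard library).
	func newHash(ty string) (hash.Hash, error) {
		switch ty {
		case "crc32c":
			return crc32.New(crc32.MakeTable(crc32.Castagnoli)), nil
		case "md5":
			return md5.New(), nil
		case "sha256":
			return sha256.New(), nil
		case "sha512":
			return sha512.New(), nil
		}
		return nil, fmt.Errorf("unsupported checksum type %q", ty)
	}

	// checksumHex computes the hex-encoded checksum of data.
	func checksumHex(ty string, data []byte) (string, error) {
		h, err := newHash(ty)
		if err != nil {
			return "", err
		}
		h.Write(data) // hash.Hash.Write never returns an error
		return hex.EncodeToString(h.Sum(nil)), nil
	}

	// validate compares the checksum of data with the expected value:
	// the stored checksum on GET, or the user-provided one on PUT.
	// Type "none" means there is nothing to validate.
	func validate(ty, expected string, data []byte) error {
		if ty == "none" {
			return nil
		}
		got, err := checksumHex(ty, data)
		if err != nil {
			return err
		}
		if got != expected {
			return fmt.Errorf("bad checksum: expected %s, got %s", expected, got)
		}
		return nil
	}

	func main() {
		data := []byte("abc")
		stored, _ := checksumHex("sha256", data)
		fmt.Println("intact:", validate("sha256", stored, data))
		fmt.Println("corrupted:", validate("sha256", stored, []byte("abd")))
	}
	```

	In this sketch a mismatch is reported as an error, which mirrors the rule that an object with a bad checksum cannot be read.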

7. Checksum configuration supports a number of options that can be changed both globally (for the entire cluster) and at the bucket level. Note the defaults below:

	```json
		"checksum": {
			"type":			"xxhash",
			"validate_cold_get":	true,      # validate cold GET from Cloud buckets
			"validate_warm_get":	false,     # validate warm GET
			"validate_obj_move":	false,     # validate object migration
			"enable_read_range":	false      # enable checksumming for ranges
		},
	```

8. In more detail:

	* `checksum.type` (`string`): selects the checksum type; a number of checksums are supported, including `xxhash` (the current default);
	* `checksum.validate_cold_get` (`bool`): indicates whether to perform checksum validation when cold GET-ing objects from Cloud buckets;
	* `checksum.validate_warm_get` (`bool`): indicates whether to perform checksum validation when reading objects stored in the AIS cluster;
	* `checksum.enable_read_range` (`bool`): indicates whether to generate checksums when executing GET(object, range), where `range` is the offset and length (in bytes) to read;
	* `checksum.validate_obj_move` (`bool`): indicates whether to perform checksum validation upon object migration.
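
	As an illustration of how these options fit together, the sketch below decodes the default values into a Go struct and derives a per-request decision from them. `ChecksumConf`, `parseChecksumConf`, and `shouldValidateGet` are hypothetical names chosen for this example; only the JSON keys and default values come from the configuration above. The inline `#` annotations are dropped because `encoding/json` rejects them.

	```go
	package main

	import (
		"encoding/json"
		"fmt"
	)

	// ChecksumConf is a hypothetical struct mirroring the "checksum" section.
	type ChecksumConf struct {
		Type            string `json:"type"`
		ValidateColdGet bool   `json:"validate_cold_get"`
		ValidateWarmGet bool   `json:"validate_warm_get"`
		ValidateObjMove bool   `json:"validate_obj_move"`
		EnableReadRange bool   `json:"enable_read_range"`
	}

	// defaultsJSON restates the documented defaults as strict JSON.
	const defaultsJSON = `{
		"checksum": {
			"type": "xxhash",
			"validate_cold_get": true,
			"validate_warm_get": false,
			"validate_obj_move": false,
			"enable_read_range": false
		}
	}`

	// parseChecksumConf extracts the "checksum" section from a JSON document.
	func parseChecksumConf(raw string) (ChecksumConf, error) {
		var doc struct {
			Checksum ChecksumConf `json:"checksum"`
		}
		err := json.Unmarshal([]byte(raw), &doc)
		return doc.Checksum, err
	}

	// shouldValidateGet reports whether a GET should validate the checksum.
	// cold is true when the object is fetched from a Cloud bucket.
	func (c ChecksumConf) shouldValidateGet(cold bool) bool {
		if c.Type == "none" {
			return false
		}
		if cold {
			return c.ValidateColdGet
		}
		return c.ValidateWarmGet
	}

	func main() {
		conf, err := parseChecksumConf(defaultsJSON)
		if err != nil {
			panic(err)
		}
		fmt.Println("validate cold GET:", conf.shouldValidateGet(true))
		fmt.Println("validate warm GET:", conf.shouldValidateGet(false))
	}
	```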

9. Object replication is always checksum-protected. If an object does not have a checksum (see #3 above), one is computed on the fly and stored with the object, so that subsequent replications and migrations can reuse it.
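
	A minimal sketch of this compute-once behavior, assuming a hypothetical in-memory `object` type and using `sha256` from the standard library as a stand-in for the bucket's configured checksum:

	```go
	package main

	import (
		"crypto/sha256"
		"encoding/hex"
		"fmt"
	)

	// object is a hypothetical stand-in for a stored object and its metadata.
	type object struct {
		data     []byte
		checksum string // empty when the object was stored without a checksum
		computed int    // number of times the checksum had to be computed
	}

	// ensureChecksum returns the stored checksum, computing and storing it
	// first when the object does not have one yet.
	func (o *object) ensureChecksum() string {
		if o.checksum == "" {
			sum := sha256.Sum256(o.data)
			o.checksum = hex.EncodeToString(sum[:])
			o.computed++
		}
		return o.checksum
	}

	func main() {
		o := &object{data: []byte("payload")}
		first := o.ensureChecksum()  // computed on the fly and stored
		second := o.ensureChecksum() // reused from the stored value
		fmt.Println(first == second, o.computed)
	}
	```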

10. Finally, when two objects in the cluster have identical (bucket, object) names and identical checksums, they are considered full replicas of each other. This allows optimizing PUT, replication, and object migration in a variety of use cases.
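
	The replica rule can be sketched as a comparison of object metadata. `objMeta`, `isReplica`, and `needsTransfer` are hypothetical names; comparing the checksum type along with its value, and treating a missing checksum as "not a replica", are assumptions made for this example.

	```go
	package main

	import "fmt"

	// objMeta is a hypothetical subset of object metadata.
	type objMeta struct {
		Bucket     string
		Name       string
		CksumType  string
		CksumValue string
	}

	// isReplica reports whether a and b can be treated as full replicas:
	// same (bucket, object) names and the same checksum.
	func isReplica(a, b objMeta) bool {
		if a.CksumValue == "" || b.CksumValue == "" {
			return false // assumption: no checksum, no replica shortcut
		}
		return a.Bucket == b.Bucket && a.Name == b.Name &&
			a.CksumType == b.CksumType && a.CksumValue == b.CksumValue
	}

	// needsTransfer reports whether src must be copied to a destination
	// that currently holds dst (nil when the destination has no such object).
	func needsTransfer(src objMeta, dst *objMeta) bool {
		return dst == nil || !isReplica(src, *dst)
	}

	func main() {
		src := objMeta{"abc", "obj1", "sha256", "ba7816bf"}
		same := src
		stale := objMeta{"abc", "obj1", "sha256", "00000000"}
		fmt.Println(needsTransfer(src, &same))  // replica already present
		fmt.Println(needsTransfer(src, &stale)) // checksum differs
		fmt.Println(needsTransfer(src, nil))    // nothing at the destination
	}
	```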