---
layout: post
title: BUCKET
permalink: /docs/cli/bucket
redirect_from:
 - /cli/bucket.md/
 - /docs/cli/bucket.md/
---

It is easy to see all CLI operations on *buckets*:

```console
$ ais bucket <TAB-TAB>

ls summary lru evict show create cp mv rm props
```

For convenience, a few of the most popular verbs are also aliased:

```console
$ ais alias | grep bucket
cp        bucket cp
create    bucket create
evict     bucket evict
ls        bucket ls
rmb       bucket rm
```

> For types of supported buckets (AIS, Cloud, backend, etc.) and many more examples, see [in-depth overview](/docs/bucket.md).

## Table of Contents
- [Create bucket](#create-bucket)
- [Delete bucket](#delete-bucket)
- [List buckets](#list-buckets)
- [List objects](#list-objects)
- [Evict remote bucket](#evict-remote-bucket)
- [Move or Rename a bucket](#move-or-rename-a-bucket)
- [Copy bucket](#copy-bucket)
- [Copy multiple objects](#copy-multiple-objects)
- [Example copying buckets and multi-objects with simultaneous synchronization](#example-copying-buckets-and-multi-objects-with-simultaneous-synchronization)
- [Show bucket summary](#show-bucket-summary)
- [Start N-way Mirroring](#start-n-way-mirroring)
- [Start Erasure Coding](#start-erasure-coding)
- [Show bucket properties](#show-bucket-properties)
- [Set bucket properties](#set-bucket-properties)
- [Show and set AWS-specific properties](#show-and-set-aws-specific-properties)
- [Reset bucket properties to cluster defaults](#reset-bucket-properties-to-cluster-defaults)
- [Show bucket metadata](#show-bucket-metadata)

## Create bucket

`ais create BUCKET [BUCKET...]`

Create bucket(s).

### Examples

#### Create AIS bucket

Create buckets `bucket_name1` and `bucket_name2`, both with AIS provider.

```console
$ ais create ais://bucket_name1 ais://bucket_name2
"ais://bucket_name1" bucket created
"ais://bucket_name2" bucket created
```

#### Create AIS bucket in local namespace

Create bucket `bucket_name` in the `ml` namespace.

```console
$ ais create ais://#ml/bucket_name
"ais://#ml/bucket_name" bucket created
```

#### Create bucket in remote AIS cluster

Create bucket `bucket_name` in the global namespace of the remote AIS cluster with UUID `Bghort1l`.

```console
$ ais create ais://@Bghort1l/bucket_name
"ais://@Bghort1l/bucket_name" bucket created
```

Create bucket `bucket_name` in the `ml` namespace of the remote AIS cluster with UUID `Bghort1l`.

```console
$ ais create ais://@Bghort1l#ml/bucket_name
"ais://@Bghort1l#ml/bucket_name" bucket created
```

#### Create bucket with custom properties

Create bucket `bucket_name` with custom properties specified.

```console
$ # Key-value format
$ ais create ais://@Bghort1l/bucket_name --props="mirror.enabled=true mirror.copies=2"
"ais://@Bghort1l/bucket_name" bucket created
$
$ # JSON format
$ ais create ais://@Bghort1l/bucket_name --props='{"versioning": {"enabled": true, "validate_warm_get": true}}'
"ais://@Bghort1l/bucket_name" bucket created
```

#### Incorrect bucket creation

```console
$ ais create aws://bucket_name
Create bucket "aws://bucket_name" failed: creating a bucket for any of the cloud or HTTP providers is not supported
```

## Delete bucket

`ais bucket rm BUCKET [BUCKET...]`

Delete an ais bucket or buckets.

### Examples

#### Remove AIS buckets

Remove AIS buckets `bucket_name1` and `bucket_name2`.

```console
$ ais bucket rm ais://bucket_name1 ais://bucket_name2
"ais://bucket_name1" bucket destroyed
"ais://bucket_name2" bucket destroyed
```

#### Remove AIS bucket in local namespace

Remove bucket `bucket_name` from the `ml` namespace.

```console
$ ais bucket rm ais://#ml/bucket_name
"ais://#ml/bucket_name" bucket destroyed
```

#### Remove bucket in remote AIS cluster

Remove bucket `bucket_name` from the global namespace of the remote AIS cluster with UUID `Bghort1l`.

```console
$ ais bucket rm ais://@Bghort1l/bucket_name
"ais://@Bghort1l/bucket_name" bucket destroyed
```

Remove bucket `bucket_name` from the `ml` namespace of the remote AIS cluster with UUID `Bghort1l`.

```console
$ ais bucket rm ais://@Bghort1l#ml/bucket_name
"ais://@Bghort1l#ml/bucket_name" bucket destroyed
```

#### Incorrect bucket removal

Removing remote buckets is not supported.

```console
$ ais bucket rm aws://bucket_name
Operation "destroy-bck" is not supported by "aws://bucket_name"
```

## List buckets

`ais ls [command options] PROVIDER:[//BUCKET_NAME]`

Notice the optional `[//BUCKET_NAME]`. When there's no bucket, `ais ls` will list **buckets**. Otherwise, it'll list **objects**.
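
The rule above can be sketched as a small shell function. This is an illustration only, not how the CLI itself is implemented: `ls_mode` is a hypothetical helper name, and the pattern matching is a simplified reading of the `PROVIDER:[//BUCKET_NAME]` syntax (including the `#namespace` and `@uuid` forms used elsewhere on this page).

```bash
# Hypothetical helper (NOT part of the ais CLI): report whether `ais ls ARG`
# would list buckets or objects, judging only by the shape of ARG.
ls_mode() {
  local arg="$1" rest
  case "$arg" in
    *://*) rest="${arg#*://}" ;;          # text after PROVIDER://
    *)     echo "buckets"; return ;;      # no argument, PROVIDER, or PROVIDER:
  esac
  case "$rest" in
    "")        echo "buckets" ;;          # e.g. ais://
    [@#]*/?*)  echo "objects" ;;          # namespace followed by /BUCKET_NAME
    [@#]*)     echo "buckets" ;;          # namespace only, e.g. ais://#ml
    *)         echo "objects" ;;          # e.g. ais://abc
  esac
}

ls_mode "s3"                               # prints: buckets
ls_mode "ais://#ml"                        # prints: buckets
ls_mode "ais://abc"                        # prints: objects
ls_mode "ais://@Bghort1l#ml/bucket_name"   # prints: objects
```

The function only inspects the argument string; it does not contact a cluster, so it says nothing about whether the bucket exists or is present.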

## Usage

```console
$ ais ls --help
NAME:
   ais ls - (alias for "bucket ls") list buckets, objects in buckets, and files in (.tar, .tgz or .tar.gz, .zip, .tar.lz4)-formatted objects,
   e.g.:
     * ais ls                                              - list all buckets in a cluster (all providers);
     * ais ls ais://abc -props name,size,copies,location   - list all objects from a given bucket, include only the (4) specified properties;
     * ais ls ais://abc -props all                         - same as above but include all properties;
     * ais ls ais://abc --page-size 20 --refresh 3s        - list a very large bucket (20 items in each page), report progress every 3s;
     * ais ls ais                                          - list all ais buckets;
     * ais ls s3                                           - list all s3 buckets that are present in the cluster;
     * ais ls s3 --all                                     - list all s3 buckets, both present and remote;
   with template, regex, and/or prefix:
     * ais ls gs: --regex "^abc" --all                      - list all accessible GCP buckets with names starting with "abc";
     * ais ls ais://abc --regex ".md" --props size,checksum - list *.md objects with their respective sizes and checksums;
     * ais ls gs://abc --template images/                   - list all objects from the virtual subdirectory called "images";
     * ais ls gs://abc --prefix images/                     - same as above (for more examples, see '--template' below);
   and summary (stats):
     * ais ls s3 --summary                  - for each s3 bucket in the cluster: print object numbers and total size(s);
     * ais ls s3 --summary --all            - generate summary report for all s3 buckets; include remote objects and buckets that are _not present_
```

## Assorted options

The options are numerous. Here's a non-exhaustive list (for the most recent update, run `ais ls --help`):

```console
OPTIONS:
   --all                 depending on the context:
                         - all objects in a given bucket, including misplaced and copies, or
                         - all buckets, including accessible (visible) remote buckets that are _not present_ in the cluster
   --cached              list only those objects from a remote bucket that are present ("cached")
   --name-only           faster request to retrieve only the names of objects (if defined, '--props' flag will be ignored)
   --props value         comma-separated list of object properties including name, size, version, copies, and more; e.g.:
                         --props all
                         --props name,size,cached
                         --props "ec, copies, custom, location"
   --regex value         regular expression; use it to match either bucket names or objects in a given bucket, e.g.:
                         ais ls --regex "(m|n)"         - match buckets such as ais://nnn, s3://mmm, etc.;
                         ais ls ais://nnn --regex "^A"  - match object names starting with letter A
   --summary             show object numbers, bucket sizes, and used capacity; applies _only_ to buckets and objects that are _present_ in the cluster
   --units value         show statistics and/or parse command-line specified sizes using one of the following _units of measurement_:
                         iec - IEC format, e.g.: KiB, MiB, GiB (default)
                         si  - SI (metric) format, e.g.: KB, MB, GB
                         raw - do not convert to (or from) human-readable format
   --no-headers, -H      display tables without headers
   --no-footers          display tables without footers
```

### `ais ls --regex "ngn*"`

List all buckets matching the `ngn*` regular expression.

### `ais ls aws:` or (same) `ais ls s3`

List all _existing_ buckets for the specific provider.

### `ais ls aws --all` or (same) `ais ls s3: --all`

List absolutely all buckets that the cluster can "see", including those that are not necessarily **present** in the cluster.

### `ais ls ais://` or (same) `ais ls ais`

List all AIS buckets.

### `ais ls ais://#name`

List all buckets for the `ais` provider and `name` namespace.

### `ais ls ais://@uuid#namespace`

List all remote AIS buckets that have `uuid#namespace` namespace. Note that:

* the `uuid` must be the remote cluster UUID (or its alias)
* while the `namespace` is the optional name of the remote namespace

As a rule of thumb, when a (logical) `#namespace` in the bucket's name is omitted, we use the global namespace that always exists.

## List objects

`ais ls` is one of those commands that only keeps growing, in terms of supported options and capabilities.

The command:

`ais ls [command options] PROVIDER:[//BUCKET_NAME]`

can conveniently list buckets (with or without "summarizing" object counts and sizes) and objects.

Notice the optional `[//BUCKET_NAME]`. When there's no bucket, `ais ls` will list **buckets**. Otherwise, it'll list **objects**.
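
When the argument does name a bucket, it may also carry a remote-cluster UUID and a namespace, as in `ais://@Bghort1l#ml/bucket_name`. The following sketch shows one way to take such a name apart in a script. It is an assumption-laden illustration rather than the CLI's own parser: `parse_bucket` is a hypothetical helper, and it assumes the `PROVIDER://[@UUID][#NAMESPACE/]BUCKET` shapes shown in the examples on this page.

```bash
# Hypothetical helper (NOT part of the ais CLI): split a bucket name into
# provider, remote-cluster UUID, namespace, and bucket.
parse_bucket() {
  local uri="$1" provider rest nsfield uuid="" ns="" bucket
  provider="${uri%%://*}"                 # text before ://
  rest="${uri#*://}"                      # text after ://
  case "$rest" in
    [@#]*/*)
      nsfield="${rest%%/*}"               # e.g. "@Bghort1l#ml"
      bucket="${rest#*/}"
      case "$nsfield" in
        @*) uuid="${nsfield#@}"; uuid="${uuid%%\#*}" ;;
      esac
      case "$nsfield" in
        *'#'*) ns="${nsfield#*\#}" ;;
      esac
      ;;
    *) bucket="$rest" ;;                  # no UUID, no namespace
  esac
  # An omitted namespace is reported as "global", following the rule of thumb above.
  echo "provider=$provider uuid=${uuid:-<none>} namespace=${ns:-global} bucket=$bucket"
}

parse_bucket "ais://@Bghort1l#ml/bucket_name"
# prints: provider=ais uuid=Bghort1l namespace=ml bucket=bucket_name
parse_bucket "ais://bucket_name"
# prints: provider=ais uuid=<none> namespace=global bucket=bucket_name
```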

The command's inline help is also quite extensive, with (inline) examples followed by numerous supported options:

```console
$ ais ls --help
NAME:
   ais ls - (alias for "bucket ls") list buckets, objects in buckets, and files in (.tar, .tgz or .tar.gz, .zip, .tar.lz4)-formatted objects,
   e.g.:
     * ais ls                                              - list all buckets in a cluster (all providers);
     * ais ls ais://abc -props name,size,copies,location   - list all objects from a given bucket, include only the (4) specified properties;
     * ais ls ais://abc -props all                         - same as above but include all properties;
     * ais ls ais://abc --page-size 20 --refresh 3s        - list a very large bucket (20 items in each page), report progress every 3s;
     * ais ls ais                                          - list all ais buckets;
     * ais ls s3                                           - list all s3 buckets that are present in the cluster;
     * ais ls s3 --all                                     - list all s3 buckets, both in-cluster and remote;
   with template, regex, and/or prefix:
     * ais ls gs: --regex "^abc" --all                      - list all accessible GCP buckets with names starting with "abc";
     * ais ls ais://abc --regex ".md" --props size,checksum - list *.md objects with their respective sizes and checksums;
     * ais ls gs://abc --template images/                   - list all objects from the virtual subdirectory called "images";
     * ais ls gs://abc --prefix images/                     - same as above (for more examples, see '--template' below);
   with in-cluster vs remote content comparison (diff):
     * ais ls s3://abc --check-versions           - for each remote object in s3://abc: check whether it has identical in-cluster copy,
                                                    and show missing objects
     * ais ls s3://abc --check-versions --cached  - for each in-cluster object in s3://abc: check whether it has identical remote copy,
                                                    and show deleted objects
   with summary (stats):
     * ais ls s3 --summary                  - for each s3 bucket in the cluster: print object numbers and total size(s);
     * ais ls s3 --summary --all            - generate summary report for all s3 buckets; include remote objects and buckets that are _not present_
     * ais ls s3 --summary --all --dont-add - same as above but without adding _non-present_ remote buckets to cluster's BMD

USAGE:
   ais ls [command options] PROVIDER:[//BUCKET_NAME]

OPTIONS:
   --all                 depending on the context, list:
                         - all buckets, including accessible (visible) remote buckets that are _not present_ in the cluster
                         - all objects in a given accessible (visible) bucket, including remote objects and misplaced copies
   --cached              list only those objects from a remote bucket that are present ("cached")
   --name-only           faster request to retrieve only the names of objects (if defined, '--props' flag will be ignored)
   --props value         comma-separated list of object properties including name, size, version, copies, and more; e.g.:
                         --props all
                         --props name,size,cached
                         --props "ec, copies, custom, location"
   --regex value         regular expression; use it to match either bucket names or objects in a given bucket, e.g.:
                         ais ls --regex "(m|n)"         - match buckets such as ais://nnn, s3://mmm, etc.;
                         ais ls ais://nnn --regex "^A"  - match object names starting with letter A
   --template value      template to match object or file names; may contain prefix (that could be empty) with zero or more ranges
                         (with optional steps and gaps), e.g.:
                         --template ""                 # (an empty or '*' template matches eveything)
                         --template 'dir/subdir/'
                         --template 'shard-{1000..9999}.tar'
                         --template "prefix-{0010..0013..2}-gap-{1..2}-suffix"
                         and similarly, when specifying files and directories:
                         --template '/home/dir/subdir/'
                         --template "/abc/prefix-{0010..9999..2}-suffix"
   --prefix value        list objects that have names starting with the specified prefix, e.g.:
                         '--prefix a/b/c' - list virtual directory a/b/c and/or objects from the virtual directory
                         a/b that have their names (relative to this directory) starting with the letter 'c'
   --page-size value     maximum number of names per page (0 - the maximum is defined by the corresponding backend) (default: 0)
   --paged               list objects page by page, one page at a time (see also '--page-size' and '--limit')
   --limit value         limit object name count (0 - unlimited) (default: 0)
   --refresh value       interval for continuous monitoring;
                         valid time units: ns, us (or µs), ms, s (default), m, h
   --show-unmatched      list also objects that were _not_ matched by regex and/or template (range)
   --no-headers, -H      display tables without headers
   --no-footers          display tables without footers
   --max-pages value     display up to this number pages of bucket objects (default: 0)
   --start-after value   list bucket's content alphabetically starting with the first name _after_ the specified
   --summary             show object numbers, bucket sizes, and used capacity;
                         note: applies only to buckets and objects that are _present_ in the cluster
   --skip-lookup         do not execute HEAD(bucket) request to lookup remote bucket and its properties; possible usage scenarios include:
                         1) adding remote bucket to aistore without first checking the bucket's accessibility
                            (e.g., to configure the bucket's aistore properties with alternative security profile and/or endpoint)
                         2) listing public-access Cloud buckets where certain operations (e.g., 'HEAD(bucket)') may be disallowed
   --dont-add            list remote bucket without adding it to cluster's metadata
                         - let's say, s3://abc is accessible but not present in the cluster (e.g., 'ais ls' returns error);
                         - then, if we ask aistore to list remote buckets: `ais ls s3://abc --all'
                           the bucket will be added (in effect, it'll be created);
                         - to prevent this from happening, either use this '--dont-add' flag or run 'ais evict' command later
   --archive             list archived content (see docs/archive.md for details)
   --units value         show statistics and/or parse command-line specified sizes using one of the following _units of measurement_:
                         iec - IEC format, e.g.: KiB, MiB, GiB (default)
                         si  - SI (metric) format, e.g.: KB, MB, GB
                         raw - do not convert to (or from) human-readable format
   --silent              server-side flag, an indication for aistore _not_ to log assorted errors (e.g., HEAD(object) failures)
   --dont-wait           when _summarizing_ buckets do not wait for the respective job to finish -
                         use the job's UUID to query the results interactively
   --check-versions      check whether listed remote objects and their in-cluster copies are identical, ie., have the same versions
                         - applies to remote backends that maintain at least some form of versioning information (e.g., version, checksum, ETag)
                         - see related: 'ais get --latest', 'ais cp --sync', 'ais prefetch --latest'
   --help, -h            show help
```

### Assorted options

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `--regex` | `string` | regular expression to match and select items in question | `""` |
| `--template` | `string` | template for matching object names, e.g.: 'shard-{900..999}.tar' | `""` |
| `--prefix` | `string` | list objects matching a given prefix | `""` |
| `--page-size` | `int` | maximum number of names per page (0 - the maximum is defined by the corresponding backend) | `0` |
| `--props` | `string` | comma-separated list of object properties including name, size, version, copies, EC data and parity info, custom metadata, location, and more; to include all properties, type '--props all' (default: "name,size") | `"name,size"` |
| `--limit` | `int` | limit object name count (0 - unlimited) | `0` |
| `--show-unmatched` | `bool` | list objects that were not matched by regex and/or template | `false` |
| `--all` | `bool` | depending on context: all objects (including misplaced ones and copies) _or_ all buckets (including remote buckets that are not present in the cluster) | `false` |
| `--no-headers, -H` | `bool` | display tables without headers | `false` |
| `--no-footers` | `bool` | display tables without footers | `false` |
| `--paged` | `bool` | list objects page by page, one page at a time (see also '--page-size' and '--limit') | `false` |
| `--max-pages` | `int` | display up to this number pages of bucket objects (default: 0) | `0` |
| `--marker` | `string` | list bucket's content alphabetically starting with the first name _after_ the specified | `""` |
| `--start-after` | `string` | Object name (marker) after which the listing should start | `""` |
| `--cached` | `bool` | list only those objects from a remote bucket that are present ("cached") | `false` |
| `--skip-lookup` | `bool` | list public-access Cloud buckets that may disallow certain operations (e.g., `HEAD(bucket)`); use this option for performance _or_ to read Cloud buckets that allow _anonymous_ access | `false` |
| `--archive` | `bool` | list archived content | `false` |
| `--check-versions` | `bool` | check whether listed remote objects and their in-cluster copies are identical, ie., have the same versions; applies to remote backends that maintain at least some form of versioning information (e.g., version, checksum, ETag) | `false` |
| `--summary` | `bool` | show bucket sizes and used capacity; by default, applies only to the buckets that are _present_ in the cluster (use '--all' option to override) | `false` |
| `--bytes` | `bool` | show sizes in bytes (ie., do not convert to KiB, MiB, GiB, etc.) | `false` |
| `--name-only` | `bool` | fast request to retrieve only the names of objects in the bucket; if defined, all comma-separated fields in the `--props` flag will be ignored with only two exceptions: `name` and `status` | `false` |

### Examples

#### List AIS and Cloud buckets with all defaults

List objects in the AIS bucket `bucket_name`.

```console
$ ais ls ais://bucket_name
NAME           SIZE
shard-0.tar    16.00KiB
shard-1.tar    16.00KiB
...
```

List objects in the remote bucket `bucket_name`.

```console
$ ais ls aws://bucket_name
NAME           SIZE
shard-0.tar    16.00KiB
shard-1.tar    16.00KiB
...
```

#### Include all properties

```console
# ais ls gs://webdataset-abc --skip-lookup --props all
NAME                            SIZE        CHECKSUM                           ATIME   VERSION            CACHED   TARGET URL            STATUS   COPIES
coco-train2014-seg-000000.tar   958.48MiB   bdb89d1b854040b6050319e80ef44dde           1657297128665686   no       http://aistore:8081   ok       0
coco-train2014-seg-000001.tar   958.47MiB   8b94939b7d166114498e794859fb472c           1657297129387272   no       http://aistore:8081   ok       0
coco-train2014-seg-000002.tar   958.47MiB   142a8e81f965f9bcafc8b04eda65a0ce           1657297129904067   no       http://aistore:8081   ok       0
coco-train2014-seg-000003.tar   958.22MiB   113024d5def81365cbb6c404c908efb1           1657297130555590   no       http://aistore:8081   ok       0
...
```

#### List bucket from AIS remote cluster

List objects in the bucket `bucket_name` in the `ml` namespace of the remote AIS cluster with UUID `Bghort1l`.

```console
$ ais ls ais://@Bghort1l#ml/bucket_name
NAME           SIZE        VERSION
shard-0.tar    16.00KiB    1
shard-1.tar    16.00KiB    1
...
```

#### With prefix

List objects whose names start with the given prefix.

```console
$ ais ls ais://bucket_name --prefix "shard-1"
NAME            SIZE        VERSION
shard-1.tar     16.00KiB    1
shard-10.tar    16.00KiB    1
```

#### List archived content

```console
$ ais ls ais://abc/ --prefix log
NAME          SIZE
log.tar.gz    3.11KiB

$ ais ls ais://abc/ --prefix log --archive
NAME                                       SIZE
log.tar.gz                                 3.11KiB
log2.tar.gz/t_2021-07-27_14-08-50.log      959B
log2.tar.gz/t_2021-07-27_14-10-36.log      959B
log2.tar.gz/t_2021-07-27_14-12-18.log      959B
log2.tar.gz/t_2021-07-27_14-13-23.log      295B
log2.tar.gz/t_2021-07-27_14-13-31.log      1.02KiB
log2.tar.gz/t_2021-07-27_14-14-16.log      1.71KiB
log2.tar.gz/t_2021-07-27_14-15-15.log      1.90KiB
```

#### List anonymously (i.e., list public-access Cloud bucket)

```console
$ ais ls gs://webdataset-abc --skip-lookup
NAME                             SIZE
coco-train2014-seg-000000.tar    958.48MiB
coco-train2014-seg-000001.tar    958.47MiB
coco-train2014-seg-000002.tar    958.47MiB
coco-train2014-seg-000003.tar    958.22MiB
coco-train2014-seg-000004.tar    958.56MiB
coco-train2014-seg-000005.tar    958.19MiB
...
```

#### Use '--prefix' that crosses shard boundary

For starters, we archive all aistore docs:

```console
$ ais put docs ais://A.tar --archive -r
```

To list a certain virtual subdirectory _inside_ this newly created shard:

```console
$ ais archive ls ais://nnn --prefix "A.tar/tutorials"
NAME                                             SIZE
    A.tar/tutorials/README.md                    561B
    A.tar/tutorials/etl/compute_md5.md           8.28KiB
    A.tar/tutorials/etl/etl_imagenet_pytorch.md  4.16KiB
    A.tar/tutorials/etl/etl_webdataset.md        3.97KiB
Listed: 4 names
```

or, same:

```console
$ ais ls ais://nnn --prefix "A.tar/tutorials" --archive
NAME                                             SIZE
    A.tar/tutorials/README.md                    561B
    A.tar/tutorials/etl/compute_md5.md           8.28KiB
    A.tar/tutorials/etl/etl_imagenet_pytorch.md  4.16KiB
    A.tar/tutorials/etl/etl_webdataset.md        3.97KiB
Listed: 4 names
```

## Evict remote bucket

`ais bucket evict BUCKET`

Evict a [remote bucket](/docs/bucket.md#remote-bucket). It also resets the properties of the bucket (if changed).
All data from the remote bucket stored in the cluster will be removed, and AIS will stop keeping track of the remote bucket.
Read more about this feature [here](/docs/bucket.md#evict-remote-bucket).

```console
$ ais bucket evict aws://abc
"aws://abc" bucket evicted

# Dry run: the cluster will not be modified
$ ais bucket evict --dry-run aws://abc
[DRY RUN] No modifications on the cluster
EVICT: "aws://abc"

# Only evict the remote bucket's data (AIS will retain the bucket's metadata)
$ ais bucket evict --keep-md aws://abc
"aws://abc" bucket evicted
```

Here's a fuller example that lists a remote bucket, reads a selected object, and then evicts the bucket:

```console
$ ais ls gs://wrQkliptRt
NAME             SIZE
TDXBNBEZNl.tar   8.50KiB
qFpwOOifUe.tar   8.50KiB
thmdpZXetG.tar   8.50KiB

$ ais get gcp://wrQkliptRt/qFpwOOifUe.tar /tmp/qFpwOOifUe.tar
GET "qFpwOOifUe.tar" from bucket "gcp://wrQkliptRt" as "/tmp/qFpwOOifUe.tar" [8.50KiB]

$ ais ls gs://wrQkliptRt --props all
NAME             SIZE       CHECKSUM                           ATIME                   VERSION            CACHED   STATUS   COPIES
TDXBNBEZNl.tar   8.50KiB    33345a69bade096a30abd42058da4537                           1622133976984266   no       ok       0
qFpwOOifUe.tar   8.50KiB    47dd59e41f6b7723                   28 May 21 12:02 PDT     1622133846120151   yes      ok       1
thmdpZXetG.tar   8.50KiB    cfe0c386e91daa1571d6a659f49b1408                           1622137609269706   no       ok       0

$ ais bucket evict gcp://wrQkliptRt
"gcp://wrQkliptRt" bucket evicted

$ ais ls gs://wrQkliptRt --props all
NAME             SIZE       CHECKSUM                           ATIME                   VERSION            CACHED   STATUS   COPIES
TDXBNBEZNl.tar   8.50KiB    33345a69bade096a30abd42058da4537                           1622133976984266   no       ok       0
qFpwOOifUe.tar   8.50KiB    8b5919c0850a07d931c3c46ed9101eab                           1622133846120151   no       ok       0
thmdpZXetG.tar   8.50KiB    cfe0c386e91daa1571d6a659f49b1408                           1622137609269706   no       ok       0
```

## Move or Rename a bucket

`ais bucket mv BUCKET NEW_BUCKET`

Move (i.e., rename) an AIS bucket.
If the `NEW_BUCKET` already exists, the `mv` operation will not proceed.

> Cloud bucket move is not supported.

### Examples

#### Move AIS bucket

Move AIS bucket `bucket_name` to AIS bucket `new_bucket_name`.

```console
$ ais bucket mv ais://bucket_name ais://new_bucket_name
Moving bucket "ais://bucket_name" to "ais://new_bucket_name" in progress.
To check the status, run: ais show job xaction mvlb ais://new_bucket_name
```

## Copy bucket

`ais cp [command options] SRC_BUCKET[/OBJECT_NAME_or_TEMPLATE] DST_BUCKET`

Source bucket must exist. When the destination bucket is remote (e.g. in the Cloud) it must also exist and be writeable.

> **NOTE:** there's _no_ requirement that either of the buckets is _present_ in aistore.

> **NOTE:** do not confuse in-cluster _presence_ with existence: a remote object may exist (remotely) without being present in the cluster.

Moreover, when the destination is an AIS (`ais://`) or remote AIS (`ais://@remote-alias`) bucket, its existence is optional: the destination will be created on the fly, with bucket properties copied from the source (`SRC_BUCKET`).

Finally, the option to copy a remote bucket onto itself is also supported - syntax-wise. Here's an example that'll shed some light:

```console
## 1. at first, we don't have any gs:// buckets in the cluster

$ ais ls gs
No "gs://" buckets in the cluster. Use '--all' option to list matching remote buckets, if any.

## 2. notwithstanding, we go ahead and start copying gs://coco-dataset

$ ais cp gs://coco-dataset gs://coco-dataset --prefix d-tokens --progress --all
Copied objects:              282/393 [===========================================>------------------] 72 %
Copied size:    719.48 MiB / 1000.08 MiB [============================================>-----------------] 72 %

## 3. and done: all 393 objects from the remote bucket are now present ("cached") in the cluster

$ ais ls gs://coco-dataset --cached | grep Listed
Listed: 393 names
```

> Incidentally, notice the `--cached` difference:

```console
$ ais ls gs://coco-dataset --cached | grep Listed
Listed: 393 names

## vs _all_ including remote:

$ ais ls gs://coco-dataset | grep Listed
Listed: 2,290 names
```

### Options

```console
$ ais cp --help
NAME:
   ais cp - (alias for "bucket cp") copy entire bucket or selected objects (to select, use '--list', '--template', or '--prefix'), e.g.:
     - 'ais cp gs://webdaset-coco ais://dst'      - copy entire Cloud bucket;
     - 'ais cp s3://abc ais://nnn --all'          - copy entire Cloud bucket that may not be _present_ in the cluster;
     - 'ais cp s3://abc gs://xyz --all'           - copy Cloud bucket to another Cloud;
     - 'ais cp s3://abc ais://nnn --latest'       - copy Cloud bucket, and make sure that already present in-cluster copies are updated to the latest (remote) versions;
     - 'ais cp s3://abc ais://nnn --sync'         - same as above, but in addition delete in-cluster copies that do not exist (any longer) in the source bucket
   with template, prefix, and/or progress bar:
     - 'ais cp ais://nnn/111 ais://mmm'                                                       - copy a single object (assuming, prefix '111' corresponds to a single object);
     - 'ais cp gs://webdataset-coco ais:/dst --template d-tokens/shard-{000000..000999}.tar.lz4' - copy up to 1000 objects that share the specified prefix;
     - 'ais cp gs://webdataset-coco ais:/dst --prefix d-tokens/ --progress --all'             - show progress while copying virtual subdirectory 'd-tokens'

USAGE:
   ais cp [command options] SRC_BUCKET[/OBJECT_NAME_or_TEMPLATE] DST_BUCKET

OPTIONS:
   --list value      comma-separated list of object or file names, e.g.:
                     --list 'o1,o2,o3'
                     --list "abc/1.tar, abc/1.cls, abc/1.jpeg"
                     or, when listing files and/or directories:
                     --list "/home/docs, /home/abc/1.tar, /home/abc/1.jpeg"
   --template value  template to match object or file names; may contain prefix (that could be empty) with zero or more ranges
                     (with optional steps and gaps), e.g.:
                     --template ""                 # (an empty or '*' template matches eveything)
                     --template 'dir/subdir/'
                     --template 'shard-{1000..9999}.tar'
                     --template "prefix-{0010..0013..2}-gap-{1..2}-suffix"
                     and similarly, when specifying files and directories:
                     --template '/home/dir/subdir/'
                     --template "/abc/prefix-{0010..9999..2}-suffix"
   --prefix value    select objects that have names starting with the specified prefix, e.g.:
                     '--prefix a/b/c'   - matches names 'a/b/c/d', 'a/b/cdef', and similar;
                     '--prefix a/b/c/'  - only matches objects from the virtual directory a/b/c/
   --all             copy all objects from a remote bucket including those that are not present (not "cached") in the cluster
   --cont-on-err     keep running archiving xaction (job) in presence of errors in a any given multi-object transaction
   --force, -f       force an action
   --dry-run         show total size of new objects without really creating them
   --prepend value   prefix to prepend to every copied object name, e.g.:
                     --prepend=abc   - prefix all copied object names with "abc"
                     --prepend=abc/  - copy objects into a virtual directory "abc" (note trailing filepath separator)
   --progress        show progress bar(s) and progress of execution in real time
   --refresh value   interval for continuous monitoring;
                     valid time units: ns, us (or µs), ms, s (default), m, h
   --wait            wait for an asynchronous operation to finish (optionally, use '--timeout' to limit the waiting time)
   --timeout value   maximum time to wait for a job to finish; if omitted: wait forever or until Ctrl-C;
                     valid time units: ns, us (or µs), ms, s (default), m, h
   --latest          check in-cluster metadata and, possibly, GET, download, prefetch, or copy the latest object version
                     from the associated remote bucket:
                     - provides operation-level control over object versioning (and version synchronization)
                       without requiring to change bucket configuration
                     - the latter can be done using 'ais bucket props set BUCKET versioning'
                     - see also: 'ais ls --check-versions', 'ais cp', 'ais prefetch', 'ais get'
   --sync            synchronize destination bucket with its remote (e.g., Cloud or remote AIS) source;
                     the option is a stronger variant of the '--latest' (option) - in addition it entails
                     removing of the objects that no longer exist remotely
                     (see also: 'ais show bucket versioning' and the corresponding documentation)
   --help, -h        show help

```

### Examples

#### Copy _non-existing_ remote bucket to a non-existing in-cluster destination

```console
$ ais ls s3
No "s3://" buckets in the cluster. Use '--all' option to list matching remote buckets, if any.

$ ais cp s3://abc ais://nnn --all
Warning: destination ais://nnn doesn't exist and will be created with configuration copied from the source (s3://abc))
Copying s3://abc => ais://nnn. To monitor the progress, run 'ais show job tco-JcTKbhvFy'
```

#### Copy AIS bucket

Copy AIS bucket `src_bucket` to AIS bucket `dst_bucket`.

```console
$ ais cp ais://src_bucket ais://dst_bucket
Copying bucket "ais://src_bucket" to "ais://dst_bucket" in progress.
To check the status, run: ais show job xaction copy-bck ais://dst_bucket
```

#### Copy AIS bucket and wait until the job finishes

The same as above, but wait until copying is finished.

```console
$ ais cp ais://src_bucket ais://dst_bucket --wait
```

#### Copy cloud bucket to another cloud bucket

Copy AWS bucket `src_bucket` to AWS bucket `dst_bucket`.

```console
# Make sure that both buckets exist.
$ ais ls aws://
AWS Buckets (2)
  aws://src_bucket
  aws://dst_bucket
$ ais cp aws://src_bucket aws://dst_bucket
Copying bucket "aws://src_bucket" to "aws://dst_bucket" in progress.
To check the status, run: ais show job xaction copy-bck aws://dst_bucket
```

## Copy multiple objects

The same `ais cp` command can also copy multiple selected objects. Here's the corresponding excerpt from the inline help:

```console
$ ais cp --help
NAME:
   ais cp - (alias for "bucket cp") copy entire bucket or selected objects (to select multiple, use '--list' or '--template')

USAGE:
   ais cp [command options] SRC_BUCKET[/OBJECT_NAME_or_TEMPLATE] DST_BUCKET

OPTIONS:
   --list value       comma-separated list of object or file names, e.g.:
                      --list 'o1,o2,o3'
                      --list "abc/1.tar, abc/1.cls, abc/1.jpeg"
                      or, when listing files and/or directories:
                      --list "/home/docs, /home/abc/1.tar, /home/abc/1.jpeg"
   --template value   template to match object or file names; may contain prefix (that could be empty) with zero or more ranges
                      (with optional steps and gaps), e.g.:
                      --template "" # (an empty or '*' template matches eveything)
                      --template 'dir/subdir/'
                      --template 'shard-{1000..9999}.tar'
                      --template "prefix-{0010..0013..2}-gap-{1..2}-suffix"
                      and similarly, when specifying files and directories:
                      --template '/home/dir/subdir/'
                      --template "/abc/prefix-{0010..9999..2}-suffix"
   --prefix value     select objects that have names starting with the specified prefix, e.g.:
                      '--prefix a/b/c'   - matches names 'a/b/c/d', 'a/b/cdef', and similar;
                      '--prefix a/b/c/'  - only matches objects from the virtual directory a/b/c/
   --all              copy all objects from a remote bucket including those that are not present (not "cached") in the cluster
   ...
   ...
```

### Examples

**1.** Copy objects `obj1.tar` and `obj1.info` from bucket `ais://bck1` to `ais://bck2`, and wait until the operation finishes

```console
$ ais cp ais://bck1 ais://bck2 --list obj1.tar,obj1.info --wait
copying objects operation ("ais://bck1" => "ais://bck2") is in progress...
copying objects operation succeeded.
```

**2.** Copy objects matching Bash brace-expansion `obj{2..4}`; do not wait for the operation to finish.

```console
$ ais cp ais://bck1 ais://bck2 --template "obj{2..4}"
copying objects operation ("ais://bck1" => "ais://bck2") is in progress...
To check the status, run: ais show job xaction copy-bck ais://bck2
```

**3.** Use `--sync` option to copy remote virtual subdirectory

```console
$ ais cp gs://coco-dataset --sync --prefix d-tokens
Copying objects gs://coco-dataset. To monitor the progress, run 'ais show job tco-kJPUtYJld'
```

In the example, `--sync` synchronizes destination bucket with its remote (e.g., Cloud) source.

In particular, the option will make sure that aistore has the **latest** versions of remote objects _and_ may also entail **removing** of the objects that no longer exist remotely.

### See also

* [Out of band updates](/docs/out_of_band.md)

## Example copying buckets and multi-objects with simultaneous synchronization

There's a [script](https://github.com/NVIDIA/aistore/blob/main/ais/test/scripts/cp-sync-remais-out-of-band.sh) that we use for testing. When run, it produces the following output:

```console
$ ./ais/test/scripts/cp-sync-remais-out-of-band.sh --bucket gs://abc

 1. generate and write 500 random shards => gs://abc
 2. copy gs://abc => ais://dst-9408
 3. remove 10 shards from the source
 4. copy gs://abc => ais://dst-9408 w/ synchronization ('--sync' option)
 5. remove another 10 shards
 6. copy multiple objects using bash-expansion defined range and '--sync'
 #
 # out of band DELETE using remote AIS (remais)
 #
 7. use remote AIS cluster ("remais") to out-of-band remove 10 shards from the source
 8. copy gs://abc => ais://dst-9408 w/ --sync
 9. when copying, we always synchronize content of the in-cluster source as well
10. use remais to out-of-band remove 10 more shards from gs://abc source
11. copy a range of shards from gs://abc to ais://dst-9408, and compare
12. and again: when copying, we always synchronize content of the in-cluster source as well
 #
 # out of band ADD using remote AIS (remais)
 #
13. use remais to out-of-band add (i.e., PUT) 17 new shards
14. copy a range of shards from gs://abc to ais://dst-9408, and check whether the destination has new shards
15. compare the contents but NOTE: as of v3.22, this part requires multi-object copy (using '--list' or '--template')
```

The [script](https://github.com/NVIDIA/aistore/blob/main/ais/test/scripts/cp-sync-remais-out-of-band.sh) executes a sequence of steps (above).

Notice a certain limitation (that also shows up as the last step #15):

* As of version 3.22, aistore `cp` commands will always synchronize _deleted_ and _updated_ remote content.

* However, to see out-of-band added content, you currently need to run [multi-object copy](#copy-multiple-objects), with multiple source objects specified using `--list` or `--template`.

* See `ais cp --help` for details.

## Show bucket summary

`ais storage summary [command options] PROVIDER:[//BUCKET_NAME]` - show bucket sizes and the respective percentages of used capacity on a per-bucket basis

`ais bucket summary` - same as above.

### Options

```console
NAME:
   ais storage summary - show bucket sizes and %% of used capacity on a per-bucket basis

USAGE:
   ais storage summary [command options] PROVIDER:[//BUCKET_NAME]

OPTIONS:
   --refresh value   interval for continuous monitoring;
                     valid time units: ns, us (or µs), ms, s (default), m, h
   --count value     used together with '--refresh' to limit the number of generated reports, e.g.:
                     '--refresh 10 --count 5' - run 5 times with 10s interval (default: 0)
   --prefix value    for each bucket, select only those objects (names) that start with the specified prefix, e.g.:
                     '--prefix a/b/c' - sum-up sizes of the virtual directory a/b/c and objects from the virtual directory
                     a/b that have names (relative to this directory) starting with the letter c
   --cached          list only those objects from a remote bucket that are present ("cached")
   --units value     show statistics and/or parse command-line specified sizes using one of the following _units of measurement_:
                     iec - IEC format, e.g.: KiB, MiB, GiB (default)
                     si  - SI (metric) format, e.g.: KB, MB, GB
                     raw - do not convert to (or from) human-readable format
   --verbose, -v     verbose output
   --dont-wait       when _summarizing_ buckets do not wait for the respective job to finish -
                     use the job's UUID to query the results interactively
   --no-headers, -H  display tables without headers
   --help, -h        show help
```

If `BUCKET` is omitted, the command *applies* to all [AIS buckets](/docs/bucket.md#ais-bucket).

The output includes the total number of objects in a bucket, the bucket's size (bytes, megabytes, etc.), and the percentage of the total capacity used by the bucket.

A few additional words must be said about `--validate`. The option is provided to run integrity checks, namely: locations of objects, replicas, and EC slices in the bucket, the number of replicas (and whether this number agrees with the bucket configuration), and more.

> Location of each stored object must at any point in time correspond to the current cluster map and, within each storage target, to the target's [mountpaths](/docs/overview.md#terminology). A failure to abide by location rules is called *misplacement*; misplaced objects - if any - must be migrated to their proper locations via automated processes called `global rebalance` and `resilver`:

* [global rebalance and resilver](/docs/rebalance.md)
* [resilvering selected targets: advanced usage](/docs/resourcesvanced.md)

### Notes

1. `--validate` may take considerable time to execute (depending, of course, on sizes of the datasets in question and the capabilities of the underlying hardware);
2. non-zero *misplaced* objects in the (validated) output is a direct indication that the cluster requires rebalancing and/or resilvering;
3. `--fast=false` is another command line option that may also significantly increase execution time;
4. by default, `--fast` is set to `true`, which also means that bucket summary executes a *faster* logic (that may have a certain minor speed/accuracy trade-off);
5. to obtain the most precise results, run the command with `--fast=false` - and prepare to wait.

### Examples

```console
# 1. show summary for a specific bucket
$ ais bucket summary ais://abc
NAME             OBJECTS         SIZE ON DISK    USAGE(%)
ais://abc        10902           5.38GiB         1%

For min/avg/max object sizes, use `--fast=false`.
```

```console
# 2. "summarize" all buckets(*)
$ ais bucket summary
NAME             OBJECTS         SIZE ON DISK    USAGE(%)
ais://abc        10902           5.38GiB         1%
ais://nnn        49873           200.00MiB       0%
```

```console
# 3. "summarize" all s3:// buckets; count both "cached" and remote objects:
$ ais bucket summary s3: --all
```

```console
# 4. same as above with progress updates every 3 seconds:
$ ais bucket summary s3: --all --refresh 3
```

```console
# 5. "summarize" a given gs:// bucket; start the job and exit without waiting for it to finish
# (see prompt below):
$ ais bucket summary gs://abc --all --dont-wait

Job summary[wl-s5lIWA] has started. To monitor, run 'ais storage summary gs://abc wl-s5lIWA --dont-wait' or 'ais show job wl-s5lIWA;
see '--help' for details'
```

## Start N-way Mirroring

`ais start mirror BUCKET --copies <value>`

Start an extended action to bring a given bucket to a certain redundancy level (`value` copies). Read more about this feature [here](/docs/storage_svcs.md#n-way-mirror).

### Options

| Flag | Type | Description | Default |
| --- | --- | --- | --- |
| `--copies` | `int` | Number of copies | `1` |

## Start Erasure Coding

`ais ec-encode BUCKET --data-slices <value> --parity-slices <value>`

Start an extended action that enables data protection for a given bucket and encodes all its objects.
Erasure coding must be disabled for the bucket prior to running `ec-encode` extended action.
Read more about this feature [here](/docs/storage_svcs.md#erasure-coding).

### Options

| Flag | Type | Description |
| --- | --- | --- |
| `--data-slices`, `--data`, `-d` | `int` | Number of data slices |
| `--parity-slices`, `--parity`, `-p` | `int` | Number of parity slices |

All options are required and must be greater than `0`.

## Show bucket properties

Overall, the topic called "bucket properties" is rather involved and includes sub-topics "bucket property inheritance" and "cluster-wide global defaults". For background, please first see:

* [Default Bucket Properties](/docs/bucket.md#default-bucket-properties)
* [Inherited Bucket Properties and LRU](/docs/bucket.md#inherited-bucket-properties-and-lru)
* [Backend Provider](/docs/bucket.md#backend-provider)
* [Global cluster-wide configuration](/docs/configuration.md#cluster-and-node-configuration).

As for the CLI, run the following to list [properties](/docs/bucket.md#properties-and-options) of the specified bucket.
By default, a certain compact form of bucket props sections is presented.

`ais bucket props show BUCKET [PROP_PREFIX]`

When `PROP_PREFIX` is set, only props that start with `PROP_PREFIX` will be displayed.
Useful `PROP_PREFIX` are: `access, checksum, ec, lru, mirror, provider, versioning`.

> `ais bucket show` is an alias for `ais show bucket` - both can be used interchangeably.

### Options

| Flag | Type | Description | Default |
| --- | --- | --- | --- |
| `--json` | `bool` | Output in JSON format | `false` |
| `--compact`, `-c` | `bool` | Show list of properties in compact human-readable mode | `false` |

### Examples

#### Show bucket props with provided section

Show only `lru` section of bucket props for `bucket_name` bucket.

```console
$ ais bucket props show s3://bucket-name --compact
PROPERTY         VALUE
access           GET,HEAD-OBJECT,PUT,APPEND,DELETE-OBJECT,MOVE-OBJECT,PROMOTE,UPDATE-OBJECT,HEAD-BUCKET,LIST-OBJECTS,PATCH,SET-BUCKET-ACL,LIST-BUCKETS,SHOW-CLUSTER,CREATE-BUCKET,DESTROY-BUCKET,MOVE-BUCKET,ADMIN
checksum         Type: xxhash | Validate: Nothing
created          2024-01-31T15:42:59-08:00
ec               Disabled
lru              lru.dont_evict_time=2h0m, lru.capacity_upd_time=10m
mirror           Disabled
present          yes
provider         aws
versioning       Disabled

$ ais bucket props show s3://bucket_name lru --compact
PROPERTY         VALUE
lru              lru.dont_evict_time=2h0m, lru.capacity_upd_time=10m

$ ais bucket props show s3://ais-abhishek lru
PROPERTY                 VALUE
lru.capacity_upd_time    10m
lru.dont_evict_time      2h0m
lru.enabled              true
```

## Set bucket properties

`ais bucket props set [OPTIONS] BUCKET JSON_SPECIFICATION|KEY=VALUE [KEY=VALUE...]`

Set bucket properties.
For the available options, see [bucket-properties](/docs/bucket.md#bucket-properties).

If JSON_SPECIFICATION is used, **all** properties of the bucket are set based on the values in the JSON object.
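
The two argument forms name the same properties: judging by the examples in this document, a dotted key such as `mirror.enabled` corresponds to the `enabled` field of the `mirror` section in the JSON form. The following is a minimal Python sketch of that correspondence only. The `kv_to_spec` and `parse_value` helpers are hypothetical (they are not part of the `ais` CLI), the value parsing (booleans, integers, plain strings) is an assumption, and special cases such as `backend_bck=...` are not covered.

```python
import json


def parse_value(text):
    """Interpret a value as bool, int, or plain string (an assumption made for illustration)."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    try:
        return int(text)
    except ValueError:
        return text


def kv_to_spec(pairs):
    """Turn ['mirror.enabled=true', ...] into a nested dict shaped like the JSON form."""
    spec = {}
    for pair in pairs:
        key, _, value = pair.partition("=")
        *sections, leaf = key.split(".")
        node = spec
        for section in sections:
            # Walk (or create) the nested section, e.g. "mirror"
            node = node.setdefault(section, {})
        node[leaf] = parse_value(value)
    return spec


spec = kv_to_spec(["mirror.enabled=true", "mirror.copies=2"])
print(json.dumps(spec))  # {"mirror": {"enabled": true, "copies": 2}}
```

The sketch shows how the names line up, not how the update is applied: as stated above, a JSON specification sets all bucket properties, while `KEY=VALUE` arguments name individual ones.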

### Options

| Flag | Type | Description | Default |
| --- | --- | --- | --- |
| `--force` | `bool` | Ignore non-critical errors | `false` |

When JSON specification is not used, some properties support user-friendly aliases:

| Property | Value alias | Description |
| --- | --- | --- |
| access | `ro` | Disables bucket modifications: denies PUT, DELETE, and ColdGET requests |
| access | `rw` | Enables object modifications: allows PUT, DELETE, and ColdGET requests |
| access | `su` | Enables full access: all `rw` permissions, bucket deletion, and changing bucket permissions |

### Examples

#### Enable mirroring for a bucket

Set the `mirror.enabled` and `mirror.copies` properties to `true` and `2` respectively, for the bucket `bucket_name`.

```console
$ ais bucket props set ais://bucket_name 'mirror.enabled=true' 'mirror.copies=2'
Bucket props successfully updated
"mirror.enabled" set to:"true" (was:"false")
```

#### Make a bucket read-only

Set read-only access to the bucket `bucket_name`.
All PUT and DELETE requests will fail.

```console
$ ais bucket props set ais://bucket_name 'access=ro'
Bucket props successfully updated
"access" set to:"GET,HEAD-OBJECT,HEAD-BUCKET,LIST-OBJECTS" (was:"<PREV_ACCESS_LIST>")
```

#### Configure custom AWS S3 endpoint

When a bucket is hosted by an S3 compliant backend (such as, e.g., minio), we may want to specify an alternative S3 endpoint,
so that AIS nodes use it when reading, writing, listing, and generally, performing all operations on remote S3 bucket(s).

Globally, S3 endpoint can be overridden for _all_ S3 buckets via the "S3_ENDPOINT" environment variable.
If you decide to make the change, you may need to restart AIS cluster while making sure that "S3_ENDPOINT" is available for the AIS nodes
when they are starting up.

But it can also be done - and will take precedence over the global setting - on a per-bucket basis.

Here are some examples:

```console
# Let's say, there exists a bucket called s3://abc:
$ ais ls s3://abc
NAME             SIZE
README.md        8.96KiB

# First, we override the empty endpoint property in the bucket's configuration.
# To see that a non-empty value *applies* and works, we will use the default AWS S3 endpoint: https://s3.amazonaws.com
$ ais bucket props set s3://abc extra.aws.endpoint=s3.amazonaws.com
Bucket "aws://abc": property "extra.aws.endpoint=s3.amazonaws.com", nothing to do
$ ais ls s3://abc
NAME             SIZE
README.md        8.96KiB

# Second, set the endpoint=foo (or, it could be any other invalid value), and observe that the bucket becomes unreachable:
$ ais bucket props set s3://abc extra.aws.endpoint=foo
Bucket props successfully updated
"extra.aws.endpoint" set to: "foo" (was: "s3.amazonaws.com")
$ ais ls s3://abc
RequestError: send request failed: dial tcp: lookup abc.foo: no such host

# Finally, revert the endpoint back to empty, and check that the bucket is visible again:
$ ais bucket props set s3://abc extra.aws.endpoint=""
Bucket props successfully updated
"extra.aws.endpoint" set to: "" (was: "foo")
$ ais ls s3://abc
NAME             SIZE
README.md        8.96KiB
```

> Global `export S3_ENDPOINT=...` override is static and read-only. Use it with extreme caution as it applies to all buckets.

> On the other hand, for any given `s3://bucket` its S3 endpoint can be set, unset, and otherwise changed at any time - at runtime. As shown above.

#### Connect/Disconnect AIS bucket to/from cloud bucket

Set backend bucket for AIS bucket `bucket_name` to the GCP cloud bucket `cloud_bucket`.
Once the backend bucket is set, operations (get, put, list, etc.) on `ais://bucket_name` behave exactly as they would on `gcp://cloud_bucket`.
It's like a symlink to a cloud bucket.
The only difference is that all objects will be cached into `ais://bucket_name` (and reflected in the cloud as well) instead of `gcp://cloud_bucket`.

```console
$ ais bucket props set ais://bucket_name backend_bck=gcp://cloud_bucket
Bucket props successfully updated
"backend_bck.name" set to: "cloud_bucket" (was: "")
"backend_bck.provider" set to: "gcp" (was: "")
```

To disconnect the cloud bucket:

```console
$ ais bucket props set ais://bucket_name backend_bck=none
Bucket props successfully updated
"backend_bck.name" set to: "" (was: "cloud_bucket")
"backend_bck.provider" set to: "" (was: "gcp")
```

#### Ignore non-critical errors

To create an erasure-encoded bucket or enable EC for an existing bucket, AIS requires at least `ec.data_slices + ec.parity_slices + 1` targets.
At the same time, for small objects (size is less than `ec.objsize_limit`) it is sufficient to have only `ec.parity_slices + 1` targets.
Option `--force` allows creating erasure-encoded buckets when the number of targets is not enough but the number exceeds `ec.parity_slices`.

Note that if the number of targets is less than `ec.data_slices + ec.parity_slices + 1`, the cluster accepts only objects smaller than `ec.objsize_limit`.
Bigger objects are rejected on PUT.
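
The target-count rules above can be sketched in a few lines of Python. This is an illustration of the arithmetic as described in this section, not AIS code: the function names `min_targets` and `ec_allowed` are hypothetical, and the sketch assumes `--force` has no other effect than relaxing the threshold to "more than `ec.parity_slices` targets".

```python
def min_targets(data_slices, parity_slices):
    """Targets needed for full EC: data slices + parity slices + 1."""
    return data_slices + parity_slices + 1


def ec_allowed(targets, data_slices, parity_slices, force=False):
    """Return True if EC can be enabled under the rules described above."""
    if targets >= min_targets(data_slices, parity_slices):
        return True
    # With force, it is enough that the target count exceeds ec.parity_slices
    return force and targets > parity_slices


targets = 6  # cluster size used for this illustration
for data, parity in [(6, 4), (6, 8)]:
    print(
        f"{data} data, {parity} parity: need {min_targets(data, parity)}, "
        f"plain={ec_allowed(targets, data, parity)}, "
        f"forced={ec_allowed(targets, data, parity, force=True)}"
    )
# 6 data, 4 parity: need 11, plain=False, forced=True
# 6 data, 8 parity: need 15, plain=False, forced=False
```

With 6 targets, a 6+4 configuration is accepted only with `force=True` (6 exceeds 4 parity slices), while a 6+8 configuration is rejected either way.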

The examples below use a cluster with 6 targets:

```console
$ # Creating a bucket
$ ais create ais://bck --props "ec.enabled=true ec.data_slices=6 ec.parity_slices=4"
Create bucket "ais://bck" failed: EC config (6 data, 4 parity) slices requires at least 11 targets (have 6)
$
$ ais create ais://bck --props "ec.enabled=true ec.data_slices=6 ec.parity_slices=4" --force
"ais://bck" bucket created
$
$ # If the number of targets is less than or equal to ec.parity_slices even `--force` does not help
$
$ ais bucket props set ais://bck ec.enabled true ec.data_slices 6 ec.parity_slices 8
EC config (6 data, 8 parity)slices requires at least 15 targets (have 6). To show bucket properties, run "ais show bucket BUCKET -v".
$
$ ais bucket props set ais://bck ec.enabled true ec.data_slices 6 ec.parity_slices 8 --force
EC config (6 data, 8 parity)slices requires at least 15 targets (have 6). To show bucket properties, run "ais show bucket BUCKET -v".
$
$ # Use force to enable EC if the number of targets is sufficient to keep `ec.parity_slices+1` replicas
$
$ ais bucket props set ais://bck ec.enabled true ec.data_slices 6 ec.parity_slices 4
EC config (6 data, 4 parity)slices requires at least 11 targets (have 6). To show bucket properties, run "ais show bucket BUCKET -v".
$
$ ais bucket props set ais://bck ec.enabled true ec.data_slices 6 ec.parity_slices 4 --force
Bucket props successfully updated
"ec.enabled" set to: "true" (was: "false")
"ec.parity_slices" set to: "4" (was: "2")
```

Once erasure encoding is enabled for a bucket, the number of data and parity slices cannot be modified.
The minimum object size `ec.objsize_limit` can be changed on the fly.
To avoid accidental modification when EC for a bucket is enabled, the option `--force` must be used.

```console
$ ais bucket props set ais://bck ec.enabled true
Bucket props successfully updated
"ec.enabled" set to: "true" (was: "false")
$
$ ais bucket props set ais://bck ec.objsize_limit 320000
P[dBbfp8080]: once enabled, EC configuration can be only disabled but cannot change. To show bucket properties, run "ais show bucket BUCKET -v".
$
$ ais bucket props set ais://bck ec.objsize_limit 320000 --force
Bucket props successfully updated
"ec.objsize_limit" set to:"320000" (was:"262144")
```

#### Set bucket properties with JSON

Set **all** bucket properties for `bucket_name` bucket based on the provided JSON specification.

```bash
$ ais bucket props set ais://bucket_name '{
    "provider": "ais",
    "versioning": {
      "enabled": true,
      "validate_warm_get": false
    },
    "checksum": {
      "type": "xxhash",
      "validate_cold_get": true,
      "validate_warm_get": false,
      "validate_obj_move": false,
      "enable_read_range": false
    },
    "lru": {
      "dont_evict_time": "20m",
      "capacity_upd_time": "1m",
      "enabled": true
    },
    "mirror": {
      "copies": 2,
      "burst_buffer": 512,
      "enabled": false
    },
    "ec": {
        "objsize_limit": 256000,
        "data_slices": 2,
        "parity_slices": 2,
        "enabled": true
    },
    "access": "255"
}'
"access" set to: "GET,HEAD-OBJECT,PUT,APPEND,DELETE-OBJECT,MOVE-OBJECT,PROMOTE,UPDATE-OBJECT" (was: "GET,HEAD-OBJECT,PUT,APPEND,DELETE-OBJECT,MOVE-OBJECT,PROMOTE,UPDATE-OBJECT,HEAD-BUCKET,LIST-OBJECTS,PATCH,SET-BUCKET-ACL,LIST-BUCKETS,SHOW-CLUSTER,CREATE-BUCKET,DESTROY-BUCKET,MOVE-BUCKET,ADMIN")
"ec.enabled" set to: "true" (was: "false")
"ec.objsize_limit" set to: "256000" (was: "262144")
"lru.capacity_upd_time" set to: "1m" (was: "10m")
"lru.dont_evict_time" set to: "20m" (was: "1s")
"lru.enabled" set to: "true" (was: "false")
"mirror.enabled" set to: "false" (was: "true")

Bucket props successfully updated.
```

```console
$ ais show bucket ais://bucket_name --compact
PROPERTY         VALUE
access           GET,HEAD-OBJECT,PUT,APPEND,DELETE-OBJECT,MOVE-OBJECT,PROMOTE,UPDATE-OBJECT
checksum         Type: xxhash | Validate: ColdGET
created          2024-02-02T12:57:17-08:00
ec               2:2 (250KiB)
lru              lru.dont_evict_time=20m, lru.capacity_upd_time=1m
mirror           Disabled
present          yes
provider         ais
versioning       Enabled | Validate on WarmGET: no
```

If not all properties are mentioned in the JSON, the missing ones are set to zero values (empty / `false` / `nil`):

```bash
$ ais bucket props set ais://bucket-name '{
  "mirror": {
    "enabled": true,
    "copies": 2
  },
  "versioning": {
    "enabled": true,
    "validate_warm_get": true
  }
}'
"mirror.enabled" set to: "true" (was: "false")
"versioning.validate_warm_get" set to: "true" (was: "false")

Bucket props successfully updated.

$ ais show bucket ais://bucket-name --compact
PROPERTY         VALUE
access           GET,HEAD-OBJECT,PUT,APPEND,DELETE-OBJECT,MOVE-OBJECT,PROMOTE,UPDATE-OBJECT,HEAD-BUCKET,LIST-OBJECTS,PATCH,SET-BUCKET-ACL,LIST-BUCKETS,SHOW-CLUSTER,CREATE-BUCKET,DESTROY-BUCKET,MOVE-BUCKET,ADMIN
checksum         Type: xxhash | Validate: Nothing
created          2024-02-02T12:52:30-08:00
ec               Disabled
lru              lru.dont_evict_time=2h0m, lru.capacity_upd_time=10m
mirror           2 copies
present          yes
provider         ais
versioning       Enabled | Validate on WarmGET: yes
```

## Show and set AWS-specific properties

AIStore supports AWS-specific configuration on a per s3 bucket basis. Any bucket that is backed by an AWS S3 bucket (**) can be configured to use alternative:

* named AWS profiles (with alternative credentials and/or region)
* alternative s3 endpoints

For background and usage examples, please see [AWS-specific bucket configuration](aws_profile_endpoint.md).

> (**) Terminology-wise, "s3 bucket" is a shortcut phrase indicating a bucket in an AIS cluster that either (A) has the same name (e.g. `s3://abc`) or (B) a differently named AIS bucket that has `backend_bck` property that specifies the s3 bucket in question.

## Reset bucket properties to cluster defaults

`ais bucket props reset BUCKET`

Reset bucket properties to [cluster defaults](/docs/resourcesnfig.md).

### Examples

```console
$ ais bucket props reset bucket_name
Bucket props successfully reset
```

## Show bucket metadata

`ais show cluster bmd`

Show bucket metadata (BMD).

### Examples

```console
$ ais show cluster bmd
PROVIDER  NAMESPACE  NAME        BACKEND  COPIES  EC(D/P, minsize)  CREATED
ais                  test                 2                         25 Mar 21 18:28 PDT
ais                  validation                                     25 Mar 21 18:29 PDT
ais                  train                                          25 Mar 21 18:28 PDT

Version:        9
UUID:           jcUfFDyTN
```