github.com/NVIDIA/aistore@v1.3.23-0.20240517131212-7df6609be51d/docs/batch.md

---
layout: post
title: BATCH
permalink: /docs/batch
redirect_from:
 - /batch.md/
 - /docs/batch.md/
---

## Introduction

Extended actions (_xactions_) are batch operations, or jobs, that run asynchronously, report statistics (viewable at runtime and later), can be waited upon, and can be stopped.

Terminology-wise, in the code we mostly call it _xaction_, after the corresponding software abstraction. Elsewhere, it is simply a _job_: the two terms are interchangeable.

> In the source code, all supported *xactions* are enumerated [here](https://github.com/NVIDIA/aistore/blob/main/api/apc/actmsg.go).

For users, there's an API to start, stop, and wait for a job:

* [Go API: xaction](https://github.com/NVIDIA/aistore/blob/main/api/xaction.go)
* [Python API: job](https://github.com/NVIDIA/aistore/blob/main/python/aistore/sdk/job.py)

In the CLI, there's the `ais job` command and its subcommands (`<TAB-TAB>` completions):

```commandline
$ ais job
start stop wait rm show

$ ais job start
prefetch download lru rebalance resilver ec-encode copy-bck
blob-download dsort etl cleanup mirror warm-up-metadata move-bck
```

Not all supported jobs can be started via `ais start` or by the corresponding Go or Python API call. For example, the job to copy or (ETL) transform datasets has its own dedicated API (both Python and Go) and CLI.

> See e.g., `ais cp --help`

The complete and most recently updated list of supported jobs can be found in this [table of job descriptors](https://github.com/NVIDIA/aistore/blob/main/xact/api.go#L111-L116).

Last but not least is time. Job execution may take many seconds, sometimes minutes or hours.

Examples include erasure coding or n-way mirroring a dataset, resharding and reshuffling a dataset, and more.

Global rebalance is triggered automatically by any membership change (nodes joining, leaving, power cycling, etc.) and can be further visualized via the `ais show rebalance` CLI.

Another example would be _primary election_. AIS proxies provide access points ("endpoints") for the frontend API. At any point in time there is a single **primary** proxy that also controls versioning and distribution of the current cluster map. When and if the primary fails, another proxy is majority-elected to take over the primary role.

This election by simple majority is also a _job_ that cannot be started via `ais start` or the corresponding API. Like global rebalance, it is _event-driven_, and, also like rebalance, there's a separate dedicated API to run it administratively.

> Rebalance and a few other AIS jobs have their own CLI extensions. Generally, though, you can always monitor *xactions* via the `ais show job xaction` command, which also supports verbose mode and other options.

AIS subsystems integrate subsystem-specific stats - e.g.:

* [dSort](/docs/dsort.md)
* [Downloader](/docs/downloader.md)
* [ETL](/docs/etl.md)

Related CLI documentation:

* [CLI: `ais show job`](/docs/cli/job.md)
* [CLI: multi-object operations](/docs/cli/object.md#operations-on-lists-and-ranges)
* [CLI: reading, writing, and listing archives](/docs/cli/object.md)
* [CLI: copying buckets](/docs/cli/bucket.md#copy-bucket)

## Table of Contents
- [Operations on multiple selected objects](#operations-on-multiple-selected-objects)
  - [List](#list)
  - [Range](#range)
  - [Examples](#examples)

## Operations on multiple selected objects

AIStore provides APIs to operate on *batches* of objects:

| API Message (apc.ActionMsg) | Description |
| --- | --- |
| `apc.ActCopyObjects` | copy multiple objects |
| `apc.ActDeleteObjects` | delete multiple objects |
| `apc.ActETLObjects` | etl (transform) multiple objects |
| `apc.ActEvictObjects` | evict multiple objects |
| `apc.ActPrefetchObjects` | prefetch multiple objects |
| `apc.ActArchive` | archive multiple objects |

For CLI documentation and examples, please see [Operations on Lists and Ranges](cli/object.md#operations-on-lists-and-ranges).

There are two distinct ways to specify the objects: **list** them (i.e., the names) explicitly, or specify a **template**.

Supported template syntax comes in three alternative formats:

1. bash (or shell) brace expansion:
   * `prefix-{0..100}-suffix`
   * `prefix-{00001..00010..2}-gap-{001..100..2}-suffix`
2. at style:
   * `prefix-@100-suffix`
   * `prefix-@00001-gap-@100-suffix`
3. fmt style:
   * `prefix-%06d-suffix`

In all cases, prefix and/or suffix are optional.

#### List

List APIs take a JSON array of object names and initiate the operation on those objects.

| Parameter | Description |
| --- | --- |
| objnames | JSON array of object names |

#### Range

| Parameter | Description |
| --- | --- |
| template | The object name template with optional range parts. If a range is omitted, the template is used as an object name prefix |

#### Examples

All the following examples assume that the action is `delete` and the bucket name is `bck`, so only the value part of the request is shown:

`"value": {"objnames": ["obj1","dir/obj2"]}` - deletes objects `obj1` and `dir/obj2` from the bucket `bck`

`"value": {"template": "obj-{07..10}"}` - removes the following objects from `bck` (note leading zeroes in object names):

- obj-07
- obj-08
- obj-09
- obj-10

`"value": {"template": "dir-{0..1}/obj-{07..08}"}` - a template can contain more than one range; this example removes the following objects from `bck` (note leading zeroes in object names):

- dir-0/obj-07
- dir-0/obj-08
- dir-1/obj-07
- dir-1/obj-08

`"value": {"template": "dir-10/"}` - the template defines no ranges, so the request deletes all objects whose names start with `dir-10/`
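
To make the range examples above concrete, here is a minimal Python sketch of how a brace-expansion template maps to object names. It is an illustration only, not AIStore's implementation: the function name `expand_template` is hypothetical, only the bash-style `{start..end[..step]}` format is handled, and the zero-padding rule (pad to the width of `start` when it begins with `0`) is an assumption chosen to match the examples in this document.

```python
import itertools
import re

# Matches one bash-style range: {start..end} or {start..end..step}
_RANGE = re.compile(r"\{(\d+)\.\.(\d+)(?:\.\.(\d+))?\}")


def expand_template(template):
    """Expand every {start..end[..step]} range in `template` (illustrative only)."""
    parts = []  # literal fragments and range expansions, in template order
    pos = 0
    for m in _RANGE.finditer(template):
        parts.append([template[pos:m.start()]])  # literal text before the range
        start, end, step = m.group(1), m.group(2), m.group(3)
        # Assumption: keep leading zeroes by padding to the width of `start`.
        width = len(start) if start.startswith("0") else 0
        values = range(int(start), int(end) + 1, int(step) if step else 1)
        parts.append([str(v).zfill(width) for v in values])
        pos = m.end()
    parts.append([template[pos:]])  # trailing literal text (may be empty)
    # The Cartesian product combines multiple ranges; the leftmost varies slowest.
    return ["".join(combo) for combo in itertools.product(*parts)]


print(expand_template("obj-{07..10}"))
# ['obj-07', 'obj-08', 'obj-09', 'obj-10']
```

A template without any range, such as `dir-10/`, comes back from this sketch unchanged as a single name. AIStore itself treats such a template as an object name prefix, as described in the Range table.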