---
layout: post
title: BATCH
permalink: /docs/batch
redirect_from:
 - /batch.md/
 - /docs/batch.md/
---

## Introduction

Extended actions (_xactions_) are batch operations, or jobs, that run asynchronously, report statistics (viewable at runtime and later), can be waited upon, and can be stopped.

Terminology-wise, the code mostly uses _xaction_, the name of the corresponding software abstraction. Elsewhere, it is simply a _job_. The two terms are interchangeable.

> In the source code, all supported *xactions* are enumerated [here](https://github.com/NVIDIA/aistore/blob/main/api/apc/actmsg.go).

For users, there's an API to start, stop, and wait for a job:

* [Go API: xaction](https://github.com/NVIDIA/aistore/blob/main/api/xaction.go)
* [Python API: job](https://github.com/NVIDIA/aistore/blob/main/python/aistore/sdk/job.py)
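As a rough mental model of that lifecycle, here is a toy, in-memory sketch in Python. `ToyJob` and all of its methods are hypothetical and are not part of the AIS Go or Python API; the sketch only illustrates the contract described above: a job runs asynchronously, reports statistics, can be waited upon, and can be stopped.

```python
import threading


class ToyJob:
    """Hypothetical stand-in for an xaction; not an AIS type."""

    def __init__(self, items):
        self._items = list(items)
        self._stop = threading.Event()   # set by stop() to request an early exit
        self._done = threading.Event()   # set when the worker thread has exited
        self._lock = threading.Lock()    # guards the processed counter
        self._processed = 0
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        """Begin processing in the background and return immediately."""
        self._thread.start()
        return self

    def _run(self):
        try:
            for _ in self._items:
                if self._stop.is_set():
                    break
                with self._lock:
                    self._processed += 1
        finally:
            self._done.set()

    def stats(self):
        """Statistics can be read while the job runs and after it finishes."""
        with self._lock:
            return {
                "processed": self._processed,
                "total": len(self._items),
                "finished": self._done.is_set(),
            }

    def wait(self, timeout=None):
        """Block until the job finishes; return False on timeout."""
        return self._done.wait(timeout)

    def stop(self):
        """Ask the job to stop before its next item."""
        self._stop.set()
```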

In the CLI, there's the `ais job` command and its subcommands (`<TAB-TAB>` completions):

```commandline
$ ais job
start   stop    wait    rm      show

$ ais job start
prefetch           download           lru                rebalance          resilver           ec-encode          copy-bck
blob-download      dsort              etl                cleanup            mirror             warm-up-metadata   move-bck
```

Not all supported jobs can be started via `ais start` or by the corresponding Go or Python API call. For example, the job to copy or (ETL) transform datasets has its own dedicated API (both Python and Go) and CLI.

> See e.g., `ais cp --help`

The complete and most up-to-date list of supported jobs can be found in this [table of job descriptors](https://github.com/NVIDIA/aistore/blob/main/xact/api.go#L111-L116).

Last but not least is time: job execution may take many seconds, and sometimes minutes or hours.

Examples include erasure coding or n-way mirroring a dataset, resharding and reshuffling a dataset, and more.

Global rebalance is triggered automatically by any membership change (nodes joining, leaving, power-cycling, etc.) and can be monitored via the `ais show rebalance` CLI.

Another example would be _primary election_. AIS proxies provide access points ("endpoints") for the frontend API. At any point in time there is a single **primary** proxy that also controls versioning and distribution of the current cluster map. When and if the primary fails, another proxy is elected by simple majority to take over the primary role.

This election is also a _job_ that cannot be started via `ais start` or the corresponding API. Like global rebalance, it is _event-driven_, and there is a separate dedicated API to run it administratively.

> Rebalance and a few other AIS jobs have their own CLI extensions. Generally, though, you can always monitor *xactions* via the `ais show job xaction` command, which also supports verbose mode and other options.

Some AIS subsystems also report subsystem-specific stats, for example:

* [dSort](/docs/dsort.md)
* [Downloader](/docs/downloader.md)
* [ETL](/docs/etl.md)

Related CLI documentation:

* [CLI: `ais show job`](/docs/cli/job.md)
* [CLI: multi-object operations](/docs/cli/object.md#operations-on-lists-and-ranges)
* [CLI: reading, writing, and listing archives](/docs/cli/object.md)
* [CLI: copying buckets](/docs/cli/bucket.md#copy-bucket)

## Table of Contents
- [Operations on multiple selected objects](#operations-on-multiple-selected-objects)
  - [List](#list)
  - [Range](#range)
  - [Examples](#examples)

## Operations on multiple selected objects

AIStore provides APIs to operate on *batches* of objects:

| API Message (apc.ActionMsg) | Description |
| --- | --- |
| `apc.ActCopyObjects`     | copy multiple objects |
| `apc.ActDeleteObjects`   | delete multiple objects |
| `apc.ActETLObjects`      | etl (transform) multiple objects |
| `apc.ActEvictObjects`    | evict multiple objects |
| `apc.ActPrefetchObjects` | prefetch multiple objects |
| `apc.ActArchive`         | archive multiple objects |

For CLI documentation and examples, please see [Operations on Lists and Ranges](cli/object.md#operations-on-lists-and-ranges).

There are two distinct ways to specify the objects: **list** them (i.e., their names) explicitly, or specify a **template**.

Supported template syntax comes in three alternative formats:

1. bash (or shell) brace expansion:
   * `prefix-{0..100}-suffix`
   * `prefix-{00001..00010..2}-gap-{001..100..2}-suffix`
2. at style:
   * `prefix-@100-suffix`
   * `prefix-@00001-gap-@100-suffix`
3. fmt style:
   * `prefix-%06d-suffix`

In all cases, prefix and/or suffix are optional.
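To make the first (brace expansion) format concrete, here is a minimal Python sketch of how such a template could be expanded into object names. It is not the AIS parser: `expand_template` is a hypothetical helper, it handles only `{start..end}` and `{start..end..step}` ranges, and it assumes that the zero-padding width is taken from the number of digits in the start bound. The at and fmt styles are not covered.

```python
import itertools
import re

# Matches {start..end} or {start..end..step}; digits only.
_RANGE = re.compile(r"\{(\d+)\.\.(\d+)(?:\.\.(\d+))?\}")


def expand_template(template):
    """Return the names a brace-range template stands for (hypothetical helper).

    A template without any range comes back as a single element; AIS treats
    such a template as a name prefix rather than an exact name.
    """
    literals = []  # text before, between, and after the ranges
    ranges = []    # one list of formatted numbers per range
    pos = 0
    for m in _RANGE.finditer(template):
        literals.append(template[pos:m.start()])
        start, end, step = m.group(1), m.group(2), m.group(3)
        width = len(start)  # assumption: padding follows the start bound
        numbers = range(int(start), int(end) + 1, int(step or 1))
        ranges.append([f"{n:0{width}d}" for n in numbers])
        pos = m.end()
    literals.append(template[pos:])

    names = []
    # The leftmost range varies slowest, as in shell brace expansion.
    for combo in itertools.product(*ranges):
        parts = [literals[0]]
        for value, literal in zip(combo, literals[1:]):
            parts.append(value)
            parts.append(literal)
        names.append("".join(parts))
    return names
```

For instance, `expand_template("obj-{07..10}")` yields `obj-07`, `obj-08`, `obj-09`, and `obj-10`, keeping the leading zeroes.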

#### List

List APIs take a JSON array of object names and initiate the operation on those objects.

| Parameter | Description |
| --- | --- |
| objnames | JSON array of object names |

#### Range

Range APIs take a template and initiate the operation on all objects that it matches.

| Parameter | Description |
| --- | --- |
| template | The object name template with optional range parts. If the range is omitted, the template is used as an object name prefix |

#### Examples

All the following examples assume that the action is `delete` and the bucket name is `bck`, so only the value part of the request is shown:

`"value": {"objnames": ["obj1", "dir/obj2"]}` - deletes objects `obj1` and `dir/obj2` from the bucket `bck`

`"value": {"template": "obj-{07..10}"}` - removes the following objects from `bck` (note the leading zeroes in object names):

- obj-07
- obj-08
- obj-09
- obj-10

`"value": {"template": "dir-{0..1}/obj-{07..08}"}` - a template can contain more than one range; this example removes the following objects from `bck` (note the leading zeroes in object names):

- dir-0/obj-07
- dir-0/obj-08
- dir-1/obj-07
- dir-1/obj-08

`"value": {"template": "dir-10/"}` - the template defines no ranges, so the request deletes all objects whose names start with `dir-10/`
   134  
   135  `"value": {"template": "dir-10/"}` - the template defines no ranges, so the request deletes all objects which names start with `dir-10/`