# MinIO Batch Job
MinIO Batch Jobs is a MinIO object management feature that lets you manage objects at scale. Jobs currently supported by MinIO:

- Replicate objects between buckets on multiple sites

Upcoming Jobs

- Copy objects from NAS to MinIO
- Copy objects from HDFS to MinIO

## Replication Job
To perform replication via batch jobs, you create a job. The job consists of a job description YAML that describes

- Source location from which the objects must be copied
- Target location to which the objects must be copied
- Fine-grained filtering to pick the relevant objects to copy from the source

The MinIO batch jobs framework also provides

- Automatic retry of a failed job, driven by user input
- Real-time monitoring of job progress
- Notifications sent to a user-configured target upon completion or failure

The following YAML describes the structure of a replication job; each value is documented and self-describing.

```yaml
replicate:
  apiVersion: v1
  # source of the objects to be replicated
  source:
    type: TYPE # valid values are "minio"
    bucket: BUCKET
    prefix: PREFIX
    # NOTE: if source is remote then target must be "local"
    # endpoint: ENDPOINT
    # credentials:
    #   accessKey: ACCESS-KEY
    #   secretKey: SECRET-KEY
    #   sessionToken: SESSION-TOKEN # Available when rotating credentials are used

  # target where the objects must be replicated
  target:
    type: TYPE # valid values are "minio"
    bucket: BUCKET
    prefix: PREFIX
    # NOTE: if target is remote then source must be "local"
    # endpoint: ENDPOINT
    # credentials:
    #   accessKey: ACCESS-KEY
    #   secretKey: SECRET-KEY
    #   sessionToken: SESSION-TOKEN # Available when rotating credentials are used

  # optional flags based filtering criteria
  # for all source objects
  flags:
    filter:
      newerThan: "7d" # match objects newer than this value (e.g. 7d10h31s)
      olderThan: "7d" # match objects older than this value (e.g. 7d10h31s)
      createdAfter: "date" # match objects created after "date"
      createdBefore: "date" # match objects created before "date"

      ## NOTE: tags are not supported when "source" is remote.
      # tags:
      #   - key: "name"
      #     value: "pick*" # match objects with tag 'name', with all values starting with 'pick'

      ## NOTE: metadata filter not supported when "source" is non MinIO.
      # metadata:
      #   - key: "content-type"
      #     value: "image/*" # match objects with 'content-type', with all values starting with 'image/'

    notify:
      endpoint: "https://notify.endpoint" # notification endpoint to receive job status events
      token: "Bearer xxxxx" # optional authentication token for the notification endpoint

    retry:
      attempts: 10 # number of retries for the job before giving up
      delay: "500ms" # least amount of delay between each retry
```

You can create and run multiple 'replication' jobs at a time; there are no predefined limits.
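
As an illustration of the template, a filled-in job might look like the sketch below. It assumes the source bucket lives on the local deployment and the target is a remote MinIO deployment, and it assumes that leaving out `endpoint` and `credentials` is how a side is treated as local, as the commented-out fields in the template suggest. The bucket names, prefixes, endpoint URLs, and credentials are hypothetical placeholders.

```yaml
replicate:
  apiVersion: v1
  # local source: no endpoint or credentials given (assumption, see above)
  source:
    type: minio
    bucket: photos          # hypothetical bucket name
    prefix: 2024/           # hypothetical prefix
  # remote target: endpoint and credentials are required
  target:
    type: minio
    bucket: photos-replica  # hypothetical bucket name
    prefix: 2024/
    endpoint: "https://minio-site2.example.net:9000" # hypothetical endpoint
    credentials:
      accessKey: ACCESS-KEY # placeholder
      secretKey: SECRET-KEY # placeholder
  flags:
    filter:
      newerThan: "7d"       # only objects newer than seven days
    notify:
      endpoint: "https://hooks.example.net/batch" # hypothetical notification endpoint
    retry:
      attempts: 10
      delay: "500ms"
```

Only the fields that differ from the defaults need to be filled in; the filter, notify, and retry values here reuse the sample values from the template.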

## Batch Jobs Terminology

### Job
A job is the basic unit of work for MinIO Batch Job. A job is a self-describing YAML document. Once this YAML is submitted and evaluated, MinIO performs the requested actions on each object that matches the criteria described in the job YAML file.

### Type
Type describes the job type, such as replicating objects between MinIO sites. Each job performs a single type of operation across all objects that match the job description criteria.

## Batch Jobs via Commandline
[mc](http://github.com/minio/mc) provides the 'mc batch' command to create, start, and manage submitted jobs.

```
NAME:
  mc batch - manage batch jobs

USAGE:
  mc batch COMMAND [COMMAND FLAGS | -h] [ARGUMENTS...]

COMMANDS:
  generate  generate a new batch job definition
  start     start a new batch job
  list, ls  list all current batch jobs
  status    summarize job events on MinIO server in real-time
  describe  describe job definition for a job
```

### Generate a job yaml
```
mc batch generate alias/ replicate
```

### Start the batch job (returns the JID)
```
mc batch start alias/ ./replicate.yaml
Successfully start 'replicate' job `E24HH4nNMcgY5taynaPfxu` on '2022-09-26 17:19:06.296974771 -0700 PDT'
```

### List all batch jobs
```
mc batch list alias/
ID                      TYPE            USER            STARTED
E24HH4nNMcgY5taynaPfxu  replicate       minioadmin      1 minute ago
```

### List all 'replicate' batch jobs
```
mc batch list alias/ --type replicate
ID                      TYPE            USER            STARTED
E24HH4nNMcgY5taynaPfxu  replicate       minioadmin      1 minute ago
```

### Real-time 'status' for a batch job
```
mc batch status myminio/ E24HH4nNMcgY5taynaPfxu
●∙∙
Objects:        28766
Versions:       28766
Throughput:     3.0 MiB/s
Transferred:    406 MiB
Elapsed:        2m14.227222868s
CurrObjName:    share/doc/xml-core/examples/foo.xmlcatalogs
```

### 'describe' the batch job yaml
```
mc batch describe myminio/ E24HH4nNMcgY5taynaPfxu
replicate:
  apiVersion: v1
...
```