# MinIO Batch Job
MinIO Batch Jobs is a MinIO object management feature that lets you manage objects at scale. Jobs currently supported by MinIO:

- Replicate objects between buckets on multiple sites

Upcoming Jobs

- Copy objects from NAS to MinIO
- Copy objects from HDFS to MinIO

## Replication Job
To perform replication via batch jobs, you create a job. The job consists of a job description YAML that describes:

- The source location from which the objects must be copied
- The target location to which the objects must be copied
- Fine-grained filters that select which source objects to copy

The MinIO batch jobs framework also provides:

- Automatic retries of a failed job, driven by user input
- Real-time monitoring of job progress
- Notifications upon completion or failure, sent to a user-configured target

The following YAML describes the structure of a replication job. Each value is documented and self-describing.

```yaml
replicate:
  apiVersion: v1
  # source of the objects to be replicated
  source:
    type: TYPE # valid values are "minio"
    bucket: BUCKET
    prefix: PREFIX
    # NOTE: if source is remote then target must be "local"
    # endpoint: ENDPOINT
    # credentials:
    #   accessKey: ACCESS-KEY
    #   secretKey: SECRET-KEY
    #   sessionToken: SESSION-TOKEN # Available when rotating credentials are used

  # target where the objects must be replicated
  target:
    type: TYPE # valid values are "minio"
    bucket: BUCKET
    prefix: PREFIX
    # NOTE: if target is remote then source must be "local"
    # endpoint: ENDPOINT
    # credentials:
    #   accessKey: ACCESS-KEY
    #   secretKey: SECRET-KEY
    #   sessionToken: SESSION-TOKEN # Available when rotating credentials are used

  # optional flag-based filtering criteria
  # for all source objects
  flags:
    filter:
      newerThan: "7d" # match objects newer than this value (e.g. 7d10h31s)
      olderThan: "7d" # match objects older than this value (e.g. 7d10h31s)
      createdAfter: "date" # match objects created after "date"
      createdBefore: "date" # match objects created before "date"

      ## NOTE: tags are not supported when "source" is remote.
      # tags:
      #   - key: "name"
      #     value: "pick*" # match objects with tag 'name', with all values starting with 'pick'

      ## NOTE: metadata filter is not supported when "source" is non-MinIO.
      # metadata:
      #   - key: "content-type"
      #     value: "image/*" # match objects with 'content-type', with all values starting with 'image/'

    notify:
      endpoint: "https://notify.endpoint" # notification endpoint to receive job status events
      token: "Bearer xxxxx" # optional authentication token for the notification endpoint

    retry:
      attempts: 10 # number of retries for the job before giving up
      delay: "500ms" # least amount of delay between each retry
```
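
As a hedged illustration of the template above, the following sketch fills it in for a push from a local bucket to a remote MinIO site. The bucket names, prefix, endpoint, and credentials are hypothetical placeholders. Omitting `endpoint` and `credentials` on the source is assumed to be how a local side is expressed, since the template leaves those keys commented out.

```yaml
replicate:
  apiVersion: v1
  # local source: endpoint and credentials are omitted (assumed to mean "local")
  source:
    type: minio
    bucket: photos          # hypothetical bucket name
    prefix: 2024/           # hypothetical prefix
  # remote target: endpoint and credentials identify the other site
  target:
    type: minio
    bucket: photos-replica  # hypothetical bucket name
    prefix: 2024/           # hypothetical prefix
    endpoint: "https://minio-site2.example.net:9000" # hypothetical endpoint
    credentials:
      accessKey: ACCESS-KEY # placeholder, replace with real credentials
      secretKey: SECRET-KEY # placeholder, replace with real credentials
  flags:
    filter:
      newerThan: "7d"       # only objects newer than seven days
    retry:
      attempts: 10          # same values as the template
      delay: "500ms"
```

Because the target is remote here, the source stays local, as the NOTE comments in the template require.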

You can create and run multiple 'replication' jobs at a time; there are no predefined limits.
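
Since each job is an independent definition, one deployment might run several side by side. The sketch below is one such hypothetical definition: it pulls objects from a remote source into a local bucket and reports status to a notification endpoint. All bucket names, prefixes, endpoints, and credentials are assumptions made for illustration. Leaving out `retry` relies on the template's description of the flags as optional.

```yaml
replicate:
  apiVersion: v1
  # remote source: endpoint and credentials identify the other site
  source:
    type: minio
    bucket: logs            # hypothetical bucket name
    prefix: app/            # hypothetical prefix
    endpoint: "https://minio-site1.example.net:9000" # hypothetical endpoint
    credentials:
      accessKey: ACCESS-KEY # placeholder, replace with real credentials
      secretKey: SECRET-KEY # placeholder, replace with real credentials
  # local target, required because the source is remote
  target:
    type: minio
    bucket: logs-archive    # hypothetical bucket name
    prefix: app/            # hypothetical prefix
  flags:
    filter:
      olderThan: "7d"       # only objects older than seven days
    notify:
      endpoint: "https://notify.example.net/batch" # hypothetical endpoint
      token: "Bearer xxxxx" # optional authentication token
```

Saved under a file name of your choice, a definition like this is submitted with `mc batch start` in the same way as any other job.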

## Batch Jobs Terminology

### Job
A job is the basic unit of work for MinIO Batch Job. A job is a self-describing YAML document. Once this YAML is submitted and evaluated, MinIO performs the requested actions on each object that matches the criteria described in the job YAML file.

### Type
Type describes the job type, such as replicating objects between MinIO sites. Each job performs a single type of operation across all objects that match the job description criteria.

## Batch Jobs via Commandline
[mc](https://github.com/minio/mc) provides the 'mc batch' command to create, start, and manage submitted jobs.

```
NAME:
  mc batch - manage batch jobs

USAGE:
  mc batch COMMAND [COMMAND FLAGS | -h] [ARGUMENTS...]

COMMANDS:
  generate  generate a new batch job definition
  start     start a new batch job
  list, ls  list all current batch jobs
  status    summarize job events on MinIO server in real-time
  describe  describe job definition for a job
```

### Generate a job YAML
```
mc batch generate alias/ replicate
```

### Start the batch job (returns the job ID)
```
mc batch start alias/ ./replicate.yaml
Successfully start 'replicate' job `E24HH4nNMcgY5taynaPfxu` on '2022-09-26 17:19:06.296974771 -0700 PDT'
```

### List all batch jobs
```
mc batch list alias/
ID                      TYPE            USER            STARTED
E24HH4nNMcgY5taynaPfxu  replicate       minioadmin      1 minute ago
```

### List all 'replicate' batch jobs
```
mc batch list alias/ --type replicate
ID                      TYPE            USER            STARTED
E24HH4nNMcgY5taynaPfxu  replicate       minioadmin      1 minute ago
```

### Real-time 'status' for a batch job
```
mc batch status myminio/ E24HH4nNMcgY5taynaPfxu
●∙∙
Objects:        28766
Versions:       28766
Throughput:     3.0 MiB/s
Transferred:    406 MiB
Elapsed:        2m14.227222868s
CurrObjName:    share/doc/xml-core/examples/foo.xmlcatalogs
```

### 'describe' the batch job YAML
```
mc batch describe myminio/ E24HH4nNMcgY5taynaPfxu
replicate:
  apiVersion: v1
...
```
   151  ...
   152  ```