---
layout: docs
page_title: multiregion Stanza - Job Specification
description: |-
  The "multiregion" stanza specifies that a job will be deployed to multiple federated
  regions.
---

# `multiregion` Stanza

<Placement groups={[['job', 'multiregion']]} />

<EnterpriseAlert />

The `multiregion` stanza specifies that a job will be deployed to multiple
[federated regions]. If omitted, the job will be deployed to a single region:
the one specified by the `region` field or the `-region` command line flag to
`nomad job run`.

Federated Nomad clusters are members of the same gossip cluster but not the
same raft cluster; they don't share their data stores. Each region in a
multiregion deployment gets an independent copy of the job, parameterized with
the values of the `region` stanza. Nomad regions coordinate to roll out each
region's deployment using rules determined by the `strategy` stanza.

```hcl
job "docs" {
  multiregion {

    strategy {
      max_parallel = 1
      on_failure   = "fail_all"
    }

    region "west" {
      count       = 2
      datacenters = ["west-1"]
      meta {
        my-key = "my-value-west"
      }
    }

    region "east" {
      count       = 5
      datacenters = ["east-1", "east-2"]
      meta {
        my-key = "my-value-east"
      }
    }
  }
}
```

## Multiregion Deployment States

A single-region deployment using one of the various [upgrade strategies]
begins in the `running` state and ends in the `successful` state, the
`canceled` state (if another deployment supersedes it before it's
complete), or the `failed` state. A failed single-region deployment may
automatically revert to the previous version of the job if its `update`
stanza has the [`auto_revert`][update-auto-revert] setting.

In a multiregion deployment, regions begin in the `pending` state. This allows
Nomad to determine that all regions have accepted the job before
continuing. At this point up to `max_parallel` regions will enter `running` at
a time. When each region completes its local deployment, it enters a `blocked`
state where it waits until the last region has completed the deployment. The
final region will unblock the other regions to mark them as `successful`.

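
The state transitions described above can be traced on a small configuration.
The following is a minimal sketch, assuming three hypothetical regions and
`max_parallel = 1`; the comments annotate the expected path of each region
through a deployment in which no region fails.

```hcl
multiregion {

  strategy {
    max_parallel = 1
  }

  # All three regions start in `pending` until every region has accepted the job.
  region "north" {} # 1st: pending -> running -> blocked
  region "south" {} # 2nd: pending -> running -> blocked, once "north" is blocked
  region "east" {}  # 3rd: pending -> running; on completion all regions become successful
}
```

With a larger `max_parallel`, more than one region would be in `running` at
the same time, but each region would still wait in `blocked` for the last one.
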
## Parameterized Dispatch

Job dispatching is region-specific. While a [parameterized job] can be
registered in multiple [federated regions] like any other job, a parameterized
job operates much like a function definition that takes variable input.
Operators are expected to invoke the job in each region separately, using
[`job dispatch`] from the CLI or the [HTTP API], and to provide the
appropriate dispatch options for that region.

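
As an illustration, a parameterized job registered in two regions might look
like the sketch below. The region names and the `input` meta key are
hypothetical; only the `parameterized` and `multiregion` stanzas themselves
come from the job specification.

```hcl
job "docs" {
  # Registered in both regions, but every dispatch targets a single region.
  multiregion {
    region "west" {}
    region "east" {}
  }

  parameterized {
    payload       = "optional"
    meta_required = ["input"] # hypothetical key, supplied at dispatch time
  }
}
```

Registering this job does not run anything. Each invocation would be
dispatched against one region (for example by selecting it with the general
`-region` flag), with the meta values appropriate for that region.
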
## Periodic Time Zones

Multiregion periodic jobs share [time zone] configuration, with UTC being the
default. Operators should be mindful of this when registering multiregion jobs.
For example, a periodic configuration that specifies the job should run every
night at midnight New York time may result in an undesirable execution time
if one of the target regions is set to Tokyo time.

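
The New York and Tokyo scenario can be sketched as follows. The region names
are hypothetical, and the `periodic` stanza is assumed to use the `cron` and
`time_zone` fields from the periodic job specification.

```hcl
job "docs" {
  type = "batch"

  multiregion {
    region "new-york" {}
    region "tokyo" {}
  }

  periodic {
    # Midnight in New York. Both regions share this setting, so the "tokyo"
    # copy also launches at this instant, which is early afternoon local time.
    cron      = "0 0 * * *"
    time_zone = "America/New_York"
  }
}
```
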
## `multiregion` Parameters

- `strategy` <code>([Strategy](#strategy-parameters): nil)</code> - Specifies
  a rollout strategy for the regions.

- `region` <code>([Region](#region-parameters): nil)</code> - Specifies the
  parameters for a specific region. This can be specified multiple times to
  define the set of regions for the multiregion deployment. Regions are
  ordered; depending on the rollout strategy Nomad may roll out to each region
  in order or to several at a time.

~> **Note:** Regions can be added, but regions that are removed will not be
stopped and will be ignored by the deployment. This behavior may change before
multiregion deployments are considered GA.

### `strategy` Parameters

- `max_parallel` `(int: <optional>)` - Specifies the maximum number
  of region deployments that a multiregion deployment will have in a `running`
  state at a time. By default, Nomad will deploy all regions simultaneously.

- `on_failure` `(string: <optional>)` - Specifies the behavior when a region
  deployment fails. Available options are `"fail_all"`, `"fail_local"`, or
  the default (empty `""`). This field and its interactions with the job's
  [`update` stanza] are described in the [examples] below.

  Each region within a multiregion deployment follows the `auto_revert`
  strategy of its own `update` stanza (if any). The multiregion `on_failure`
  field tells Nomad how many other regions should be marked as failed when one
  region's deployment fails:

  - The default behavior is that the failed region and all regions that come
    after it in order are marked as failed.

  - If `on_failure = "fail_all"` is set, all regions will be marked as
    failed. If all regions have already completed their deployments, it's
    possible that a region may transition from `blocked` to `successful` while
    another region is failing. This successful region cannot be rolled back.

  - If `on_failure = "fail_local"` is set, only the failed region will be marked
    as failed. The remaining regions will move on to `blocked` status. At this
    point, you'll need to manually unblock regions to mark them successful
    with the [`nomad deployment unblock`] command or correct the conditions
    that led to the failure and resubmit the job.

~> For `system` jobs, only [`max_parallel`](#max_parallel) is enforced. The
`system` scheduler will be updated to support `on_failure` when the
[`update` stanza] is fully supported for system jobs in a future release.

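
A minimal sketch of the `"fail_local"` option, assuming three hypothetical
regions deployed one at a time:

```hcl
multiregion {

  strategy {
    max_parallel = 1
    on_failure   = "fail_local"
  }

  region "north" {}
  region "south" {} # if this deployment fails, only "south" is marked failed
  region "east" {}
}
```

In this sketch a failure in "south" would leave "north" and "east" to reach
`blocked`, where they wait for `nomad deployment unblock` or a resubmitted job.
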
### `region` Parameters

The name of a region must match the name of one of the [federated regions].

- `count` `(int: <optional>)` - Specifies a count override for task groups in
  the region. If a task group specifies `count = 0`, its count will be
  replaced with this value. If a task group specifies its own non-zero `count`
  or omits the `count` field, this value will be ignored. This value must be
  non-negative.

- `datacenters` `(array<string>: <optional>)` - A list of
  datacenters in the region which are eligible for task placement. If not
  provided, the `datacenters` field of the job will be used.

- `meta` `(Meta: nil)` - The meta stanza allows for user-defined arbitrary
  key-value pairs. The meta specified for each region will be merged with the
  meta stanza at the job level.

As described above, the parameters set for a region replace the default values
of the fields with the same name in that region's copy of the job.

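
The `datacenters` fallback can be sketched like this. All datacenter and
region names here are hypothetical.

```hcl
job "docs" {
  datacenters = ["dc1"] # job-level default

  multiregion {
    region "west" {
      datacenters = ["west-1", "west-2"] # replaces the job-level list in "west"
    }

    region "east" {} # no override, so the job-level `datacenters` is used
  }
}
```
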
## `multiregion` Examples

The following examples only show the `multiregion` stanza and the other
stanzas it might be interacting with.

### Max Parallel

This example shows the use of `max_parallel`. This job will deploy first to
the "north" and "south" regions. When either "north" or "south" finishes and
enters the `blocked` state, "east" will be next. At most 2 regions will be in
a `running` state at any given time.

```hcl
multiregion {

  strategy {
    max_parallel = 2
  }

  region "north" {}
  region "south" {}
  region "east" {}
  region "west" {}
}
```

### Rollback Regions

This example shows the default value of `on_failure`. Because `max_parallel = 1`,
the "north" region will deploy first, followed by "south", and so on. If the
"east" region failed, both the "east" region and the "west" region would be
marked `failed`. Because the job has an `update` stanza with
`auto_revert = true`, both regions would then roll back to the previous job
version. The "north" and "south" regions would remain `blocked` until an
operator intervenes.

```hcl
multiregion {

  strategy {
    on_failure   = ""
    max_parallel = 1
  }

  region "north" {}
  region "south" {}
  region "east" {}
  region "west" {}
}

update {
  auto_revert = true
}
```

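
For comparison, the same job with `on_failure = "fail_all"` is sketched below.
This variant is an illustration built from the parameter description above,
not an additional documented example.

```hcl
multiregion {

  strategy {
    on_failure   = "fail_all"
    max_parallel = 1
  }

  region "north" {}
  region "south" {}
  region "east" {} # a failure here marks every region as failed
  region "west" {}
}

update {
  auto_revert = true
}
```

Because each region follows the `auto_revert` setting of its own `update`
stanza, "north" and "south" would be expected to roll back as well in this
case, rather than remain `blocked`.
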
### Override Counts

This example shows how the `count` field overrides the default `count` of the
task group. The job deploys 2 "worker" and 1 "controller" allocations to
the "west" region, and 5 "worker" and 1 "controller" allocations to the "east"
region.

```hcl
multiregion {

  region "west" {
    count = 2
  }

  region "east" {
    count = 5
  }
}

group "worker" {
  count = 0
}

group "controller" {
  count = 1
}
```

### Merging Meta

This example shows how the region `meta` is merged with the `meta` field of
the job, group, and task. A task in "west" will have the values
`first-key="regional-west"`, `second-key="group-level"`, whereas a task in
"east" will have the values `first-key="job-level"`,
`second-key="group-level"`.

```hcl
multiregion {

  region "west" {
    meta {
      first-key  = "regional-west"
      second-key = "regional-west"
    }
  }

  region "east" {
    meta {
      second-key = "regional-east"
    }
  }
}

meta {
  first-key = "job-level"
}

group "worker" {
  meta {
    second-key = "group-level"
  }
}
```

[federated regions]: https://learn.hashicorp.com/tutorials/nomad/federation
[`update` stanza]: /docs/job-specification/update
[update-auto-revert]: /docs/job-specification/update#auto_revert
[examples]: #multiregion-examples
[upgrade strategies]: https://learn.hashicorp.com/collections/nomad/job-updates
[`nomad deployment unblock`]: /docs/commands/deployment/unblock
[parameterized job]: /docs/job-specification/parameterized
[`job dispatch`]: /docs/commands/job/dispatch
[HTTP API]: /api-docs/jobs#dispatch-job
[time zone]: /docs/job-specification/periodic#time_zone