---
layout: docs
page_title: restart Stanza - Job Specification
description: The "restart" stanza configures a group's behavior on task failure.
---

# `restart` Stanza

<Placement
  groups={[
    ['job', 'group', 'restart'],
    ['job', 'group', 'task', 'restart'],
  ]}
/>

The `restart` stanza configures a task's behavior on task failure. Restarts
happen on the client that is running the task.

```hcl
job "docs" {
  group "example" {
    restart {
      attempts = 3
      delay    = "30s"
    }
  }
}
```

If specified at the group level, the configuration is inherited by all
tasks in the group, including any [sidecar tasks][sidecar_task]. If
also present on the task, the policy is merged with the restart policy
from the encapsulating task group.

For example, assuming that the task group restart policy is:

```hcl
restart {
  interval = "30m"
  attempts = 2
  delay    = "15s"
  mode     = "fail"
}
```

and the individual task restart policy is:

```hcl
restart {
  attempts = 5
}
```

then the effective restart policy for the task will be:

```hcl
restart {
  interval = "30m"
  attempts = 5
  delay    = "15s"
  mode     = "fail"
}
```

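As a sketch of how the two blocks above might sit together in one job file, the
following places the group policy under `group` and the override under `task`.
The job, group, and task names, the driver, and the image are illustrative
assumptions and are not part of the original example.

```hcl
job "docs" {
  group "example" {
    # Group-level policy: inherited by every task in the group.
    restart {
      interval = "30m"
      attempts = 2
      delay    = "15s"
      mode     = "fail"
    }

    task "server" {
      driver = "docker" # assumed driver, for illustration only

      config {
        image = "example/server:1.0" # hypothetical image
      }

      # Task-level override: only `attempts` changes; `interval`,
      # `delay`, and `mode` are merged in from the group policy.
      restart {
        attempts = 5
      }
    }
  }
}
```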
Because sidecar tasks don't accept a `restart` block, it's recommended
that you set any custom `restart` for jobs with sidecar tasks at the task
level rather than the group level, so that the Connect sidecar inherits
the default `restart` policy.

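One way to read this recommendation is sketched below: the group has no
`restart` block, so the Connect sidecar keeps the default policy, while the
application task carries its own. The service name, port, driver, and image are
assumptions made for illustration.

```hcl
job "docs" {
  group "example" {
    network {
      mode = "bridge"
    }

    service {
      name = "example-api" # hypothetical service name
      port = "9001"        # hypothetical port

      connect {
        # Injects the Connect sidecar task into this group.
        sidecar_service {}
      }
    }

    # No group-level `restart` here, so the sidecar task falls back to
    # the default policy for the job type.

    task "api" {
      driver = "docker" # assumed driver

      config {
        image = "example/api:1.0" # hypothetical image
      }

      # Custom policy that applies to this task only.
      restart {
        attempts = 5
        delay    = "30s"
      }
    }
  }
}
```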
## `restart` Parameters

- `attempts` `(int: <varies>)` - Specifies the number of restarts allowed in the
  configured interval. Defaults vary by job type; see below for more
  information.

- `delay` `(string: "15s")` - Specifies the duration to wait before restarting a
  task. This is specified using a label suffix like "30s" or "1h". A random
  jitter of up to 25% is added to the delay.

- `interval` `(string: <varies>)` - Specifies the duration, beginning when the
  first task starts, within which at most `attempts` restarts may happen. If
  the task fails more than `attempts` times within the interval, behavior is
  controlled by `mode`. This is specified using a label suffix like "30s" or
  "1h". Defaults vary by job type; see below for more information.

- `mode` `(string: "fail")` - Controls the behavior when the task fails more
  than `attempts` times in an interval. For a detailed explanation of these
  values and their behavior, please see the [mode values section](#mode-values).

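To make the parameters concrete, the sketch below annotates each one with what
it means for the chosen values. The values themselves are illustrative, and the
wait range in the comment is simple arithmetic from the "up to 25%" jitter
described above.

```hcl
restart {
  # Allow at most 3 restarts within each 10-minute interval.
  attempts = 3
  interval = "10m"

  # Wait 20s before each restart. With up to 25% jitter added
  # (25% of 20s = 5s), the actual wait falls between 20s and 25s.
  delay = "20s"

  # Once the 3 attempts are used up, stop restarting the task.
  mode = "fail"
}
```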
### `restart` Parameter Defaults

The values for many of the `restart` parameters vary by job type. Here are the
defaults by job type:

- The default batch restart policy is:

  ```hcl
  restart {
    attempts = 3
    delay    = "15s"
    interval = "24h"
    mode     = "fail"
  }
  ```

- The default service and system job restart policy is:

  ```hcl
  restart {
    interval = "30m"
    attempts = 2
    delay    = "15s"
    mode     = "fail"
  }
  ```

### `mode` Values

This section details the specific values for the `mode` parameter of the
`restart` stanza. The mode is always specified as a string:

```hcl
restart {
  mode = "..."
}
```

- `"delay"` - Instructs the client to wait until the next `interval` begins
  before restarting the task.

- `"fail"` - Instructs the client not to attempt to restart the task
  once the number of `attempts` has been used. This is the default
  behavior. This mode is useful for non-idempotent jobs which are
  unlikely to succeed after a few failures. The allocation will be
  marked as failed and the scheduler will attempt to reschedule the
  allocation according to the [`reschedule`] stanza.

### `restart` Examples

With the following `restart` block, a failing task will restart 3
times with 15 seconds between attempts, and then wait until the next
10-minute interval begins before making another 3 attempts. The task
restart will never fail the entire allocation.

```hcl
restart {
  attempts = 3
  delay    = "15s"
  interval = "10m"
  mode     = "delay"
}
```

With the following `restart` block, a task that fails after 1
minute, after 2 minutes, and after 3 minutes will be restarted each
time. If it fails again before 10 minutes have elapsed, the entire
allocation will be marked as failed and the scheduler will follow the
group's [`reschedule`] specification, possibly resulting in a new
evaluation.

```hcl
restart {
  attempts = 3
  delay    = "15s"
  interval = "10m"
  mode     = "fail"
}
```

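Since `mode = "fail"` hands the failed allocation over to the scheduler, it may
help to see the two stanzas side by side. The following is a minimal sketch;
the [`reschedule`] values are illustrative assumptions and are not taken from
this page.

```hcl
group "example" {
  # Local restarts on the same client: up to 3 within 10 minutes.
  restart {
    attempts = 3
    delay    = "15s"
    interval = "10m"
    mode     = "fail"
  }

  # After the allocation is marked failed, the scheduler may place a
  # replacement allocation. The values below are illustrative only.
  reschedule {
    delay          = "30s"
    delay_function = "exponential"
    max_delay      = "1h"
    unlimited      = true
  }
}
```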
[sidecar_task]: /docs/job-specification/sidecar_task
[`reschedule`]: /docs/job-specification/reschedule