github.com/anth0d/nomad@v0.0.0-20221214183521-ae3a0a2cad06/website/content/tools/autoscaling/policy.mdx

---
layout: docs
page_title: Scaling Policies
description: >
  Scaling policies describe the desired state of the target resource and how to
  perform calculations to ensure the current state reaches the desired state.
---

# Nomad Autoscaler Scaling Policies

Nomad Autoscaler scaling policies can be configured via the [`scaling` stanza][jobspec_scaling_stanza]
or by configuration files stored on disk. The available options differ depending
on whether you are performing horizontal application or cluster scaling, or
Dynamic Application Sizing.

## Top Level Options

- `enabled` - A boolean flag that allows operators to administratively disable a
  policy from active evaluation.

- `min` - The minimum running count of the targeted resource. This can be 0 or any
  positive integer.

- `max` - The maximum running count of the targeted resource. This can be 0 or any
  positive integer.

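As a minimal sketch of how the top-level options fit together, the block below
keeps a policy defined but administratively disabled. The `min` and `max` values
are illustrative assumptions, and the `policy` block is omitted for brevity.

```hcl
scaling {
  # Keep the policy defined but exclude it from active evaluation.
  enabled = false

  # Illustrative bounds (assumed values): the running count of the
  # targeted resource stays between 1 and 5.
  min = 1
  max = 5
}
```
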
## Task Group and Cluster Scaling `policy` Options

The following options are available when using the Nomad Autoscaler to perform
horizontal application scaling or horizontal cluster scaling.

- `cooldown` - A time interval after a scaling action during which no additional
  scaling will be performed on the resource. It should be provided as a duration
  (for example `"5s"` or `"1m"`). If omitted, the agent's
  [policy_default_cooldown][policy_default_cooldown_agent] configuration value
  is used.

- `evaluation_interval` - Defines how often the policy is evaluated by the
  Autoscaler. It should be provided as a duration (for example `"5s"` or `"1m"`).
  If omitted, the agent's [default_evaluation_interval][eval_interval_agent]
  configuration value is used.

- `on_check_error` - Defines how to handle errors during check evaluation.
  Possible values are `"fail"` or `"ignore"`. If set to `"fail"`, the policy
  evaluation stops if any [`check`](#check) returns an error, and no scaling
  action takes place. If set to `"ignore"`, any errors returned by a `check` are
  ignored when computing the scaling action. This value may be overridden for an
  individual check by setting [`on_error`](#on_error). Defaults to `"ignore"`.

- `target` - Defines where the autoscaling target is running. Detailed information
  on the configuration options can be found on the [Target Plugins][target_plugin_docs]
  page.

- `check` - Specifies one or more checks to be executed when determining if a
  scaling action is required.

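The policy-level options above can be sketched as follows. The durations are
illustrative assumptions rather than recommended values, and the check is the
same one used in the job example on this page.

```hcl
policy {
  # Wait two minutes after a scaling action before scaling again
  # (assumed value).
  cooldown = "2m"

  # Evaluate the policy every 30 seconds (assumed value).
  evaluation_interval = "30s"

  # Stop the evaluation, and take no scaling action, if any check
  # returns an error.
  on_check_error = "fail"

  check "active_connections" {
    source = "prometheus"
    query  = "scalar(open_connections_example_cache)"

    strategy "target-value" {
      target = 10
    }
  }
}
```
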
## `check` Options

- `source` - The APM plugin that should handle the metric query. If omitted,
  this defaults to using the Nomad APM.

- `query` - The query to run against the specified APM. Currently this query
  should return a single value. Detailed information on the configuration options
  can be found on the [APM Plugins][apm_plugin_docs] page.

- `query_window` - Defines how far back to query the APM for metrics. It should
  be provided as a duration (for example `"5s"` or `"1m"`). Defaults to `1m`.

- `group` - Specifies which checks should be treated as correlated when the
  policy is evaluated. Refer to [Check Grouping][concepts_grouping] for more
  information.

- `on_error` - Defines how to handle errors during the `check` evaluation.
  Possible values are `"fail"` or `"ignore"`. If set to `"fail"`, the policy
  evaluation stops if an error occurs, and no scaling action takes place. If set
  to `"ignore"`, the result of this `check` is not taken into consideration when
  computing the scaling action. If not set, the value of
  [`on_check_error`](#on_check_error) is used.

- `strategy` - The strategy to use, and its configuration, when calculating the
  desired state based on the current count and the metric returned by the APM.
  Detailed information on the configuration options can be found on the
  [Strategy Plugins][strategy_plugin_docs] page. Strategies for
  [Dynamic Application Sizing][das] are not allowed in this case.

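The sketch below shows one way the `check` options might be combined. The group
name `resources` and the `query_window` value are hypothetical, the `"..."`
queries are placeholders to be replaced with real APM queries, and the check
names are borrowed from the examples on this page.

```hcl
policy {
  # Policy-wide default: an error in any check stops the evaluation.
  on_check_error = "fail"

  check "cpu_allocated_percentage" {
    source       = "prometheus"
    query        = "..."       # placeholder query
    query_window = "5m"        # assumed value; defaults to 1m if omitted
    group        = "resources" # hypothetical group name

    strategy "target-value" {
      target = 70
    }
  }

  check "mem_allocated_percentage" {
    source = "prometheus"
    query  = "..."             # placeholder query
    group  = "resources"       # same group, so treated as correlated

    strategy "target-value" {
      target = 70
    }
  }

  check "active_connections" {
    source = "prometheus"
    query  = "scalar(open_connections_example_cache)"

    # Overrides on_check_error for this check only: if it errors, its
    # result is left out of the scaling calculation.
    on_error = "ignore"

    strategy "target-value" {
      target = 10
    }
  }
}
```
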
### Example in a Job

A full example of a policy document that can be written into the Nomad task group
`scaling` stanza can be seen below.

```hcl
job "example" {
  group "app" {
    scaling {
      min     = 2
      max     = 10
      enabled = true

      policy {
        evaluation_interval = "5s"
        cooldown            = "1m"

        check "active_connections" {
          source = "prometheus"
          query  = "scalar(open_connections_example_cache)"

          strategy "target-value" {
            target = 10
          }
        }
      }
    }
  }
}
```

### Example in a File

An example of a policy document that can be placed in a file within the
`policy_dir` can be seen below. Multiple policies can be defined in the same
file using multiple `scaling` blocks.

```hcl
scaling "aws_cluster_policy" {
  enabled = true
  min     = 1
  max     = 2

  policy {
    cooldown            = "2m"
    evaluation_interval = "1m"

    check "cpu_allocated_percentage" {
      source = "prometheus"
      query  = "..."

      strategy "target-value" {
        target = 70
      }
    }

    check "mem_allocated_percentage" {
      source = "prometheus"
      query  = "..."

      strategy "target-value" {
        target = 70
      }
    }

    target "aws-asg" {
      dry-run             = "false"
      aws_asg_name        = "hashistack-nomad_client"
      node_class          = "hashistack"
      node_drain_deadline = "5m"
    }
  }
}

scaling "azure_cluster_policy" {
  enabled = true
  min     = 1
  max     = 2

  policy {
    ...
    target "azure-vmss" {
      resource_group      = "hashistack"
      vm_scale_set        = "clients"
      node_class          = "hashistack"
      node_drain_deadline = "5m"
    }
  }
}
```

## Task (DAS) `policy` Options

<EnterpriseAlert>
  This functionality only exists in Nomad Autoscaler Enterprise. This is not
  present in the open source version of Nomad Autoscaler.
</EnterpriseAlert>

The following options are available when using Nomad Autoscaler Enterprise to
perform Dynamic Application Sizing recommendations for task resources. When
using the [`scaling` stanza][jobspec_scaling_stanza] for Dynamic Application
Sizing, the stanza requires a label to identify which resource it relates to. It
currently supports the `cpu` and `mem` labels, examples of which can be seen below.

- `cooldown` - A time interval after a scaling action during which no additional
  scaling will be performed on the resource. It should be provided as a duration
  (for example `"5s"` or `"1m"`). If omitted, the agent's
  [policy_default_cooldown][policy_default_cooldown_agent] configuration value
  is used.

- `evaluation_interval` - Defines how often the policy is evaluated by the
  Autoscaler. It should be provided as a duration (for example `"5s"` or `"1m"`).
  If omitted, the agent's [default_evaluation_interval][eval_interval_agent]
  configuration value is used.

- `target` - Defines where the autoscaling target is running. Detailed information
  on the configuration options can be found on the [Target Plugins][target_plugin_docs]
  page.

- `check` - Specifies one check to be executed when determining if a recommendation
  is required. Only one check is permitted per scaling block within Dynamic
  Application Sizing.

## `check` Options

- `strategy` - The strategy to use, and its configuration, when calculating the
  desired state based on the current value and the available historic data. Detailed
  information on the configuration options can be found on the
  [Strategy Plugins][strategy_plugin_docs] page. Only [Dynamic Application Sizing][das]
  strategies are allowed.

### Example

The following examples are minimal blocks which can be used to configure CPU and
Memory based sizing recommendations for a Nomad job task.

```hcl
scaling "cpu" {
  policy {
    check "96pct" {
      strategy "app-sizing-percentile" {
        percentile = "96"
      }
    }
  }
}

scaling "mem" {
  policy {
    check "max" {
      strategy "app-sizing-max" {}
    }
  }
}
```

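The optional `cooldown` and `evaluation_interval` settings described above could
be added to a Dynamic Application Sizing policy along these lines. This is a
sketch only: the durations are assumed values chosen for illustration, and
omitting them falls back to the agent defaults.

```hcl
scaling "cpu" {
  policy {
    # Assumed values for illustration; omit both to use the agent defaults.
    cooldown            = "5m"
    evaluation_interval = "1m"

    # Only one check is permitted per scaling block for Dynamic
    # Application Sizing.
    check "96pct" {
      strategy "app-sizing-percentile" {
        percentile = "96"
      }
    }
  }
}
```
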
[concepts_grouping]: /tools/autoscaling/concepts/policy-eval/checks#check-grouping
[das]: /tools/autoscaling#dynamic-application-sizing
[policy_default_cooldown_agent]: /tools/autoscaling/agent#default_cooldown
[eval_interval_agent]: /tools/autoscaling/agent#default_evaluation_interval
[target_plugin_docs]: /tools/autoscaling/plugins/target
[strategy_plugin_docs]: /tools/autoscaling/plugins/strategy
[apm_plugin_docs]: /tools/autoscaling/plugins/apm
[jobspec_scaling_stanza]: /docs/job-specification/scaling