# Rule (aka Ruler)

***NOTE:** It is recommended to keep deploying rules inside the relevant Prometheus servers locally. Use Ruler only in specific cases. Read the details [below](#risk) to learn why.*

*The rule component should in particular not be used to circumvent solving rule deployment properly at the configuration management level.*

The `thanos rule` command evaluates Prometheus recording and alerting rules against a chosen query API via repeated `--query` (or file SD via `--query.sd-files`). If more than one query endpoint is passed, round-robin balancing is performed.
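
For example, a file passed via `--query.sd-files` could look like the following (a minimal sketch assuming the standard Prometheus file SD target-group format; hostnames and ports are placeholders):

```yaml
# queries.yml, passed as --query.sd-files="queries.yml"
- targets:
  - query.example.org:9090
  - query2.example.org:9090
```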

By default, rule evaluation results are written back to disk in the Prometheus 2.0 storage format. Rule nodes at the same time participate in the system as source store nodes, which means that they expose the StoreAPI and upload their generated TSDB blocks to an object store.

Rule also has a stateless mode, which sends rule evaluation results to remote storage via remote write for better scalability. In this mode, rule nodes work only as data producers and the remote receive nodes act as the source store nodes. It means that Thanos Rule in this mode does *not* expose the StoreAPI.

You can think of Rule as a simplified Prometheus that does not require a sidecar and does not scrape metrics or evaluate PromQL locally (no QueryAPI).

The data of each Rule node can be labeled to satisfy the cluster's labeling scheme. High-availability pairs can be run in parallel and should be distinguished by the designated replica label, just like regular Prometheus servers. Read more about Ruler in HA [here](#ruler-ha).

```bash
# --alert.query-url tells what query URL to link to in the UI.
thanos rule \
    --data-dir             "/path/to/data" \
    --eval-interval        "30s" \
    --rule-file            "/path/to/rules/*.rules.yaml" \
    --alert.query-url      "http://0.0.0.0:9090" \
    --alertmanagers.url    "http://alert.thanos.io" \
    --query                "query.example.org" \
    --query                "query2.example.org" \
    --objstore.config-file "bucket.yml" \
    --label                'monitor_cluster="cluster1"' \
    --label                'replica="A"'
```

## Risk

Ruler has conceptual tradeoffs that might not be favorable for most use cases. The main tradeoff is its dependence on query reliability. For Prometheus, alert and recording rule evaluation failures are unlikely because evaluation is local.

For Ruler, the read path is distributed: most likely Ruler queries a Thanos Querier, which in turn gets data from remote StoreAPIs.

This means that **query failures** are more likely to happen, which is why a clear strategy for what happens to alerts during query unavailability is key.

## Configuring Rules

Rule files use YAML. The syntax of a rule file is:

```
groups:
  [ - <rule_group> ]
```

A simple example rules file would be:

```
groups:
  - name: example
    rules:
    - record: job:http_inprogress_requests:sum
      expr: sum(http_inprogress_requests) by (job)
```

A `<rule_group>` has the following format:

```
# The name of the group. Must be unique within a file.
name: <string>

# How often rules in the group are evaluated.
[ interval: <duration> | default = global.evaluation_interval ]

rules:
  [ - <rule> ... ]
```

Thanos supports two types of rules which may be configured and then evaluated at regular intervals: recording rules and alerting rules.

### Recording Rules

Recording rules allow you to precompute frequently needed or computationally expensive expressions and save their result as a new set of time series. Querying the precomputed result will then often be much faster than executing the original expression every time it is needed. This is especially useful for dashboards, which need to query the same expression repeatedly every time they refresh.

Recording and alerting rules exist in a rule group. Rules within a group are run sequentially at a regular interval.

The syntax for recording rules is:

```
# The name of the time series to output to. Must be a valid metric name.
record: <string>

# The PromQL expression to evaluate. Every evaluation cycle this is
# evaluated at the current time, and the result recorded as a new set of
# time series with the metric name as given by 'record'.
expr: <string>

# Labels to add or overwrite before storing the result.
labels:
  [ <labelname>: <labelvalue> ]
```

Note: If you make use of recording rules, make sure that you expose your Ruler instance as a store in the Thanos Querier so that the new time series can be queried as part of Thanos Query. One of the ways you can do this is by adding a new `--store <thanos-ruler-ip>` command-line argument to the Thanos Query command.
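
For example, assuming the Ruler's gRPC StoreAPI listens on its default port `10901` and a placeholder hostname, the Querier invocation could be extended roughly like this:

```bash
thanos query \
    --http-address "0.0.0.0:9090" \
    --store        "thanos-ruler.example.org:10901"
```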

### Alerting Rules

The syntax for alerting rules is:

```
# The name of the alert. Must be a valid metric name.
alert: <string>

# The PromQL expression to evaluate. Every evaluation cycle this is
# evaluated at the current time, and all resultant time series become
# pending/firing alerts.
expr: <string>

# Alerts are considered firing once they have been returned for this long.
# Alerts which have not yet fired for long enough are considered pending.
[ for: <duration> | default = 0s ]

# Labels to add or overwrite for each alert.
labels:
  [ <labelname>: <tmpl_string> ]

# Annotations to add to each alert.
annotations:
  [ <labelname>: <tmpl_string> ]
```
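
Putting this together, a complete alerting rule inside a rule group could look like the following (an illustrative example; the metric name, threshold and labels are placeholders):

```yaml
groups:
  - name: example-alerts
    rules:
      - alert: HighRequestLatency
        expr: job:request_latency_seconds:mean5m{job="myjob"} > 0.5
        for: 10m
        labels:
          severity: page
        annotations:
          summary: "High request latency on {{ $labels.job }}"
```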

## Partial Response

See [this](query.md#partial-response) for background information.

Rule allows you to specify rule groups with an additional field that controls the PartialResponseStrategy, e.g.:

```yaml
groups:
- name: "warn strategy"
  partial_response_strategy: "warn"
  rules:
  - alert: "some"
    expr: "up"
- name: "abort strategy"
  partial_response_strategy: "abort"
  rules:
  - alert: "some"
    expr: "up"
- name: "by default strategy is abort"
  rules:
  - alert: "some"
    expr: "up"
```

It is recommended to keep partial response set to `abort` for alerts, which is also the default.

Essentially, for alerting, allowing partial responses can result in symptoms being missed by the Rule's alerts.

## Must have: essential Ruler alerts!

To be sure that alerting works, it is essential to monitor Ruler and alert on it from another **Scraper (Prometheus + sidecar)** that sits in the same cluster.

The most important metrics to alert on are:

* `thanos_alert_sender_alerts_dropped_total`. If greater than 0, it means that alerts triggered by Rule are not being sent to Alertmanager, which might indicate connection, incompatibility or misconfiguration problems.

* `prometheus_rule_evaluation_failures_total`. If greater than 0, it means that a rule failed to be evaluated, which results in either a gap in the rule's series or a potentially missed alert. This metric might indicate problems on the query API endpoint you use. Alert heavily on this if it happens for longer than your alert thresholds. The `strategy` label will tell you whether failures come from rules that tolerate [partial response](#partial-response) or not.

* `prometheus_rule_group_last_duration_seconds > prometheus_rule_group_interval_seconds`. If the difference is positive, it means that rule evaluation took more time than the scheduled interval, and data for some intervals could be missing. It can indicate that your query backend (e.g. Querier) takes too much time to evaluate the query, i.e. that it is not fast enough to fill the rule. This might indicate other problems like slow StoreAPIs or a too complex query expression in the rule.

* `thanos_rule_evaluation_with_warnings_total`. If you choose to use rules and alerts with the [partial response strategy](#partial-response) set to "warn", this metric will tell you how many evaluations ended up with some kind of warning. To see the actual warnings, check the WARN log level. This might suggest that those evaluations returned a partial response and might not be accurate.

Those metrics are important for vanilla Prometheus as well, but they are even more important when we rely on a (sometimes WAN) network.
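
As an illustration, alerts on the first two metrics could look roughly like the following when evaluated by the scraping Prometheus (a hedged sketch; the `job` selector, thresholds and severities are assumptions, see the example alerts linked below for maintained versions):

```yaml
groups:
  - name: thanos-rule-meta-alerts
    rules:
      - alert: ThanosRuleAlertsDropped
        # Assumes the Ruler is scraped under job="thanos-rule".
        expr: rate(thanos_alert_sender_alerts_dropped_total{job="thanos-rule"}[5m]) > 0
        for: 5m
        labels:
          severity: critical
      - alert: ThanosRuleEvaluationFailures
        expr: rate(prometheus_rule_evaluation_failures_total{job="thanos-rule"}[5m]) > 0
        for: 5m
        labels:
          severity: critical
```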

// TODO(bwplotka): Re-review them after recent changes in metrics.

See [alerts](https://github.com/thanos-io/thanos/blob/e3b0baf7de9dde1887253b1bb19d78ae71a01bf8/examples/alerts/alerts.md#ruler) for more example alerts for ruler.

NOTE: It is also recommended to set a mocked alert on Ruler that checks if Query is up. This might be something simple like a `vector(1)` query, just to check that Querier is live.
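
Such a "watchdog" alert might be as simple as the following sketch (names are placeholders): it fires continuously as long as the Ruler can evaluate the expression through the Querier, and an external check or dead man's switch notices when it stops arriving.

```yaml
groups:
  - name: thanos-rule-watchdog
    rules:
      - alert: ThanosRuleWatchdog
        # Always evaluates to a single sample; firing proves the Ruler-to-Querier path works.
        expr: vector(1)
        labels:
          severity: none
```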

## Performance

As rule nodes outsource query processing to query nodes, they should generally experience little load. If necessary, functional sharding can be applied by splitting up the sets of rules between HA pairs. Rules are processed with deduplicated data according to the replica label configured on query nodes.

## External labels

It is *mandatory* to add certain external labels that indicate the Ruler origin (e.g. `--label='replica="A"'` or a `cluster` label). Otherwise, running multiple Ruler replicas will not be possible, resulting in clashes during compaction.

NOTE: It is advised to use external labels that differ from the labels produced by the sources you are recording or alerting against.

For example:

* Ruler is in cluster `mon1` and we have Prometheus in cluster `eu1`.
* By default we could try having consistent labels, so we have `cluster=eu1` for Prometheus and `cluster=mon1` for Ruler.
* We configure a `ScraperIsDown` alert that monitors a service from the `work1` cluster.
* When triggered, this alert results in `ScraperIsDown{cluster=mon1}` since external labels always *replace* source labels.

This effectively drops the important metadata and makes it impossible to tell in which exact `cluster` the `ScraperIsDown` alert found a problem without falling back to a manual query.

## Ruler UI

On its HTTP address, Ruler exposes a UI that mainly shows the Alerts and Rules pages (similar to the Prometheus Alerts page). Each alert is linked to the query that the alert performs, which you can click to navigate to the configured `alert.query-url`.

## Ruler HA

Ruler aims to use an approach similar to the one Prometheus has. You can configure external labels as well as relabelling.

In case of Ruler in HA, you need to make sure you have the following labelling setup:

* Labels that identify the HA group of Rulers, plus a replica label with a different value for each Ruler instance, e.g. `cluster="eu1", replica="A"` and `cluster="eu1", replica="B"`, set via the `--label` flag.
* Labels that need to be dropped just before sending to Alertmanager, so that Alertmanager can deduplicate alerts, e.g. `--alert.label-drop="replica"`.

Advanced relabelling configuration is possible with the `--alert.relabel-config` and `--alert.relabel-config-file` flags. The configuration format is identical to the [`alert_relabel_configs`](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#alert_relabel_configs) field of Prometheus. Note that Thanos Ruler drops the labels listed in `--alert.label-drop` before alert relabelling.
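
For illustration, a relabel configuration file could contain something like the following (a sketch; it assumes the file holds a plain list of Prometheus relabel rules and uses an example `datacenter` label, so verify the exact format against your Thanos version):

```yaml
# Drop the example `datacenter` label from all alerts before they are sent.
- action: labeldrop
  regex: datacenter
# Drop alerts whose severity is `info` entirely.
- source_labels: [severity]
  regex: info
  action: drop
```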

## Stateless Ruler via Remote Write

Stateless ruler enables nearly indefinite horizontal scalability. Ruler doesn't have a fully functional TSDB for storing evaluation results; instead it uses a WAL-only storage and sends data to remote storage via remote write.

The WAL-only storage reuses the upstream [Prometheus agent](https://prometheus.io/blog/2021/11/16/agent/) and is compatible with the old TSDB data. For more on the design of this mode, please refer to the [proposal](https://thanos.io/tip/proposals-done/202005-scalable-rule-storage.md/).

Stateless mode can be enabled by providing a [Prometheus remote write config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write) in a file via `--remote-write.config-file` or inlined via the `--remote-write.config` flag. For example:

```bash
# --alert.query-url tells what query URL to link to in the UI.
thanos rule \
    --data-dir                  "/path/to/data" \
    --eval-interval             "30s" \
    --rule-file                 "/path/to/rules/*.rules.yaml" \
    --alert.query-url           "http://0.0.0.0:9090" \
    --alertmanagers.url         "http://alert.thanos.io" \
    --query                     "query.example.org" \
    --query                     "query2.example.org" \
    --objstore.config-file      "bucket.yml" \
    --label                     'monitor_cluster="cluster1"' \
    --label                     'replica="A"' \
    --remote-write.config-file  'rw-config.yaml'
```

Where `rw-config.yaml` could look as follows:

```yaml
remote_write:
- url: http://e2e_test_rule_remote_write-receive-1:8081/api/v1/receive
  name: thanos-receiver
  follow_redirects: false
- url: https://e2e_test_rule_remote_write-receive-2:443/api/v1/receive
  remote_timeout: 30s
  follow_redirects: true
  queue_config:
    capacity: 120000
    max_shards: 50
    min_shards: 1
    max_samples_per_send: 40000
    batch_send_deadline: 5s
    min_backoff: 5s
    max_backoff: 5m
```

You can pass this as a file using `--remote-write.config-file=` or inline it using `--remote-write.config=`.

**NOTE:**
1. `metadata_config` is not supported in this mode and will be ignored if provided in the remote write configuration.
2. Ruler won't expose the StoreAPI for querying data if stateless mode is enabled. If the remote storage is a Thanos Receiver, you can use that to query rule evaluation results.

## Flags

```$ mdox-exec="thanos rule --help"
usage: thanos rule [<flags>]

Ruler evaluating Prometheus rules against given Query nodes, exposing Store API
and storing old blocks in bucket.

Flags:
      --alert.label-drop=ALERT.LABEL-DROP ...
                                 Labels by name to drop before sending
                                 to alertmanager. This allows alert to be
                                 deduplicated on replica label (repeated).
                                 Similar Prometheus alert relabelling
      --alert.query-url=ALERT.QUERY-URL
                                 The external Thanos Query URL that would be set
                                 in all alerts 'Source' field
      --alert.relabel-config=<content>
                                 Alternative to 'alert.relabel-config-file' flag
                                 (mutually exclusive). Content of YAML file that
                                 contains alert relabelling configuration.
      --alert.relabel-config-file=<file-path>
                                 Path to YAML file that contains alert
                                 relabelling configuration.
      --alertmanagers.config=<content>
                                 Alternative to 'alertmanagers.config-file'
                                 flag (mutually exclusive). Content
                                 of YAML file that contains alerting
                                 configuration. See format details:
                                 https://thanos.io/tip/components/rule.md/#configuration.
                                 If defined, it takes precedence
                                 over the '--alertmanagers.url' and
                                 '--alertmanagers.send-timeout' flags.
      --alertmanagers.config-file=<file-path>
                                 Path to YAML file that contains alerting
                                 configuration. See format details:
                                 https://thanos.io/tip/components/rule.md/#configuration.
                                 If defined, it takes precedence
                                 over the '--alertmanagers.url' and
                                 '--alertmanagers.send-timeout' flags.
      --alertmanagers.sd-dns-interval=30s
                                 Interval between DNS resolutions of
                                 Alertmanager hosts.
      --alertmanagers.send-timeout=10s
                                 Timeout for sending alerts to Alertmanager
      --alertmanagers.url=ALERTMANAGERS.URL ...
                                 Alertmanager replica URLs to push firing
                                 alerts. Ruler claims success if push to
                                 at least one alertmanager from discovered
                                 succeeds. The scheme should not be empty
                                 e.g `http` might be used. The scheme may be
                                 prefixed with 'dns+' or 'dnssrv+' to detect
                                 Alertmanager IPs through respective DNS
                                 lookups. The port defaults to 9093 or the
                                 SRV record's value. The URL path is used as a
                                 prefix for the regular Alertmanager API path.
      --data-dir="data/"         data directory
      --eval-interval=1m         The default evaluation interval to use.
      --for-grace-period=10m     Minimum duration between alert and restored
                                 "for" state. This is maintained only for alerts
                                 with configured "for" time greater than grace
                                 period.
      --for-outage-tolerance=1h  Max time to tolerate prometheus outage for
                                 restoring "for" state of alert.
      --grpc-address="0.0.0.0:10901"
                                 Listen ip:port address for gRPC endpoints
                                 (StoreAPI). Make sure this address is routable
                                 from other components.
      --grpc-grace-period=2m     Time to wait after an interrupt received for
                                 GRPC Server.
      --grpc-server-max-connection-age=60m
                                 The grpc server max connection age. This
                                 controls how often to re-establish connections
                                 and redo TLS handshakes.
      --grpc-server-tls-cert=""  TLS Certificate for gRPC server, leave blank to
                                 disable TLS
      --grpc-server-tls-client-ca=""
                                 TLS CA to verify clients against. If no
                                 client CA is specified, there is no client
                                 verification on server side. (tls.NoClientCert)
      --grpc-server-tls-key=""   TLS Key for the gRPC server, leave blank to
                                 disable TLS
      --hash-func=               Specify which hash function to use when
                                 calculating the hashes of produced files.
                                 If no function has been specified, it does not
                                 happen. This permits avoiding downloading some
                                 files twice albeit at some performance cost.
                                 Possible values are: "", "SHA256".
  -h, --help                     Show context-sensitive help (also try
                                 --help-long and --help-man).
      --http-address="0.0.0.0:10902"
                                 Listen host:port for HTTP endpoints.
      --http-grace-period=2m     Time to wait after an interrupt received for
                                 HTTP Server.
      --http.config=""           [EXPERIMENTAL] Path to the configuration file
                                 that can enable TLS or authentication for all
                                 HTTP endpoints.
      --label=<name>="<value>" ...
                                 Labels to be applied to all generated metrics
                                 (repeated). Similar to external labels for
                                 Prometheus, used to identify ruler and its
                                 blocks as unique source.
      --log.format=logfmt        Log format to use. Possible options: logfmt or
                                 json.
      --log.level=info           Log filtering level.
      --log.request.decision=    Deprecation Warning - This flag would
                                 be soon deprecated, and replaced with
                                 `request.logging-config`. Request Logging
                                 for logging the start and end of requests. By
                                 default this flag is disabled. LogFinishCall:
                                 Logs the finish call of the requests.
                                 LogStartAndFinishCall: Logs the start and
                                 finish call of the requests. NoLogCall: Disable
                                 request logging.
      --objstore.config=<content>
                                 Alternative to 'objstore.config-file'
                                 flag (mutually exclusive). Content of
                                 YAML file that contains object store
                                 configuration. See format details:
                                 https://thanos.io/tip/thanos/storage.md/#configuration
      --objstore.config-file=<file-path>
                                 Path to YAML file that contains object
                                 store configuration. See format details:
                                 https://thanos.io/tip/thanos/storage.md/#configuration
      --query=<query> ...        Addresses of statically configured query
                                 API servers (repeatable). The scheme may be
                                 prefixed with 'dns+' or 'dnssrv+' to detect
                                 query API servers through respective DNS
                                 lookups.
      --query.config=<content>   Alternative to 'query.config-file' flag
                                 (mutually exclusive). Content of YAML
                                 file that contains query API servers
                                 configuration. See format details:
                                 https://thanos.io/tip/components/rule.md/#configuration.
                                 If defined, it takes precedence over the
                                 '--query' and '--query.sd-files' flags.
      --query.config-file=<file-path>
                                 Path to YAML file that contains query API
                                 servers configuration. See format details:
                                 https://thanos.io/tip/components/rule.md/#configuration.
                                 If defined, it takes precedence over the
                                 '--query' and '--query.sd-files' flags.
      --query.default-step=1s    Default range query step to use. This is
                                 only used in stateless Ruler and alert state
                                 restoration.
      --query.http-method=POST   HTTP method to use when sending queries.
                                 Possible options: [GET, POST]
      --query.sd-dns-interval=30s
                                 Interval between DNS resolutions.
      --query.sd-files=<path> ...
                                 Path to file that contains addresses of query
                                 API servers. The path can be a glob pattern
                                 (repeatable).
      --query.sd-interval=5m     Refresh interval to re-read file SD files.
                                 (used as a fallback)
      --remote-write.config=<content>
                                 Alternative to 'remote-write.config-file'
                                 flag (mutually exclusive). Content
                                 of YAML config for the remote-write
                                 configurations, that specify servers
                                 where samples should be sent to (see
                                 https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write).
                                 This automatically enables stateless mode
                                 for ruler and no series will be stored in the
                                 ruler's TSDB. If an empty config (or file) is
                                 provided, the flag is ignored and ruler is run
                                 with its own TSDB.
      --remote-write.config-file=<file-path>
                                 Path to YAML config for the remote-write
                                 configurations, that specify servers
                                 where samples should be sent to (see
                                 https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write).
                                 This automatically enables stateless mode
                                 for ruler and no series will be stored in the
                                 ruler's TSDB. If an empty config (or file) is
                                 provided, the flag is ignored and ruler is run
                                 with its own TSDB.
      --request.logging-config=<content>
                                 Alternative to 'request.logging-config-file'
                                 flag (mutually exclusive). Content
                                 of YAML file with request logging
                                 configuration. See format details:
                                 https://thanos.io/tip/thanos/logging.md/#configuration
      --request.logging-config-file=<file-path>
                                 Path to YAML file with request logging
                                 configuration. See format details:
                                 https://thanos.io/tip/thanos/logging.md/#configuration
      --resend-delay=1m          Minimum amount of time to wait before resending
                                 an alert to Alertmanager.
      --restore-ignored-label=RESTORE-IGNORED-LABEL ...
                                 Label names to be ignored when restoring alerts
                                 from the remote storage. This is only used in
                                 stateless mode.
      --rule-file=rules/ ...     Rule files that should be used by rule
                                 manager. Can be in glob format (repeated).
                                 Note that rules are not automatically detected,
                                 use SIGHUP or do HTTP POST /-/reload to re-read
                                 them.
      --shipper.upload-compacted
                                 If true shipper will try to upload compacted
                                 blocks as well. Useful for migration purposes.
                                 Works only if compaction is disabled on
                                 Prometheus. Do it once and then disable the
                                 flag when done.
      --store.limits.request-samples=0
                                 The maximum samples allowed for a single
                                 Series request, The Series call fails if
                                 this limit is exceeded. 0 means no limit.
                                 NOTE: For efficiency the limit is internally
                                 implemented as 'chunks limit' considering each
                                 chunk contains a maximum of 120 samples.
      --store.limits.request-series=0
                                 The maximum series allowed for a single Series
                                 request. The Series call fails if this limit is
                                 exceeded. 0 means no limit.
      --tracing.config=<content>
                                 Alternative to 'tracing.config-file' flag
                                 (mutually exclusive). Content of YAML file
                                 with tracing configuration. See format details:
                                 https://thanos.io/tip/thanos/tracing.md/#configuration
      --tracing.config-file=<file-path>
                                 Path to YAML file with tracing
                                 configuration. See format details:
                                 https://thanos.io/tip/thanos/tracing.md/#configuration
      --tsdb.block-duration=2h   Block duration for TSDB block.
      --tsdb.no-lockfile         Do not create lockfile in TSDB data directory.
                                 In any case, the lockfiles will be deleted on
                                 next startup.
      --tsdb.retention=48h       Block retention time on local disk.
      --tsdb.wal-compression     Compress the tsdb WAL.
      --version                  Show application version.
      --web.disable-cors         Whether to disable CORS headers to be set by
                                 Thanos. By default Thanos sets CORS headers to
                                 be allowed by all.
      --web.external-prefix=""   Static prefix for all HTML links and redirect
                                 URLs in the bucket web UI interface.
                                 Actual endpoints are still served on / or the
                                 web.route-prefix. This allows thanos bucket
                                 web UI to be served behind a reverse proxy that
                                 strips a URL sub-path.
      --web.prefix-header=""     Name of HTTP request header used for dynamic
                                 prefixing of UI links and redirects.
                                 This option is ignored if web.external-prefix
                                 argument is set. Security risk: enable
                                 this option only if a reverse proxy in
                                 front of thanos is resetting the header.
                                 The --web.prefix-header=X-Forwarded-Prefix
                                 option can be useful, for example, if Thanos
                                 UI is served via Traefik reverse proxy with
                                 PathPrefixStrip option enabled, which sends the
                                 stripped prefix value in X-Forwarded-Prefix
                                 header. This allows thanos UI to be served on a
                                 sub-path.
      --web.route-prefix=""      Prefix for API and UI endpoints. This allows
                                 thanos UI to be served on a sub-path. This
                                 option is analogous to --web.route-prefix of
                                 Prometheus.

```

## Configuration

### Alertmanager

The `--alertmanagers.config` and `--alertmanagers.config-file` flags allow specifying multiple Alertmanagers. Those entries are treated as a single HA group. This means that alert send failure is claimed only if the Ruler fails to send to all instances.

The configuration format is the following:

```yaml
alertmanagers:
- http_config:
    basic_auth:
      username: ""
      password: ""
      password_file: ""
    bearer_token: ""
    bearer_token_file: ""
    proxy_url: ""
    tls_config:
      ca_file: ""
      cert_file: ""
      key_file: ""
      server_name: ""
      insecure_skip_verify: false
  static_configs: []
  file_sd_configs:
  - files: []
    refresh_interval: 0s
  scheme: http
  path_prefix: ""
  timeout: 10s
  api_version: v1
```

Supported values for `api_version` are `v1` or `v2`.
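
A minimal concrete configuration could therefore look like this (a sketch; hostnames and ports are placeholders):

```yaml
alertmanagers:
- static_configs: ["alertmanager-0.example.org:9093", "alertmanager-1.example.org:9093"]
  scheme: http
  timeout: 10s
  api_version: v2
```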

### Query API

The `--query.config` and `--query.config-file` flags allow specifying multiple query endpoints. Those entries are treated as a single HA group. This means that query failure is claimed only if the Ruler fails to query all instances.

The configuration format is the following:

```yaml
- http_config:
    basic_auth:
      username: ""
      password: ""
      password_file: ""
    bearer_token: ""
    bearer_token_file: ""
    proxy_url: ""
    tls_config:
      ca_file: ""
      cert_file: ""
      key_file: ""
      server_name: ""
      insecure_skip_verify: false
  static_configs: []
  file_sd_configs:
  - files: []
    refresh_interval: 0s
  scheme: http
  path_prefix: ""
```
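
Analogously, a minimal concrete query configuration could look like this (a sketch; addresses are placeholders):

```yaml
- static_configs: ["query-0.example.org:9090", "query-1.example.org:9090"]
  scheme: http
```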