     1  # Object Storage & Data Format
     2  
Thanos uses object storage as the primary storage for metrics and their related metadata. This document explains how to configure your object storage and describes the data layout and format used by the primary Thanos components that are "block" aware, such as `sidecar`, `compact`, `receive` and `store gateway`.
     4  
     5  ## Configuring Access to Object Storage
     6  
Thanos supports any object store that can be implemented against the Thanos [objstore.Bucket interface](https://github.com/thanos-io/objstore/blob/main/objstore.go).

All clients can be configured using `--objstore.config-file` to reference a configuration file, or `--objstore.config` to pass the YAML configuration directly.
    10  
    11  ### How to use our special `config` flags?
    12  
**You can either pass the path to a YAML file (in the format defined below) using `--objstore.config-file`, or pass the YAML content directly using `--objstore.config`.** We recommend the latter, as it gives an explicit, static view of the configuration for each component. It also saves you the fuss of creating and managing an additional file.
    14  
    15  Don't be afraid of multiline flags!
    16  
In Kubernetes it is as easy as this (using the Thanos sidecar as an example):
    18  
    19  ```yaml
    20        - args:
    21          - sidecar
    22          - |
    23            --objstore.config=type: GCS
    24            config:
    25              bucket: <bucket>
    26          - --prometheus.url=http://localhost:9090
    27          - |
    28            --tracing.config=type: STACKDRIVER
    29            config:
    30              service_name: ""
    31              project_id: <project>
    32              sample_factor: 16
    33          - --tsdb.path=/prometheus-data
    34  ```
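
If you prefer the file-based flag, the same sidecar can read its configuration from a mounted file instead. The fragment below is a minimal sketch, assuming a Kubernetes Secret named `thanos-objstore` with a key `objstore.yml` that holds the YAML. The Secret name, mount path, container name and image placeholder are all hypothetical.

```yaml
# Hypothetical pod spec fragment; every name and path here is a placeholder.
containers:
  - name: thanos-sidecar
    image: <thanos-image>
    args:
      - sidecar
      - --objstore.config-file=/etc/thanos/objstore.yml
      - --prometheus.url=http://localhost:9090
      - --tsdb.path=/prometheus-data
    volumeMounts:
      - name: objstore-config
        mountPath: /etc/thanos
        readOnly: true
volumes:
  - name: objstore-config
    secret:
      secretName: thanos-objstore   # Secret with a key named objstore.yml
```

Each key of the Secret becomes a file under the mount path, which is why the flag points at `/etc/thanos/objstore.yml`.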
    35  
    36  ### Supported Clients
    37  
    38  Current object storage client implementations:
    39  
| Provider                                                                                  | Maturity           | Aimed For             | Auto-tested on CI | Maintainers                      |
|-------------------------------------------------------------------------------------------|--------------------|-----------------------|-------------------|----------------------------------|
| [Google Cloud Storage](#gcs)                                                              | Stable             | Production Usage      | yes               | @bwplotka                        |
| [AWS/S3](#s3) (and all S3-compatible storages e.g. disk-based [Minio](https://min.io/))   | Stable             | Production Usage      | yes               | @bwplotka                        |
| [Azure Storage Account](#azure)                                                           | Stable             | Production Usage      | no                | @vglafirov                       |
| [OpenStack Swift](#openstack-swift)                                                       | Beta (working PoC) | Production Usage      | yes               | @FUSAKLA                         |
| [Tencent COS](#tencent-cos)                                                               | Beta               | Production Usage      | no                | @jojohappy,@hanjm                |
| [AliYun OSS](#aliyun-oss)                                                                 | Beta               | Production Usage      | no                | @shaulboozhiao,@wujinhu          |
| [Local Filesystem](#filesystem)                                                           | Stable             | Testing and Demo only | yes               | @bwplotka                        |
| [Oracle Cloud Infrastructure Object Storage](#oracle-cloud-infrastructure-object-storage) | Beta               | Production Usage      | yes               | @aarontams,@gaurav-05,@ericrrath |
    50  
**Missing support for some object storage?** Check out the [how to add your client section](#how-to-add-a-new-client-to-thanos).

NOTE: Currently Thanos requires strong (read-after-write) consistency from the object store implementation, because the singleton Compactor relies on it.
    54  
    55  #### S3
    56  
    57  Thanos uses the [minio client](https://github.com/minio/minio-go) library to upload Prometheus data into AWS S3.
    58  
    59  You can configure an S3 bucket as an object store with YAML, either by passing the configuration directly to the `--objstore.config` parameter, or (preferably) by passing the path to a configuration file to the `--objstore.config-file` option.
    60  
NOTE: The Minio client was built mainly for AWS S3, but it can be configured against other S3-compatible object storages, e.g. Ceph.
    62  
    63  ```yaml mdox-exec="go run scripts/cfggen/main.go --name=s3.Config"
    64  type: S3
    65  config:
    66    bucket: ""
    67    endpoint: ""
    68    region: ""
    69    aws_sdk_auth: false
    70    access_key: ""
    71    insecure: false
    72    signature_version2: false
    73    secret_key: ""
    74    session_token: ""
    75    put_user_metadata: {}
    76    http_config:
    77      idle_conn_timeout: 1m30s
    78      response_header_timeout: 2m
    79      insecure_skip_verify: false
    80      tls_handshake_timeout: 10s
    81      expect_continue_timeout: 1s
    82      max_idle_conns: 100
    83      max_idle_conns_per_host: 100
    84      max_conns_per_host: 0
    85      tls_config:
    86        ca_file: ""
    87        cert_file: ""
    88        key_file: ""
    89        server_name: ""
    90        insecure_skip_verify: false
    91      disable_compression: false
    92    trace:
    93      enable: false
    94    list_objects_version: ""
    95    bucket_lookup_type: auto
    96    part_size: 67108864
    97    sse_config:
    98      type: ""
    99      kms_key_id: ""
   100      kms_encryption_context: {}
   101      encryption_key: ""
   102    sts_endpoint: ""
   103  prefix: ""
   104  ```
   105  
   106  At a minimum, you will need to provide a value for the `bucket`, `endpoint`, `access_key`, and `secret_key` keys. The rest of the keys are optional.
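
As a minimal sketch of that smallest working configuration, the fragment below sets only the four required keys. The bucket and credentials are placeholders, and the endpoint is just an example value for one AWS region.

```yaml
type: S3
config:
  bucket: "<bucket>"                       # placeholder
  endpoint: "s3.us-east-1.amazonaws.com"   # example; use the endpoint matching your region
  access_key: "<access-key>"               # placeholder
  secret_key: "<secret-key>"               # placeholder
```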
   107  
However, if you set `aws_sdk_auth: true`, Thanos will use the default authentication methods of the AWS SDK for Go, based on [known environment variables](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html) (`AWS_PROFILE`, `AWS_WEB_IDENTITY_TOKEN_FILE`, etc.) and known AWS config files (`~/.aws/config`). If you turn this on, then `bucket` and `endpoint` are the required config keys.
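
A sketch of the SDK-based variant might look like this. The bucket is a placeholder and the endpoint is an example value; no static keys are set because the AWS SDK default chain supplies the credentials.

```yaml
type: S3
config:
  bucket: "<bucket>"                       # placeholder
  endpoint: "s3.us-east-1.amazonaws.com"   # example endpoint
  aws_sdk_auth: true                       # access_key / secret_key intentionally left out
```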
   109  
The `prefix` field can be used to transparently use prefixes in your S3 bucket. This allows you to separate blocks coming from different sources into paths with different prefixes, making it easier to see which blocks came from where without having to use Thanos tooling.
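
To illustrate, the sketch below shows two separate configuration files (written here as two YAML documents) that share one bucket but write under different prefixes. The prefix names `cluster-a` and `cluster-b` are hypothetical, and credentials are omitted for brevity.

```yaml
# File 1: used by the components of a hypothetical "cluster-a"
type: S3
prefix: "cluster-a"
config:
  bucket: "<bucket>"
  endpoint: "s3.us-east-1.amazonaws.com"
---
# File 2: used by the components of a hypothetical "cluster-b"
type: S3
prefix: "cluster-b"
config:
  bucket: "<bucket>"
  endpoint: "s3.us-east-1.amazonaws.com"
```

Note that `prefix` sits at the top level next to `type`, not inside `config`.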
   111  
The AWS region to endpoint mapping can be found in the [AWS documentation](https://docs.aws.amazon.com/general/latest/gr/s3.html).

Make sure you use the correct signature version. AWS currently requires signature v4, which corresponds to `signature_version2: false`. Using the wrong signature version results in an `Access Denied` error. On the other hand, several S3-compatible APIs use `signature_version2: true`.
   115  
   116  You can configure the timeout settings for the HTTP client by setting the `http_config.idle_conn_timeout` and `http_config.response_header_timeout` keys. As a rule of thumb, if you are seeing errors like `timeout awaiting response headers` in your logs, you may want to increase the value of `http_config.response_header_timeout`.
   117  
   118  Please refer to the documentation of [the Transport type](https://golang.org/pkg/net/http/#Transport) in the `net/http` package for detailed information on what each option does.
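
For instance, if `timeout awaiting response headers` shows up in the logs, a configuration along these lines raises the relevant timeout. The `5m` value is purely illustrative, not a recommendation.

```yaml
type: S3
config:
  bucket: "<bucket>"
  endpoint: "s3.us-east-1.amazonaws.com"
  http_config:
    idle_conn_timeout: 1m30s         # same as the value in the reference config
    response_header_timeout: 5m      # raised from 2m; illustrative value
```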
   119  
`part_size` is specified in bytes and refers to the minimum file size used for multipart uploads, as some custom S3 implementations may have different requirements. A value of `0` means that a default size of 128 MiB is used.
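
Because the value is in bytes, it helps to spell out the arithmetic. The sketch below sets an explicit 128 MiB part size; bucket and endpoint are placeholders.

```yaml
type: S3
config:
  bucket: "<bucket>"
  endpoint: "<endpoint>"
  # part_size is given in bytes:
  #    64 MiB =  64 * 1024 * 1024 =  67108864 (value in the reference config)
  #   128 MiB = 128 * 1024 * 1024 = 134217728
  part_size: 134217728
```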
   121  
   122  Set `list_objects_version: "v1"` for S3 compatible APIs that don't support ListObjectsV2 (e.g. some versions of Ceph). Default value (`""`) is equivalent to `"v2"`.
   123  
`http_config.tls_config` allows configuring TLS connections. Please refer to the [tls_config documentation](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#tls_config) for detailed information on what each option does.
   125  
   126  `bucket_lookup_type` can be `auto`, `virtual-hosted` or `path`. Read more about it [here](https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html).
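
Putting the compatibility options together, a configuration for a non-AWS, S3-compatible store might look like the sketch below. The endpoint is hypothetical, and each of the last three settings is needed only if your particular store requires it, so check its documentation before copying them.

```yaml
type: S3
config:
  bucket: "<bucket>"
  endpoint: "<host>:<port>"          # hypothetical S3-compatible endpoint, e.g. a Ceph or Minio gateway
  access_key: "<access-key>"
  secret_key: "<secret-key>"
  signature_version2: true           # only if the store does not accept signature v4
  list_objects_version: "v1"         # only if the store lacks ListObjectsV2
  bucket_lookup_type: path           # path-style addressing instead of auto
```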
   127  
For debugging and testing purposes you can set:
   129  
   130  * `insecure: true` to switch to plain insecure HTTP instead of HTTPS
   131  
   132  * `http_config.insecure_skip_verify: true` to disable TLS certificate verification (if your S3 based storage is using a self-signed certificate, for example)
   133  
   134  * `trace.enable: true` to enable the minio client's verbose logging. Each request and response will be logged into the debug logger, so debug level logging must be enabled for this functionality.
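
Combined, a debugging setup against a test endpoint with a self-signed certificate could be sketched as follows. The endpoint is a placeholder, and these settings are meant for debugging and testing only.

```yaml
type: S3
config:
  bucket: "<bucket>"
  endpoint: "<host>:<port>"          # placeholder test endpoint
  insecure: false                    # set to true to use plain HTTP instead of HTTPS
  http_config:
    insecure_skip_verify: true       # accept a self-signed certificate
  trace:
    enable: true                     # output is visible only with debug-level logging
```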
   135  
   136  ##### S3 Server-Side Encryption
   137  
SSE can be configured using the `sse_config`. [SSE-S3](https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingServerSideEncryption.html), [SSE-KMS](https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingKMSEncryption.html), and [SSE-C](https://docs.aws.amazon.com/AmazonS3/latest/dev/ServerSideEncryptionCustomerKeys.html) are supported.
   139  
   140  * If type is set to `SSE-S3` you do not need to configure other options.
   141  
   142  * If type is set to `SSE-KMS` you must set `kms_key_id`. The `kms_encryption_context` is optional, as [AWS provides a default encryption context](https://docs.aws.amazon.com/kms/latest/developerguide/services-s3.html#s3-encryption-context).
   143  
   144  * If type is set to `SSE-C` you must provide a path to the encryption key using `encryption_key`.
   145  
   146  If the SSE Config block is set but the `type` is not one of `SSE-S3`, `SSE-KMS`, or `SSE-C`, an error is raised.
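
As an illustration of the SSE-KMS case, the sketch below sets the required key ID and an optional encryption context. The key ID is a placeholder and the context entry is a hypothetical example.

```yaml
type: S3
config:
  bucket: "<bucket>"
  endpoint: "s3.us-east-1.amazonaws.com"
  sse_config:
    type: "SSE-KMS"
    kms_key_id: "<KMS key id>"       # required for SSE-KMS
    kms_encryption_context:          # optional; this entry is a hypothetical example
      team: "observability"
```

For `SSE-S3`, only `type` would be set. For `SSE-C`, `encryption_key` would hold the path to the key file instead of the two `kms_*` keys.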
   147  
   148  You will also need to apply the following AWS IAM policy for the user to access the KMS key:
   149  
   150  ```json
   151  {
   152      "Version": "2012-10-17",
   153      "Statement": [
   154          {
   155              "Sid": "KMSAccess",
   156              "Effect": "Allow",
   157              "Action": [
   158                  "kms:GenerateDataKey",
   159                  "kms:Encrypt",
   160                  "kms:Decrypt"
   161              ],
   162              "Resource": "arn:aws:kms:<region>:<account>:key/<KMS key id>"
   163          }
   164      ]
   165  }
   166  ```
   167  
   168  ##### Credentials
   169  
By default, Thanos will try to retrieve credentials from the following sources, in order:

1. From the config file, if BOTH `access_key` and `secret_key` are present.
2. From the standard AWS environment variables: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`.
3. From `~/.aws/credentials`.
4. From IAM credentials retrieved from an instance profile.

NOTE: Taking the access key from the config file and the secret key from another source (or vice versa) is not supported.
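
To rely on sources 2 to 4, leave both keys out of the file. A minimal sketch, with a placeholder bucket and an example endpoint:

```yaml
type: S3
config:
  bucket: "<bucket>"
  endpoint: "s3.us-east-1.amazonaws.com"
  # access_key and secret_key are both omitted on purpose, so Thanos falls
  # through to AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY, then
  # ~/.aws/credentials, then instance profile credentials.
```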
   178  
   179  ##### AWS Policies
   180  
Example of a working AWS IAM policy for a user:
   182  
   183  * For deployment (policy for Thanos services):
   184  
   185  ```json
   186  {
   187      "Version": "2012-10-17",
   188      "Statement": [
   189          {
   190              "Sid": "Statement",
   191              "Effect": "Allow",
   192              "Action": [
   193                  "s3:ListBucket",
   194                  "s3:GetObject",
   195                  "s3:DeleteObject",
   196                  "s3:PutObject"
   197              ],
   198              "Resource": [
   199                  "arn:aws:s3:::<bucket>/*",
   200                  "arn:aws:s3:::<bucket>"
   201              ]
   202          }
   203      ]
   204  }
   205  ```
   206  
   207  (No bucket policy)
   208  
To test the policy, set the env vars for S3 access to an *empty, unused* bucket, as well as:
   210  
   211  ```
   212  THANOS_TEST_OBJSTORE_SKIP=GCS,AZURE,SWIFT,COS,ALIYUNOSS,BOS,OCI,OBS
   213  THANOS_ALLOW_EXISTING_BUCKET_USE=true
   214  ```
   215  
And run (`-count=1` bypasses the test cache, since recent Go versions no longer accept `GOCACHE=off`): `go test -count=1 -v -run TestObjStore_AcceptanceTest_e2e ./pkg/...`
   217  
   218  * For testing (policy to run e2e tests):
   219  
   220  We need access to CreateBucket and DeleteBucket and access to all buckets:
   221  
   222  ```json
   223  {
   224      "Version": "2012-10-17",
   225      "Statement": [
   226          {
   227              "Sid": "Statement",
   228              "Effect": "Allow",
   229              "Action": [
   230                  "s3:ListBucket",
   231                  "s3:GetObject",
   232                  "s3:DeleteObject",
   233                  "s3:PutObject",
   234                  "s3:CreateBucket",
   235                  "s3:DeleteBucket"
   236              ],
   237              "Resource": [
   238                  "arn:aws:s3:::<bucket>/*",
   239                  "arn:aws:s3:::<bucket>"
   240              ]
   241          }
   242      ]
   243  }
   244  ```
   245  
With this policy you should be able to set `THANOS_TEST_OBJSTORE_SKIP=GCS,AZURE,SWIFT,COS,ALIYUNOSS,BOS,OCI,OBS`, unset `S3_BUCKET`, and run all tests using `make test`.
   247  
   248  Details about AWS policies: https://docs.aws.amazon.com/AmazonS3/latest/dev/using-with-s3-actions.html
   249  
   250  ##### STS Endpoint
   251  
If you want to use IAM credentials retrieved from an instance profile, Thanos needs to authenticate through AWS STS. For this purpose you can specify your own STS endpoint using `sts_endpoint`.

By default Thanos will use the endpoint https://sts.amazonaws.com and the endpoints corresponding to the AWS region.
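
A sketch of overriding the STS endpoint with a regional one is shown below. The region and bucket are illustrative choices; whether you need a regional or custom endpoint depends on your environment.

```yaml
type: S3
config:
  bucket: "<bucket>"
  endpoint: "s3.eu-west-1.amazonaws.com"
  region: "eu-west-1"
  sts_endpoint: "https://sts.eu-west-1.amazonaws.com"   # regional STS endpoint; illustrative
```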
   255  
   256  ##### S3 Storage Class
   257  
   258  By default, the `STANDARD` S3 storage class will be used. To specify a storage class, add it to the `put_user_metadata` section of the config file.
   259  
   260  For example, the config file below specifies storage class of `STANDARD_IA`.
   261  
   262  ```yaml
   263  type: S3
   264  prefix: thanos-test-standard-ia
   265  config:
   266    endpoint: s3.us-east-1.amazonaws.com
   267    region: us-east-1
   268    bucket: MY_BUCKET
   269    put_user_metadata:
   270      X-Amz-Storage-Class: STANDARD_IA
   271    trace:
   272      enable: true
   273  ```
   274  
   275  #### GCS
   276  
To configure a Google Cloud Storage bucket as an object store, you need to set `bucket` to the GCS bucket name and configure Google Application credentials.
   278  
   279  For example:
   280  
   281  ```yaml mdox-exec="go run scripts/cfggen/main.go --name=gcs.Config"
   282  type: GCS
   283  config:
   284    bucket: ""
   285    service_account: ""
   286  prefix: ""
   287  ```
   288  
   289  ##### Using GOOGLE_APPLICATION_CREDENTIALS
   290  
Application credentials are configured via a JSON file, and only the bucket needs to be specified. The client looks for credentials in the following places:
   292  
   293  1. A JSON file whose path is specified by the `GOOGLE_APPLICATION_CREDENTIALS` environment variable.
   294  2. A JSON file in a location known to the gcloud command-line tool. On Windows, this is `%APPDATA%/gcloud/application_default_credentials.json`. On other systems, `$HOME/.config/gcloud/application_default_credentials.json`.
   295  3. On Google App Engine it uses the `appengine.AccessToken` function.
   296  4. On Google Compute Engine and Google App Engine Managed VMs, it fetches credentials from the metadata server. (In this final case any provided scopes are ignored.)
   297  
You can read more on how to get an application credentials JSON file at [https://cloud.google.com/docs/authentication/production](https://cloud.google.com/docs/authentication/production).
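
In Kubernetes, the first option is commonly wired up by mounting the JSON file and pointing the environment variable at it. The fragment below is a minimal sketch, assuming a Secret named `gcs-credentials` with a key `credentials.json`; the Secret name, mount path and container name are hypothetical.

```yaml
# Hypothetical container fragment; names and paths are placeholders.
containers:
  - name: thanos-sidecar
    args:
      - sidecar
      - |
        --objstore.config=type: GCS
        config:
          bucket: <bucket>
    env:
      - name: GOOGLE_APPLICATION_CREDENTIALS
        value: /etc/gcs/credentials.json
    volumeMounts:
      - name: gcs-credentials
        mountPath: /etc/gcs
        readOnly: true
volumes:
  - name: gcs-credentials
    secret:
      secretName: gcs-credentials   # Secret with a key named credentials.json
```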
   299  
##### Using an inline Service Account

Another possibility is to inline the service account into the Thanos configuration and maintain only one file. This feature was added so that the Prometheus Operator only needs to take care of one secret file.
   303  
   304  ```yaml
   305  type: GCS
   306  config:
   307    bucket: "thanos"
   308    service_account: |-
   309      {
   310        "type": "service_account",
   311        "project_id": "project",
   312        "private_key_id": "abcdefghijklmnopqrstuvwxyz12345678906666",
   313        "private_key": "-----BEGIN PRIVATE KEY-----\...\n-----END PRIVATE KEY-----\n",
   314        "client_email": "project@thanos.iam.gserviceaccount.com",
   315        "client_id": "123456789012345678901",
   316        "auth_uri": "https://accounts.google.com/o/oauth2/auth",
   317        "token_uri": "https://oauth2.googleapis.com/token",
   318        "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
   319        "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/thanos%40gitpods.iam.gserviceaccount.com"
   320      }
   321  ```
   322  
   323  ##### GCS Policies
   324  
**Note:** GCS Policies should be applied at the project level, not at the bucket level.
   326  
   327  For deployment:
   328  
   329  `Storage Object Creator` and `Storage Object Viewer`
   330  
   331  For testing:
   332  
   333  `Storage Object Admin` for ability to create and delete temporary buckets.
   334  
To test that the policy is working as expected, exec into the sidecar container, e.g.:
   336  
   337  ```sh
   338  kubectl exec -it -n <namespace> <prometheus with sidecar pod name> -c <sidecar container name> -- /bin/sh
   339  ```
   340  
Then test that you can at least list objects in the bucket, e.g.:
   342  
   343  ```sh
   344  thanos tools bucket ls --objstore.config="${OBJSTORE_CONFIG}"
   345  ```
   346  
   347  #### Azure
   348  
To use Azure Storage as the Thanos object store, you need to create a storage account in advance, either from the Azure portal or using the Azure CLI. Follow the instructions in the Azure Storage documentation: [https://docs.microsoft.com/en-us/azure/storage/common/storage-quickstart-create-account](https://docs.microsoft.com/en-us/azure/storage/common/storage-quickstart-create-account?tabs=portal)

To configure an Azure Storage account as an object store, you need to provide the path to an Azure storage config file in the `--objstore.config-file` flag.

The config file format is the following:
   354  
   355  ```yaml mdox-exec="go run scripts/cfggen/main.go --name=azure.Config"
   356  type: AZURE
   357  config:
   358    storage_account: ""
   359    storage_account_key: ""
   360    storage_connection_string: ""
   361    container: ""
   362    endpoint: ""
   363    user_assigned_id: ""
   364    max_retries: 0
   365    reader_config:
   366      max_retry_requests: 0
   367    pipeline_config:
   368      max_tries: 0
   369      try_timeout: 0s
   370      retry_delay: 0s
   371      max_retry_delay: 0s
   372    http_config:
   373      idle_conn_timeout: 0s
   374      response_header_timeout: 0s
   375      insecure_skip_verify: false
   376      tls_handshake_timeout: 0s
   377      expect_continue_timeout: 0s
   378      max_idle_conns: 0
   379      max_idle_conns_per_host: 0
   380      max_conns_per_host: 0
   381      tls_config:
   382        ca_file: ""
   383        cert_file: ""
   384        key_file: ""
   385        server_name: ""
   386        insecure_skip_verify: false
   387      disable_compression: false
   388    msi_resource: ""
   389  prefix: ""
   390  ```
   391  
   392  If `msi_resource` is used, authentication is done via system-assigned managed identity. The value for Azure should be `https://<storage-account-name>.blob.core.windows.net`.
   393  
If `user_assigned_id` is used, authentication is done via user-assigned managed identity. When using `user_assigned_id`, the `msi_resource` defaults to `https://<storage_account>.<endpoint>`.
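
The two managed identity variants can be sketched as follows, shown here as two alternative files written as two YAML documents. All values are placeholders, and the exact identifier expected in `user_assigned_id` should be checked against your Azure setup.

```yaml
# Variant 1: system-assigned managed identity
type: AZURE
config:
  storage_account: "<storage-account-name>"
  container: "<container>"
  msi_resource: "https://<storage-account-name>.blob.core.windows.net"
---
# Variant 2: user-assigned managed identity
type: AZURE
config:
  storage_account: "<storage-account-name>"
  container: "<container>"
  user_assigned_id: "<user-assigned-identity-id>"   # placeholder
```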
   395  
The generic `max_retries` is used as the value for both `pipeline_config`'s `max_tries` and `reader_config`'s `max_retry_requests`. For more control, leave `max_retries` at `0` and set the specific retry values instead.
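
A sketch of the fine-grained approach is shown below. The numbers and durations are illustrative only, not recommendations, and the account values are placeholders.

```yaml
type: AZURE
config:
  storage_account: "<storage-account-name>"
  storage_account_key: "<storage-account-key>"
  container: "<container>"
  max_retries: 0              # generic knob left unset...
  reader_config:
    max_retry_requests: 3     # ...reader retries set explicitly
  pipeline_config:
    max_tries: 5              # ...pipeline retries set explicitly
    try_timeout: 30s
    retry_delay: 2s
    max_retry_delay: 30s
```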
   397  
   398  #### OpenStack Swift
   399  
Thanos uses the [ncw/swift](https://github.com/ncw/swift) client to upload Prometheus data into [OpenStack Swift](https://docs.openstack.org/swift/latest/).

Below is an example configuration file for Thanos to use an OpenStack Swift container as an object store. Note that if the `name` of a user, project or tenant is used, one must also specify its domain by ID or name. Various examples for OpenStack authentication can be found in the [official documentation](https://developer.openstack.org/api-ref/identity/v3/index.html?expanded=password-authentication-with-scoped-authorization-detail#password-authentication-with-unscoped-authorization).

By default, OpenStack Swift limits the maximum file size to 5 GiB. Thanos index files are often larger than that. To resolve this issue, Thanos uses [Static Large Objects (SLO)](https://docs.openstack.org/swift/latest/overview_large_objects.html), which are uploaded as segments. By default, these are put into the `segments` directory of the same container. The default threshold for using SLO is 1 GiB, which is also the maximum size of a segment. If you don't want to use the same container for the segments (best practice is to use `<container_name>_segments` to avoid polluting the listing of the container objects), you can use the `large_object_segments_container_name` option to override the default and put the segments into another container. *In rare cases you can switch to [Dynamic Large Objects (DLO)](https://docs.openstack.org/swift/latest/overview_large_objects.html) by setting `use_dynamic_large_objects` to true, but use it with caution since it relies even more on eventual consistency.*
   405  
   406  ```yaml mdox-exec="go run scripts/cfggen/main.go --name=swift.Config"
   407  type: SWIFT
   408  config:
   409    auth_version: 0
   410    auth_url: ""
   411    username: ""
   412    user_domain_name: ""
   413    user_domain_id: ""
   414    user_id: ""
   415    password: ""
   416    domain_id: ""
   417    domain_name: ""
   418    application_credential_id: ""
   419    application_credential_name: ""
   420    application_credential_secret: ""
   421    project_id: ""
   422    project_name: ""
   423    project_domain_id: ""
   424    project_domain_name: ""
   425    region_name: ""
   426    container_name: ""
   427    large_object_chunk_size: 1073741824
   428    large_object_segments_container_name: ""
   429    retries: 3
   430    connect_timeout: 10s
   431    timeout: 5m
   432    use_dynamic_large_objects: false
   433  prefix: ""
   434  ```
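
As a filled-in sketch of the points above, the configuration below authenticates by user and project name (so both domains are specified) and stores segments in a separate `<container_name>_segments` container. The auth URL, names and container are hypothetical placeholders.

```yaml
type: SWIFT
config:
  auth_url: "https://<keystone-host>/v3"          # hypothetical identity endpoint
  username: "<user>"
  user_domain_name: "<user-domain>"               # needed because the user is given by name
  password: "<password>"
  project_name: "<project>"
  project_domain_name: "<project-domain>"         # needed because the project is given by name
  region_name: "<region>"
  container_name: "thanos"
  large_object_segments_container_name: "thanos_segments"
```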
   435  
   436  #### Tencent COS
   437  
To use Tencent COS as the object store, you first need a Tencent account to create an object storage bucket. Details can be found in the Tencent Cloud documentation: [https://cloud.tencent.com/document/product/436](https://cloud.tencent.com/document/product/436)

To configure Thanos to use COS as the object store, set the following parameters in a YAML file:
   441  
   442  ```yaml mdox-exec="go run scripts/cfggen/main.go --name=cos.Config"
   443  type: COS
   444  config:
   445    bucket: ""
   446    region: ""
   447    app_id: ""
   448    endpoint: ""
   449    secret_key: ""
   450    secret_id: ""
   451    http_config:
   452      idle_conn_timeout: 1m30s
   453      response_header_timeout: 2m
   454      insecure_skip_verify: false
   455      tls_handshake_timeout: 10s
   456      expect_continue_timeout: 1s
   457      max_idle_conns: 100
   458      max_idle_conns_per_host: 100
   459      max_conns_per_host: 0
   460      tls_config:
   461        ca_file: ""
   462        cert_file: ""
   463        key_file: ""
   464        server_name: ""
   465        insecure_skip_verify: false
   466      disable_compression: false
   467  prefix: ""
   468  ```
   469  
The `secret_key` and `secret_id` fields are required. The `http_config` field is optional and can be used to tune the HTTP transport settings. There are two ways to configure the required bucket information:

1. Provide the values of the `bucket`, `region` and `app_id` keys.
2. Provide the value of the `endpoint` key in URL format when you want to specify a VPC internal endpoint. Please refer to the [endpoint documentation](https://intl.cloud.tencent.com/document/product/436/6224) for more details.
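
The two options can be sketched as two alternative files, written here as two YAML documents. All values are placeholders; the exact endpoint URL format is described in the Tencent endpoint documentation.

```yaml
# Variant 1: bucket, region and app_id
type: COS
config:
  bucket: "<bucket>"
  region: "<region>"
  app_id: "<app-id>"
  secret_id: "<secret-id>"
  secret_key: "<secret-key>"
---
# Variant 2: explicit endpoint URL (for example a VPC internal endpoint)
type: COS
config:
  endpoint: "<endpoint-url>"
  secret_id: "<secret-id>"
  secret_key: "<secret-key>"
```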
   473  
Set the `--objstore.config-file` flag to reference the configuration file.
   475  
   476  #### AliYun OSS
   477  
In order to use AliYun OSS object storage, you should first create a bucket with the proper storage class and ACLs, and get the access key on the AliYun cloud. Go to [https://www.alibabacloud.com/product/oss](https://www.alibabacloud.com/product/oss) for more details.

To use AliYun OSS object storage, please specify the following YAML configuration in the `--objstore.config*` flag.
   481  
   482  ```yaml mdox-exec="go run scripts/cfggen/main.go --name=oss.Config"
   483  type: ALIYUNOSS
   484  config:
   485    endpoint: ""
   486    bucket: ""
   487    access_key_id: ""
   488    access_key_secret: ""
   489  prefix: ""
   490  ```
   491  
Use `--objstore.config-file` to reference this configuration file.
   493  
   494  #### Baidu BOS
   495  
In order to use Baidu BOS object storage, you should first apply for a Baidu account and create an object storage bucket. Refer to the [Baidu Cloud Documents](https://cloud.baidu.com/doc/BOS/index.html) for more details. To use Baidu BOS object storage, please specify the following YAML configuration in the `--objstore.config*` flag.
   497  
   498  ```yaml mdox-exec="go run scripts/cfggen/main.go --name=bos.Config"
   499  type: BOS
   500  config:
   501    bucket: ""
   502    endpoint: ""
   503    access_key: ""
   504    secret_key: ""
   505  prefix: ""
   506  ```
   507  
   508  #### Filesystem
   509  
This storage type is used when the user wants to store and access the bucket in the local filesystem. We treat the filesystem the same way we would treat object storage, so all optimizations for remote buckets apply even though the files might be local.

NOTE: This storage type is experimental and might be inefficient. It is NOT advised to use it as the main storage for metrics in a production environment. In particular, there is no planned support for distributed filesystems like NFS. This is mainly useful for testing and demos.
   513  
   514  ```yaml mdox-exec="go run scripts/cfggen/main.go --name=filesystem.Config"
   515  type: FILESYSTEM
   516  config:
   517    directory: ""
   518  prefix: ""
   519  ```
   520  
   521  #### Oracle Cloud Infrastructure Object Storage
   522  
   523  To configure Oracle Cloud Infrastructure (OCI) Object Storage as Thanos Object Store, you need to provide appropriate authentication credentials to your OCI tenancy. The OCI object storage client implementation for Thanos supports either the default keypair or instance principal authentication.
   524  
   525  ##### API Signing Key
   526  
The default API signing key authentication provider leverages the same [configuration as the OCI CLI](https://docs.oracle.com/en-us/iaas/Content/API/Concepts/cliconcepts.htm), which is usually stored at `$HOME/.oci/config` or supplied via variables whose names start with the string `OCI_CLI`. If the same configuration is found in multiple places, the provider will prefer the first one.
   528  
   529  The following example configures the provider to look for an existing API signing key for authentication:
   530  
```yaml
type: OCI
config:
  provider: "default"
  bucket: ""
  compartment_ocid: ""
  part_size: ""                   # Optional part size to override the OCI default of 128 MiB, value is in bytes.
  max_request_retries: ""         # Optional maximum number of retries for a request.
  request_retry_interval: ""      # Optional sleep duration in seconds between retry requests.
  http_config:
    idle_conn_timeout: 1m30s      # Optional maximum amount of time an idle (keep-alive) connection will remain idle before closing itself. Zero means no limit.
    response_header_timeout: 2m   # Optional amount of time to wait for a server's response headers after fully writing the request.
    tls_handshake_timeout: 10s    # Optional maximum amount of time to wait for a TLS handshake. Zero means no timeout.
    expect_continue_timeout: 1s   # Optional amount of time to wait for a server's first response headers. Zero means no timeout and causes the body to be sent immediately.
    insecure_skip_verify: false   # Optional. If true, crypto/tls accepts any certificate presented by the server and any host name in that certificate.
    max_idle_conns: 100           # Optional maximum number of idle (keep-alive) connections across all hosts. Zero means no limit.
    max_idle_conns_per_host: 100  # Optional maximum idle (keep-alive) connections to keep per-host. If zero, DefaultMaxIdleConnsPerHost=2 is used.
    max_conns_per_host: 0         # Optional maximum total number of connections per host.
    disable_compression: false    # Optional. If true, prevents the Transport from requesting compression.
    client_timeout: 90s           # Optional time limit for requests made by the HTTP Client.
```
   552  
   553  ##### Instance Principal Provider
   554  
For example:
   556  
   557  ```yaml
   558  type: OCI
   559  config:
   560    provider: "instance-principal"
   561    bucket: ""
   562    compartment_ocid: ""
   563  ```
   564  
You can also include any of the optional configuration keys shown in the default provider example under [API Signing Key](#api-signing-key).
   566  
   567  ##### Raw Provider
   568  
For example:

```yaml
type: OCI
config:
  provider: "raw"
  bucket: ""
  compartment_ocid: ""
  tenancy_ocid: ""
  user_ocid: ""
  region: ""
  fingerprint: ""
  privatekey: ""
  passphrase: ""         # Optional passphrase to encrypt the private API Signing key
```

You can also include any of the optional configuration keys shown in the default provider example under [API Signing Key](#api-signing-key).
   586  
   587  ##### OCI Policies
   588  
Regardless of the authentication method you use (raw, instance-principal), you need the following two policies for Thanos (sidecar or receive) to be able to write TSDB blocks to OCI object storage. The difference lies in whom you grant the permissions to.

When using instance-principal and a dynamic group:
   592  
   593  ```
   594  Allow dynamic-group thanos to read buckets in compartment id ocid1.compartment.oc1..a
   595  Allow dynamic-group thanos to manage objects in compartment id ocid1.compartment.oc1..a
   596  ```
   597  
When using the raw provider and an IAM group:
   599  
   600  ```
   601  Allow group thanos to read buckets in compartment id ocid1.compartment.oc1..a
   602  Allow group thanos to manage objects in compartment id ocid1.compartment.oc1..a
   603  ```
   604  
   605  ### How to add a new client to Thanos?
   606  
   609  The following checklist describes how to add a new Go client to the supported providers:
   610  
   611  1. Create a new directory under `pkg/objstore/<provider>`.
   612  2. Implement the [objstore.Bucket interface](https://github.com/thanos-io/objstore/blob/main/objstore.go).
   613  3. Add a `NewTestBucket` constructor for testing purposes that creates and deletes a temporary bucket.
   614  4. Use the created `NewTestBucket` in the [ForeachStore method](https://github.com/thanos-io/objstore/blob/main/objtesting/foreach.go) to ensure we can run tests against the new provider. (In PR)
   615  5. Run the [TestObjStoreAcceptanceTest](https://github.com/thanos-io/objstore/blob/main/objtesting/acceptance_e2e_test.go) against your provider to ensure it fits. Fix any errors found until the test passes. (In PR)
   616  6. Add the client implementation to the [factory](https://github.com/thanos-io/objstore/blob/main/client/factory.go) code. (Use as few flags as possible in every command.)
   617  7. Add the client config struct to [bucketcfggen](../scripts/cfggen/main.go) to allow config auto-generation.
   618  
   619  At that point, anyone can use your provider by specifying it in the object storage configuration.
   620  
   621  Check the checklist in [thanos-io/objstore](https://github.com/thanos-io/objstore#how-to-add-a-new-client-to-thanos) for more comprehensive information!
   622  
   623  ## Data in Object Storage
   624  
   625  Thanos supports writing and reading data in native Prometheus `TSDB blocks` in [TSDB format](https://github.com/prometheus/prometheus/tree/master/tsdb/docs/format). This is the format used by the [Prometheus](https://prometheus.io) TSDB for persisting data on the local disk. With its efficient index and [chunk](design.md#chunk) binary formats, it is also well suited to being read directly from object storage using the range GET API.
   626  
   627  The following sections explain this format in detail, along with the additional files and entries that the Thanos system supports.
   628  
   629  ### TSDB Block
   630  
   631  Official docs for the Prometheus TSDB format can be found [here](https://github.com/prometheus/prometheus/tree/master/tsdb/docs/format), but this section lists the most important elements.
   632  
   633  A TSDB block is a set of blobs (files) in a single directory (or `prefix` in object storage terms) named with a [ULID](https://github.com/ulid/spec), e.g. `01ARZ3NDEKTSV4RRFFQ69G5FAV`.
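
Per the ULID spec, the first 10 characters of a ULID encode a 48-bit Unix timestamp in milliseconds using Crockford's Base32, which is why block directories sort by creation time. The sketch below decodes that timestamp from a block name; the helper name `ulid_timestamp_ms` is hypothetical and not part of Thanos.

```python
# Crockford's Base32 alphabet, as used by ULIDs (no I, L, O, U).
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def ulid_timestamp_ms(ulid: str) -> int:
    """Return the millisecond Unix timestamp encoded in a ULID."""
    if len(ulid) != 26:
        raise ValueError("a ULID has exactly 26 characters")
    ts = 0
    for ch in ulid[:10].upper():
        ts = ts * 32 + CROCKFORD.index(ch)
    return ts


# The example ULID used in the text above.
created_ms = ulid_timestamp_ms("01ARZ3NDEKTSV4RRFFQ69G5FAV")
print(created_ms)
```

The remaining 16 characters are random, so only the first 10 carry time information.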
   634  
   635  **Those files contain series (labels with compressed samples) for a particular time duration (e.g. 2h) from a particular `Source` (e.g. Prometheus or Thanos Receive).**
   636  
   637  In the Thanos system, all files are **strictly immutable**. (NOTE: In Prometheus too, but with some caveats like tombstones.) This means that any modification like `rewrite`, `deletion` or `compaction` has to be done by creating a new block and removing (with delay!) the old one.
   638  
   639  > NOTE: Any other unknown file present in this directory is ignored when reading the data. However, such files can be removed when the block is deleted from object storage/disk.
   640  
   641  Example block file structure (on the local filesystem) can look like this:
   642  
   643  ```
   644  01DN3SK96XDAEKRB1AN30AAW6E:
   645  total 2209344
   646  drwxr-xr-x 2 bwplotka bwplotka       4096 Dec 10  2019 chunks
   647  -rw-r--r-- 1 bwplotka bwplotka 1962383742 Dec 10  2019 index
   648  -rw-r--r-- 1 bwplotka bwplotka       6761 Dec 10  2019 meta.json
   649  -rw-r--r-- 1 bwplotka bwplotka        111 Dec 10  2019 deletion-mark.json    # <-- Optional marker.
   650  -rw-r--r-- 1 bwplotka bwplotka        124 Dec 10  2019 no-compact-mark.json  # <-- Optional marker.
   651  
   652  01DN3SK96XDAEKRB1AN30AAW6E/chunks:
   653  total 8202452
   654  -rw-r--r-- 1 bwplotka bwplotka 536870490 Dec 10  2019 000001
   655  -rw-r--r-- 1 bwplotka bwplotka 536869843 Dec 10  2019 000002
   656  -rw-r--r-- 1 bwplotka bwplotka 536869848 Dec 10  2019 000003
   657  -rw-r--r-- 1 bwplotka bwplotka 536868209 Dec 10  2019 000004
   658  -rw-r--r-- 1 bwplotka bwplotka 536869517 Dec 10  2019 000005
   659  -rw-r--r-- 1 bwplotka bwplotka 536870654 Dec 10  2019 000006
   660  -rw-r--r-- 1 bwplotka bwplotka 536855168 Dec 10  2019 000007
   661  -rw-r--r-- 1 bwplotka bwplotka 536859441 Dec 10  2019 000008
   662  -rw-r--r-- 1 bwplotka bwplotka 536862863 Dec 10  2019 000009
   663  -rw-r--r-- 1 bwplotka bwplotka 536868432 Dec 10  2019 000010
   664  -rw-r--r-- 1 bwplotka bwplotka 536861395 Dec 10  2019 000011
   665  -rw-r--r-- 1 bwplotka bwplotka 536870859 Dec 10  2019 000012
   666  -rw-r--r-- 1 bwplotka bwplotka 536854971 Dec 10  2019 000013
   667  -rw-r--r-- 1 bwplotka bwplotka 536846973 Dec 10  2019 000014
   668  -rw-r--r-- 1 bwplotka bwplotka 536866732 Dec 10  2019 000015
   669  -rw-r--r-- 1 bwplotka bwplotka 346266827 Dec 10  2019 000016
   670  ```
   671  
   672  Let's look at each file one by one.
   673  
   674  #### Metadata file (meta.json)
   675  
   676  > NOTE: Currently supported meta.json version: v1. Currently supported meta.json Thanos section version: v1.
   677  
   678  This file is an important entry that describes the block and its data.
   679  
   680  This file allows you to find, for example:
   681  
   682  * The block ID (`ulid`)
   683  * Duration of the block (`minTime` and `maxTime`)
   684  * Important statistics (`stats.numSeries`)
   685  * How many times the block was re-compacted (`compaction.level`)
   686  * Which initial smaller block IDs are part of this block (`compaction.sources`)
   687  * Which smaller (including intermediate) block IDs are part of this block (`compaction.parents`)
   688  * Thanos section (only visible for blocks generated by Thanos components like `sidecar`, `receive` or `compact`):
   689    * External labels for the block (identifying producers) (`thanos.labels`)
   690    * Downsampling resolution if downsampling was done on this block (`thanos.downsample.resolution`). `0` means no downsampling.
   691    * Which component created the block (`thanos.source`)
   692    * Files that are part of this block and their sizes (`thanos.files`)
   693  
   694  > NOTE: In theory, you can modify this data manually. However, components like Compactor and Store Gateway currently cache meta.json indefinitely (sometimes on disk, if configured), so manual cache removal and a restart might be needed.
   695  
   696  Example meta.json file:
   697  
   698  ```json
   699  {
   700  	"ulid": "01DN3SK96XDAEKRB1AN30AAW6E",
   701  	"minTime": 1567641600000,
   702  	"maxTime": 1568851200000,
   703  	"stats": {
   704  		"numSamples": 5397517846,
   705  		"numSeries": 8377876,
   706  		"numChunks": 67874256
   707  	},
   708  	"compaction": {
   709  		"level": 4,
   710  		"sources": [
   711  			"01DKZNX70TQQ0R025G66ZF1V5P",
   712  			"01DKZWS55317K7JGVMCSBR68Z2",  // Trimmed items for readability.
   713  			"01DN3GH4A71RD6NYQ2VZPBQTFH"
   714  		],
   715  		"parents": [
   716  			{
   717  				"ulid": "01DM4WK3F9ZGW19W16MZJJFF6T",
   718  				"minTime": 1567641600000,
   719  				"maxTime": 1567814400000
   720  			},
   721  			{
   722  				"ulid": "01DMA1BXHK3G2KDKAPMBTVATRT",
   723  				"minTime": 1567814400000,
   724  				"maxTime": 1567987200000
   725  			},
   726  			{
   727  				"ulid": "01DMF65TY6JSTCDVTPZ094B5D6",
   728  				"minTime": 1567987200000,
   729  				"maxTime": 1568160000000
   730  			},
   731  			{
   732  				"ulid": "01DMMB0SK28FKC55RNK7ZZWS1A",
   733  				"minTime": 1568160000000,
   734  				"maxTime": 1568332800000
   735  			},
   736  			{
   737  				"ulid": "01DMSFSXNE8Y76G5KCQ2BABYFA",
   738  				"minTime": 1568332800000,
   739  				"maxTime": 1568505600000
   740  			},
   741  			{
   742  				"ulid": "01DMYMM5SW0FPJSHQQQM05FBN9",
   743  				"minTime": 1568505600000,
   744  				"maxTime": 1568678400000
   745  			},
   746  			{
   747  				"ulid": "01DN3SDE1M9W1JG7JFSM5QFP2Y",
   748  				"minTime": 1568678400000,
   749  				"maxTime": 1568851200000
   750  			}
   751  		]
   752  	},
   753  	"version": 1,
   754  	"thanos": {
   755  		"labels": {
   756  			"cluster": "eu1",
   757  			"monitor": "prometheus",
   758  			"tenant": "team-a",
   759  			"replica": "1"
   760  		},
   761  		"downsample": {
   762  			"resolution": 0
   763  		},
   764  		"source": "compactor",
   765  		"files": [
   766  			{
   767  				"rel_path": "index",
   768  				"size_bytes": 1313
   769  			}, // Trimmed items for readability.
   770  		],
   771  		"version": 1
   772  	}
   773  }
   774  ```
   775  
   776  Format in Go code can be found [here](https://github.com/thanos-io/thanos/blob/main/pkg/block/metadata/meta.go).
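
As a rough illustration of how these fields can be consumed, the following sketch parses a trimmed, comment-free version of the example above and derives the block duration and a few other properties. The `summarize` helper and the shape of its result are assumptions made for this example; they are not a Thanos API.

```python
import json
from datetime import timedelta

# Trimmed copy of the example meta.json above (comments removed so it is valid JSON).
META_JSON = """
{
  "ulid": "01DN3SK96XDAEKRB1AN30AAW6E",
  "minTime": 1567641600000,
  "maxTime": 1568851200000,
  "stats": {"numSamples": 5397517846, "numSeries": 8377876, "numChunks": 67874256},
  "compaction": {"level": 4, "sources": ["01DKZNX70TQQ0R025G66ZF1V5P"]},
  "version": 1,
  "thanos": {
    "labels": {"cluster": "eu1", "replica": "1"},
    "downsample": {"resolution": 0},
    "source": "compactor"
  }
}
"""


def summarize(meta):
    """Collect a few commonly inspected properties of a block."""
    thanos = meta.get("thanos", {})  # absent for blocks not produced by Thanos
    return {
        "ulid": meta["ulid"],
        # minTime and maxTime are Unix timestamps in milliseconds.
        "duration": timedelta(milliseconds=meta["maxTime"] - meta["minTime"]),
        "level": meta["compaction"]["level"],
        "downsampled": thanos.get("downsample", {}).get("resolution", 0) != 0,
        "labels": thanos.get("labels", {}),
    }


summary = summarize(json.loads(META_JSON))
print(summary["ulid"], summary["duration"], summary["level"])
```

For the example block, the time range spans exactly 14 days.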
   777  
   778  ##### External Labels
   779  
   780  External labels are extremely important block metadata. They are stored in the `thanos.labels` section of `meta.json` and identify the producer and owner of those blocks. This information is used by different Thanos components:
   781  
   782  * Those labels are visible when data is queried. You can aggregate across them in PromQL etc.
   783  * [Querier](components/query.md) uses them to filter which Store APIs to touch during query requests.
   784  * Many object storage readers like [compactor](components/compact.md) and [store gateway](components/store.md) group blocks by external labels. This grouping allows horizontal scalability through sharding or concurrency.
   785  * Some of those labels can be chosen as **replication** labels. Querier and Compactor will then deduplicate blocks that belong to the same HA group.
   786  * Some of those labels can be chosen as **tenancy** labels. This enables read, write and storage isolation.
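
To make the grouping and replication points more concrete, here is a simplified sketch that groups blocks by their external labels, optionally ignoring the chosen replica labels. It is an illustration only and not how Thanos components actually implement grouping; the block IDs and the helper names `group_key` and `group_blocks` are made up for the example.

```python
from collections import defaultdict


def group_key(labels, replica_labels=()):
    """Build a hashable key from external labels, ignoring replica labels."""
    return tuple(sorted((k, v) for k, v in labels.items() if k not in replica_labels))


def group_blocks(blocks, replica_labels=()):
    """Map each distinct external label set to the IDs of its blocks."""
    groups = defaultdict(list)
    for block in blocks:
        groups[group_key(block["labels"], replica_labels)].append(block["id"])
    return dict(groups)


# Hypothetical blocks: two Prometheus replicas in eu1 and one in us1.
blocks = [
    {"id": "block-a", "labels": {"cluster": "eu1", "replica": "0"}},
    {"id": "block-b", "labels": {"cluster": "eu1", "replica": "1"}},
    {"id": "block-c", "labels": {"cluster": "us1", "replica": "0"}},
]

by_producer = group_blocks(blocks)
by_ha_group = group_blocks(blocks, replica_labels=("replica",))
print(len(by_producer), len(by_ha_group))
```

With `replica` treated as a replication label, the two eu1 blocks fall into the same HA group and become candidates for deduplication.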
   787  
   788  The `meta.json` file and its `thanos.labels` are filled in during block upload/creation. For example:
   789  
   790  * Each TSDB block produced by Prometheus is labelled with the Prometheus [external labels](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#configuration-file) by `sidecar` before upload to object storage.
   791  * Each TSDB block produced by `compact` is labelled with whatever labels the source blocks had. The exception is the deduplication process, which removes the chosen replica label(s).
   792  * Each TSDB block produced by `receive` is labelled with the labels given in the repeated [receive](components/receive.md) `--labels` flag.
   793  
   794  The recommended information that should be given in those labels:
   795  
   796  Examples of useful Prometheus external labels:
   797  
   798  * Replication information, e.g. `replica="0"`
   799  * Cluster, environment, zone (i.e. the target origin), e.g. `cluster="eu-1-production"` or `cluster="1",env="production",region="us-west1"`
   800  * Tenancy information, e.g. `tenant="organizationABC"`
   801  
   802  > NOTE: Be careful with receive external labels. Remote Write clients can stream any labels. If an incoming label has the same name as an external label of receive, it will be masked by what the receiver has specified. This is why it's recommended to add a `receive_` prefix to all receive labels (e.g. to avoid confusion with Prometheus replicas).
   803  
   804  Examples of useful Receive external labels:
   805  
   806  * Replication information, e.g. `receive_replica="0"` (to avoid confusion with the commonly used Prometheus `replica` label).
   807  * Cluster, environment, zone (i.e. the target origin), e.g. `receive_cluster="eu-west1-production-1"` or `receive_cluster="1",receive_env="production",receive_region="us-west1"`
   808  * Tenancy information, e.g. `tenant="organizationABC"`
   809  
   810  #### Index Format (index)
   811  
   812  > NOTE: Currently supported index file versions: v1 and v2
   813  
   814  This file stores the index created to allow efficient lookup of series and their samples.
   815  
   816  **All entries are sorted lexicographically unless stated otherwise.**
   817  
   818  At a high level, it allows you to find:
   819  
   820  * Label names
   821  * Label values for a label name
   822  * All series labels
   823  * Chunk references for given (or all) series. These can be used to find the [chunk](design.md#chunk) with samples in the [chunk files](#chunks-file-format)
   824  
   825  The following describes the format of the `index` file found in each block directory. It is terminated by a table of contents which serves as an entry point into the index.
   826  
   827  ```
   828  ┌────────────────────────────┬─────────────────────┐
   829  │ magic(0xBAAAD700) <4b>     │ version(1) <1 byte> │
   830  ├────────────────────────────┴─────────────────────┤
   831  │ ┌──────────────────────────────────────────────┐ │
   832  │ │                 Symbol Table                 │ │
   833  │ ├──────────────────────────────────────────────┤ │
   834  │ │                    Series                    │ │
   835  │ ├──────────────────────────────────────────────┤ │
   836  │ │                 Label Index 1                │ │
   837  │ ├──────────────────────────────────────────────┤ │
   838  │ │                      ...                     │ │
   839  │ ├──────────────────────────────────────────────┤ │
   840  │ │                 Label Index N                │ │
   841  │ ├──────────────────────────────────────────────┤ │
   842  │ │                   Postings 1                 │ │
   843  │ ├──────────────────────────────────────────────┤ │
   844  │ │                      ...                     │ │
   845  │ ├──────────────────────────────────────────────┤ │
   846  │ │                   Postings N                 │ │
   847  │ ├──────────────────────────────────────────────┤ │
   848  │ │               Label Offset Table             │ │
   849  │ ├──────────────────────────────────────────────┤ │
   850  │ │             Postings Offset Table            │ │
   851  │ ├──────────────────────────────────────────────┤ │
   852  │ │                      TOC                     │ │
   853  │ └──────────────────────────────────────────────┘ │
   854  └──────────────────────────────────────────────────┘
   855  ```
   856  
   857  When the index is written, an arbitrary number of padding bytes may be added between the main sections outlined above. When sequentially scanning through the file, any zero bytes after a section's specified length must be skipped.
   858  
   859  Most of the sections described below start with a `len` field. It always specifies the number of bytes just before the trailing CRC32 checksum. The checksum is always calculated over those `len` bytes.
   860  
   861  ##### Symbol Table
   862  
   863  The symbol table holds a sorted list of deduplicated strings that occurred in label pairs of the stored series. They can be referenced from subsequent sections and significantly reduce the total index size.
   864  
   865  The section contains a sequence of the string entries, each prefixed with the string's length in raw bytes. All strings are utf-8 encoded. Strings are referenced by sequential indexing. The strings are sorted in lexicographically ascending order.
   866  
   867  ```
   868  ┌────────────────────┬─────────────────────┐
   869  │ len <4b>           │ #symbols <4b>       │
   870  ├────────────────────┴─────────────────────┤
   871  │ ┌──────────────────────┬───────────────┐ │
   872  │ │ len(str_1) <uvarint> │ str_1 <bytes> │ │
   873  │ ├──────────────────────┴───────────────┤ │
   874  │ │                . . .                 │ │
   875  │ ├──────────────────────┬───────────────┤ │
   876  │ │ len(str_n) <uvarint> │ str_n <bytes> │ │
   877  │ └──────────────────────┴───────────────┘ │
   878  ├──────────────────────────────────────────┤
   879  │ CRC32 <4b>                               │
   880  └──────────────────────────────────────────┘
   881  ```
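
The layout above can be read with a few lines of code. The following is a minimal sketch, not the Prometheus or Thanos reader: it assumes the fixed-width fields are big-endian and that `len` counts the `#symbols` field plus all entries, and it skips checksum verification because the CRC32 variant is not specified in this document. The sample section is hand-built for illustration.

```python
import struct


def read_uvarint(buf, pos):
    """Decode one unsigned varint (7 bits per byte, low bits first)."""
    value, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte < 0x80:
            return value, pos
        shift += 7


def parse_symbol_table(buf, offset=0):
    """Return the list of symbols stored in the section starting at `offset`."""
    length, count = struct.unpack_from(">II", buf, offset)
    pos = offset + 8           # skip `len` and `#symbols`
    end = offset + 4 + length  # assumed: `len` covers everything before the CRC32
    symbols = []
    for _ in range(count):
        size, pos = read_uvarint(buf, pos)
        symbols.append(buf[pos:pos + size].decode("utf-8"))
        pos += size
    if pos != end:
        raise ValueError("symbol entries do not match the section length")
    return symbols


# Hand-made section with three sorted symbols and a placeholder checksum.
names = ["__name__", "job", "up"]
body = struct.pack(">I", len(names)) + b"".join(
    bytes([len(s)]) + s.encode("utf-8") for s in names
)
section = struct.pack(">I", len(body)) + body + bytes(4)
symbols = parse_symbol_table(section)
print(symbols)
```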
   882  
   883  ##### Series
   884  
   885  The section contains a sequence of series that hold the label set of the series as well as its [chunks](design.md#chunk) within the block. The series are sorted lexicographically by their label sets. Each series section is aligned to 16 bytes. The ID for a series is the `offset/16`. This serves as the series' ID in all subsequent references. Thereby, a sorted list of series IDs implies a lexicographically sorted list of series label sets.
   886  
   887  ```
   888  ┌───────────────────────────────────────┐
   889  │ ┌───────────────────────────────────┐ │
   890  │ │   series_1                        │ │
   891  │ ├───────────────────────────────────┤ │
   892  │ │                 . . .             │ │
   893  │ ├───────────────────────────────────┤ │
   894  │ │   series_n                        │ │
   895  │ └───────────────────────────────────┘ │
   896  └───────────────────────────────────────┘
   897  ```
   898  
   899  Every series entry first holds its number of labels, followed by tuples of symbol table references that contain the label name and value. The label pairs are lexicographically sorted. After the labels, the number of indexed [chunks](design.md#chunk) is encoded, followed by a sequence of metadata entries containing the chunk's minimum (`mint`) and maximum (`maxt`) timestamp and a reference to its position in the chunk file. The `mint` is the time of the first sample and `maxt` is the time of the last sample in the chunk. Holding the time range data in the index allows dropping chunks irrelevant to queried time ranges without accessing them directly.
   900  
   901  The `mint` of the first [chunk](design.md#chunk) is stored as is, its `maxt` is stored as a delta to that `mint`, and for subsequent chunks `mint` and `maxt` are encoded as deltas to the previous time. Similarly, the reference of the first chunk is stored as is and each following reference is stored as a delta to the previous one.
   902  
   903  ```
   904  ┌──────────────────────────────────────────────────────────────────────────┐
   905  │ len <uvarint>                                                            │
   906  ├──────────────────────────────────────────────────────────────────────────┤
   907  │ ┌──────────────────────────────────────────────────────────────────────┐ │
   908  │ │                     labels count <uvarint64>                         │ │
   909  │ ├──────────────────────────────────────────────────────────────────────┤ │
   910  │ │              ┌────────────────────────────────────────────┐          │ │
   911  │ │              │ ref(l_i.name) <uvarint32>                  │          │ │
   912  │ │              ├────────────────────────────────────────────┤          │ │
   913  │ │              │ ref(l_i.value) <uvarint32>                 │          │ │
   914  │ │              └────────────────────────────────────────────┘          │ │
   915  │ │                             ...                                      │ │
   916  │ ├──────────────────────────────────────────────────────────────────────┤ │
   917  │ │                     chunks count <uvarint64>                         │ │
   918  │ ├──────────────────────────────────────────────────────────────────────┤ │
   919  │ │              ┌────────────────────────────────────────────┐          │ │
   920  │ │              │ c_0.mint <varint64>                        │          │ │
   921  │ │              ├────────────────────────────────────────────┤          │ │
   922  │ │              │ c_0.maxt - c_0.mint <uvarint64>            │          │ │
   923  │ │              ├────────────────────────────────────────────┤          │ │
   924  │ │              │ ref(c_0.data) <uvarint64>                  │          │ │
   925  │ │              └────────────────────────────────────────────┘          │ │
   926  │ │              ┌────────────────────────────────────────────┐          │ │
   927  │ │              │ c_i.mint - c_i-1.maxt <uvarint64>          │          │ │
   928  │ │              ├────────────────────────────────────────────┤          │ │
   929  │ │              │ c_i.maxt - c_i.mint <uvarint64>            │          │ │
   930  │ │              ├────────────────────────────────────────────┤          │ │
   931  │ │              │ ref(c_i.data) - ref(c_i-1.data) <varint64> │          │ │
   932  │ │              └────────────────────────────────────────────┘          │ │
   933  │ │                             ...                                      │ │
   934  │ └──────────────────────────────────────────────────────────────────────┘ │
   935  ├──────────────────────────────────────────────────────────────────────────┤
   936  │ CRC32 <4b>                                                               │
   937  └──────────────────────────────────────────────────────────────────────────┘
   938  ```
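
The delta encoding is easier to follow with a small decoder. The sketch below decodes the bytes between `len` and `CRC32` of one series entry, following the diagram above. It assumes Go-style varints (unsigned LEB128 for `uvarint`, zigzag for `varint`); the function names and the sample bytes are made up for the example.

```python
def read_uvarint(buf, pos):
    """Decode one unsigned varint (7 bits per byte, low bits first)."""
    value, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte < 0x80:
            return value, pos
        shift += 7


def read_varint(buf, pos):
    """Decode one signed (zigzag-encoded) varint."""
    raw, pos = read_uvarint(buf, pos)
    return (raw >> 1) ^ -(raw & 1), pos


def decode_series_body(body):
    """Return (label refs, chunk metas) for one series entry body."""
    pos = 0
    n_labels, pos = read_uvarint(body, pos)
    labels = []
    for _ in range(n_labels):
        name_ref, pos = read_uvarint(body, pos)
        value_ref, pos = read_uvarint(body, pos)
        labels.append((name_ref, value_ref))

    n_chunks, pos = read_uvarint(body, pos)
    chunks = []
    if n_chunks > 0:
        # First chunk: absolute mint, maxt as delta to mint, absolute reference.
        mint, pos = read_varint(body, pos)
        span, pos = read_uvarint(body, pos)
        ref, pos = read_uvarint(body, pos)
        maxt = mint + span
        chunks.append((mint, maxt, ref))
    for _ in range(n_chunks - 1):
        # Later chunks: mint relative to previous maxt, ref relative to previous ref.
        gap, pos = read_uvarint(body, pos)
        span, pos = read_uvarint(body, pos)
        ref_delta, pos = read_varint(body, pos)
        mint = maxt + gap
        maxt = mint + span
        ref += ref_delta
        chunks.append((mint, maxt, ref))
    return labels, chunks


# One label pair (symbol refs 0 and 2) and two chunks, encoded by hand:
# mint=1000 (zigzag 2000 -> 0xD0 0x0F), span=100, ref=8, then gap=10, span=50, ref delta=+40.
sample = bytes([1, 0, 2, 2, 0xD0, 0x0F, 100, 8, 10, 50, 80])
labels, chunks = decode_series_body(sample)
print(labels, chunks)
```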
   939  
   940  ##### Label Index
   941  
   942  A label index section indexes the existing (combined) values for one or more label names. The `#names` field determines the number of indexed label names, followed by the total number of entries in the `#entries` field. The body holds #entries / #names tuples of symbol table references, each tuple being of `#names` length. The value tuples are sorted in lexicographically increasing order. This is no longer used.
   943  
   944  ```
   945  ┌───────────────┬────────────────┬────────────────┐
   946  │ len <4b>      │ #names <4b>    │ #entries <4b>  │
   947  ├───────────────┴────────────────┴────────────────┤
   948  │ ┌─────────────────────────────────────────────┐ │
   949  │ │ ref(value_0) <4b>                           │ │
   950  │ ├─────────────────────────────────────────────┤ │
   951  │ │ ...                                         │ │
   952  │ ├─────────────────────────────────────────────┤ │
   953  │ │ ref(value_n) <4b>                           │ │
   954  │ └─────────────────────────────────────────────┘ │
   955  │                      . . .                      │
   956  ├─────────────────────────────────────────────────┤
   957  │ CRC32 <4b>                                      │
   958  └─────────────────────────────────────────────────┘
   959  ```
   960  
   961  For instance, a single label name with 4 different values will be encoded as:
   962  
   963  ```
   964  ┌────┬───┬───┬──────────────┬──────────────┬──────────────┬──────────────┬───────┐
   965  │ 24 │ 1 │ 4 │ ref(value_0) │ ref(value_1) │ ref(value_2) │ ref(value_3) │ CRC32 │
   966  └────┴───┴───┴──────────────┴──────────────┴──────────────┴──────────────┴───────┘
   967  ```
   968  
   969  The sequence of label index sections is finalized by a [label offset table](#label-offset-table) containing label offset entries that point to the beginning of each label index section for a given label name.
   970  
   971  ##### Postings
   972  
   973  Postings sections store monotonically increasing lists of series references that contain a given label pair associated with the list.
   974  
   975  ```
   976  ┌────────────────────┬────────────────────┐
   977  │ len <4b>           │ #entries <4b>      │
   978  ├────────────────────┴────────────────────┤
   979  │ ┌─────────────────────────────────────┐ │
   980  │ │ ref(series_1) <4b>                  │ │
   981  │ ├─────────────────────────────────────┤ │
   982  │ │ ...                                 │ │
   983  │ ├─────────────────────────────────────┤ │
   984  │ │ ref(series_n) <4b>                  │ │
   985  │ └─────────────────────────────────────┘ │
   986  ├─────────────────────────────────────────┤
   987  │ CRC32 <4b>                              │
   988  └─────────────────────────────────────────┘
   989  ```
   990  
   991  The sequence of postings sections is finalized by a [postings offset table](#postings-offset-table) containing postings offset entries that point to the beginning of each postings section for a given label pair.
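
Because each postings list is sorted, the series matching several label pairs can be found by a linear merge of the lists. The sketch below parses two hand-built postings sections and intersects them. It assumes big-endian fixed-width fields and uses a placeholder checksum; `parse_postings`, `intersect` and `make_section` are names invented for this example.

```python
import struct


def parse_postings(buf, offset=0):
    """Return the series references stored in one postings section."""
    length, count = struct.unpack_from(">II", buf, offset)
    if length != 4 + 4 * count:  # `#entries` field plus 4 bytes per reference
        raise ValueError("length does not match the entry count")
    return list(struct.unpack_from(f">{count}I", buf, offset + 8))


def intersect(a, b):
    """Intersect two increasing lists with a two-pointer merge."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def make_section(refs):
    """Build a postings section for illustration (placeholder CRC32)."""
    body = struct.pack(f">I{len(refs)}I", len(refs), *refs)
    return struct.pack(">I", len(body)) + body + bytes(4)


# Hypothetical postings for two label pairs, e.g. job="api" and env="prod".
job_api = parse_postings(make_section([2, 5, 9]))
env_prod = parse_postings(make_section([5, 9, 12]))
matching = intersect(job_api, env_prod)
print(matching)
```

Following the series section description, each resulting reference multiplied by 16 gives the offset of the series entry.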
   992  
   993  ##### Label Offset Table
   994  
   995  A label offset table stores a sequence of label offset entries. Every label offset entry holds the label name and the offset to its values in the label index section. They are used to track label index sections. This is no longer used.
   996  
   997  ```
   998  ┌─────────────────────┬──────────────────────┐
   999  │ len <4b>            │ #entries <4b>        │
  1000  ├─────────────────────┴──────────────────────┤
  1001  │ ┌────────────────────────────────────────┐ │
  1002  │ │  n = 1 <1b>                            │ │
  1003  │ ├──────────────────────┬─────────────────┤ │
  1004  │ │ len(name) <uvarint>  │ name <bytes>    │ │
  1005  │ ├──────────────────────┴─────────────────┤ │
  1006  │ │  offset <uvarint64>                    │ │
  1007  │ └────────────────────────────────────────┘ │
  1008  │                    . . .                   │
  1009  ├────────────────────────────────────────────┤
  1010  │  CRC32 <4b>                                │
  1011  └────────────────────────────────────────────┘
  1012  ```
  1013  
  1014  ##### Postings Offset Table
  1015  
  1016  A postings offset table stores a sequence of postings offset entries, sorted by label name and value. Every postings offset entry holds the label name/value pair and the offset to its series list in the postings section. They are used to track postings sections. They are partially read into memory when an index file is loaded.
  1017  
  1018  ```
  1019  ┌─────────────────────┬──────────────────────┐
  1020  │ len <4b>            │ #entries <4b>        │
  1021  ├─────────────────────┴──────────────────────┤
  1022  │ ┌────────────────────────────────────────┐ │
  1023  │ │  n = 2 <1b>                            │ │
  1024  │ ├──────────────────────┬─────────────────┤ │
  1025  │ │ len(name) <uvarint>  │ name <bytes>    │ │
  1026  │ ├──────────────────────┼─────────────────┤ │
  1027  │ │ len(value) <uvarint> │ value <bytes>   │ │
  1028  │ ├──────────────────────┴─────────────────┤ │
  1029  │ │  offset <uvarint64>                    │ │
  1030  │ └────────────────────────────────────────┘ │
  1031  │                    . . .                   │
  1032  ├────────────────────────────────────────────┤
  1033  │  CRC32 <4b>                                │
  1034  └────────────────────────────────────────────┘
  1035  ```
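
A reader can load this table into a map from label pair to postings offset. The following is a minimal sketch under the same assumptions as the diagram (big-endian fixed-width fields, Go-style uvarints) with checksum verification left out; the helper names and the sample offsets are invented for the example.

```python
import struct


def put_uvarint(n):
    """Encode a non-negative integer as an unsigned varint."""
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)


def read_uvarint(buf, pos):
    """Decode one unsigned varint (7 bits per byte, low bits first)."""
    value, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte < 0x80:
            return value, pos
        shift += 7


def read_str(buf, pos):
    """Read a uvarint length-prefixed UTF-8 string."""
    size, pos = read_uvarint(buf, pos)
    return buf[pos:pos + size].decode("utf-8"), pos + size


def parse_postings_offset_table(buf, offset=0):
    """Map (label name, label value) to the offset of its postings section."""
    _length, count = struct.unpack_from(">II", buf, offset)
    pos = offset + 8
    table = {}
    for _ in range(count):
        if buf[pos] != 2:  # every entry starts with n = 2 (name and value)
            raise ValueError("unexpected key count in postings offset entry")
        pos += 1
        name, pos = read_str(buf, pos)
        value, pos = read_str(buf, pos)
        postings_offset, pos = read_uvarint(buf, pos)
        table[(name, value)] = postings_offset
    return table


def make_entry(name, value, postings_offset):
    """Encode one entry for the hand-built sample below."""
    n, v = name.encode("utf-8"), value.encode("utf-8")
    return (bytes([2]) + put_uvarint(len(n)) + n
            + put_uvarint(len(v)) + v + put_uvarint(postings_offset))


entries = make_entry("job", "api", 1024) + make_entry("job", "db", 2048)
body = struct.pack(">I", 2) + entries
section = struct.pack(">I", len(body)) + body + bytes(4)  # placeholder CRC32
table = parse_postings_offset_table(section)
print(table)
```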
  1036  
  1037  ##### TOC
  1038  
  1039  The table of contents serves as an entry point to the entire index and points to various sections in the file. If a reference is zero, it indicates the respective section does not exist and empty results should be returned upon lookup.
  1040  
  1041  ```
  1042  ┌─────────────────────────────────────────┐
  1043  │ ref(symbols) <8b>                       │
  1044  ├─────────────────────────────────────────┤
  1045  │ ref(series) <8b>                        │
  1046  ├─────────────────────────────────────────┤
  1047  │ ref(label indices start) <8b>           │
  1048  ├─────────────────────────────────────────┤
  1049  │ ref(label offset table) <8b>            │
  1050  ├─────────────────────────────────────────┤
  1051  │ ref(postings start) <8b>                │
  1052  ├─────────────────────────────────────────┤
  1053  │ ref(postings offset table) <8b>         │
  1054  ├─────────────────────────────────────────┤
  1055  │ CRC32 <4b>                              │
  1056  └─────────────────────────────────────────┘
  1057  ```
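
Since the index file is terminated by the TOC, a reader can locate every section by reading the last 52 bytes (six 8-byte references plus the 4-byte CRC32). The sketch below assumes big-endian encoding and does not verify the checksum; the field names and the sample offsets are chosen for this example only.

```python
import struct

TOC_LEN = 6 * 8 + 4  # six 8-byte references followed by a 4-byte CRC32
TOC_FIELDS = (
    "symbols",
    "series",
    "label_indices_start",
    "label_offset_table",
    "postings_start",
    "postings_offset_table",
)


def parse_toc(index_bytes):
    """Parse the table of contents from the tail of an index file."""
    if len(index_bytes) < TOC_LEN:
        raise ValueError("index is too short to contain a TOC")
    *refs, crc = struct.unpack(">6QI", index_bytes[-TOC_LEN:])
    toc = dict(zip(TOC_FIELDS, refs))
    toc["crc32"] = crc
    return toc


# Fake index tail: some leading bytes followed by a hand-built TOC.
fake_index = bytes(10) + struct.pack(">6QI", 5, 100, 0, 0, 200, 300, 0xDEADBEEF)
toc = parse_toc(fake_index)

# A zero reference means the section does not exist.
missing = [name for name in TOC_FIELDS if toc[name] == 0]
print(toc, missing)
```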
  1058  
  1059  #### Chunks File Format
  1060  
  1061  > NOTE: Currently supported chunks file versions: v1.
  1062  
  1063  > NOTE: Don't confuse this with the `chunks format` (an XOR-encoded, Gorilla-compressed set of samples). A chunk file contains chunks of multiple series.
  1064  
  1065  The following describes the format of a chunks file, which is created in the `chunks/` directory of a block. The maximum size per segment file is 512MiB.
  1066  
  1067  [Chunks](design.md#chunk) in the files are referenced from the index by a uint64 composed of the in-file offset (lower 4 bytes) and the segment sequence number (upper 4 bytes).
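
Splitting such a reference is a matter of shifting and masking. A minimal sketch, with helper names that are hypothetical rather than taken from the Thanos code base:

```python
def make_chunk_ref(segment_seq, offset):
    """Pack a segment sequence number and an in-file offset into one uint64."""
    if not (0 <= segment_seq < 2**32 and 0 <= offset < 2**32):
        raise ValueError("both parts must fit into 4 bytes")
    return (segment_seq << 32) | offset


def split_chunk_ref(ref):
    """Return (segment sequence number, in-file offset) for a chunk reference."""
    return ref >> 32, ref & 0xFFFFFFFF


ref = make_chunk_ref(3, 8)
print(ref, split_chunk_ref(ref))
```

How the sequence number maps to a segment file name (for example whether counting starts at zero) is not specified here and should be checked against the Prometheus format docs.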
  1068  
  1069  ```
  1070  ┌──────────────────────────────┐
  1071  │  magic(0x85BD40DD) <4 byte>  │
  1072  ├──────────────────────────────┤
  1073  │    version(1) <1 byte>       │
  1074  ├──────────────────────────────┤
  1075  │    padding(0) <3 byte>       │
  1076  ├──────────────────────────────┤
  1077  │ ┌──────────────────────────┐ │
  1078  │ │         Chunk 1          │ │
  1079  │ ├──────────────────────────┤ │
  1080  │ │          ...             │ │
  1081  │ ├──────────────────────────┤ │
  1082  │ │         Chunk N          │ │
  1083  │ └──────────────────────────┘ │
  1084  └──────────────────────────────┘
  1085  ```
  1086  
  1087  ##### Chunk
  1088  
  1089  ```
  1090  ┌───────────────┬───────────────────┬──────────────┬────────────────┐
  1091  │ len <uvarint> │ encoding <1 byte> │ data <bytes> │ CRC32 <4 byte> │
  1092  └───────────────┴───────────────────┴──────────────┴────────────────┘
  1093  ```
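
Putting the header and chunk layout together, a segment file can be walked sequentially. The following sketch makes two assumptions that are not stated in this document: fixed-width fields are big-endian, and `len` counts only the `data` bytes (not the encoding byte or the checksum). Checksums are read but not verified, and the sample segment is hand-built.

```python
import struct

CHUNKS_MAGIC = 0x85BD40DD
HEADER_LEN = 8  # 4-byte magic, 1-byte version, 3 bytes of padding


def read_uvarint(buf, pos):
    """Decode one unsigned varint (7 bits per byte, low bits first)."""
    value, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte < 0x80:
            return value, pos
        shift += 7


def parse_segment(buf):
    """Return (version, [(offset, encoding, data, crc32), ...]) for one segment."""
    magic, version = struct.unpack_from(">IB", buf, 0)
    if magic != CHUNKS_MAGIC:
        raise ValueError("not a chunks segment file")
    pos = HEADER_LEN
    chunks = []
    while pos < len(buf):
        start = pos  # in-file offset, the lower half of a chunk reference
        size, pos = read_uvarint(buf, pos)
        encoding = buf[pos]
        pos += 1
        data = buf[pos:pos + size]  # assumed: `len` counts only the data bytes
        pos += size
        (crc,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        chunks.append((start, encoding, data, crc))
    return version, chunks


# Hand-built segment with two tiny chunks and placeholder checksums.
header = struct.pack(">IB3x", CHUNKS_MAGIC, 1)
chunk_a = bytes([3, 1]) + b"abc" + bytes(4)
chunk_b = bytes([2, 1]) + b"xy" + bytes(4)
version, chunks = parse_segment(header + chunk_a + chunk_b)
print(version, chunks)
```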
  1094  
  1095  #### Tombstones
  1096  
  1097  Thanos ignores any tombstone files. They are also deleted by sidecar on upload.