---
title: Storage
weight: 1010
---
# Storage

Unlike other logging systems, Grafana Loki is built around the idea of only indexing
metadata about your logs: labels (just like Prometheus labels). Log data itself
is then compressed and stored in chunks in object stores such as S3 or GCS, or
even locally on the filesystem. A small index and highly compressed chunks
simplify the operation and significantly lower the cost of Loki.

Until Loki 2.0, index data was stored in a separate index store.

Loki 2.0 introduced an index mechanism named `boltdb-shipper`, which is what we now call Single Store Loki.
This index type only requires one store, the object store, for both the index and chunks.
More detailed information can be found on the [operations page]({{< relref "../operations/storage/boltdb-shipper.md" >}}).

Some more storage details can also be found in the [operations section]({{< relref "../operations/storage/_index.md" >}}).

## Implementations - Chunks

### Cassandra

Cassandra is a popular database and one of Loki's possible chunk stores. It is production safe.

### GCS

GCS is a hosted object store offered by Google. It is a good candidate for a managed object store, especially when you're already running on GCP, and is production safe.

### File System

The file system is the simplest backend for chunks, although it is also susceptible to data loss because it is unreplicated. It is nonetheless common for single binary deployments, as well as for those trying out Loki or doing local development on the project. It is similar in concept to many Prometheus deployments where a single Prometheus is responsible for monitoring a fleet.

### S3

S3 is AWS's hosted object store. It is a good candidate for a managed object store, especially when you're already running on AWS, and is production safe.

### Notable Mentions

You may use any substitutable services, such as those that implement the S3 API like [MinIO](https://min.io/).

## Implementations - Index

### Single-Store

Single-Store was known as `boltdb-shipper` during development, and that is still the schema `store` name. The single store configurations for Loki use the chunk store for both chunks and the index, requiring just one store to run Loki.

As of 2.0, this is the recommended index storage type. Performance is comparable to a dedicated index type, while the deployment is much less expensive and less complicated.

### Cassandra

Cassandra can also be used for the index store. Aside from the [boltdb-shipper](../operations/storage/boltdb-shipper/), it's the only non-cloud offering for the index that is horizontally scalable and has configurable replication. It's a good candidate when you already run Cassandra, are running on-prem, or do not wish to use a managed cloud offering.

### BigTable

Bigtable is a cloud database offered by Google. It is a good candidate for a managed index store if you're already using it (due to its heavy fixed costs) or wish to run in GCP.
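
As a rough sketch of how a Bigtable index might be paired with GCS chunks, the fragment below assumes the `bigtable` and `gcs` blocks under `storage_config`. The `<project>`, `<instance>`, and `<bucket>` placeholders, the start date, and the weekly period are illustrative only; verify the keys against the configuration reference for your Loki version.

```yaml
storage_config:
  bigtable:
    project: <project>       # placeholder: GCP project ID
    instance: <instance>     # placeholder: Bigtable instance name
  gcs:
    bucket_name: <bucket>    # placeholder: bucket that holds the chunks

schema_config:
  configs:
    - from: 2020-07-01       # illustrative start date
      store: bigtable        # index is written to Bigtable
      object_store: gcs      # chunks are written to GCS
      schema: v11
      index:
        prefix: index_
        period: 168h
```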

### DynamoDB

DynamoDB is a cloud database offered by AWS. It is a good candidate for a managed index store, especially if you're already running in AWS.
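
A minimal sketch of a DynamoDB index paired with S3 chunks might look like the following. It assumes the `aws` store name selects DynamoDB for the index; the credentials, region, bucket names, start date, and period are placeholders, so treat this as a starting point rather than a verified production config.

```yaml
storage_config:
  aws:
    # Placeholders: supply your own credentials, region, and buckets.
    s3: s3://<access_key>:<uri-encoded-secret-access-key>@<region>
    bucketnames: <bucket1,bucket2>
    dynamodb:
      dynamodb_url: dynamodb://<access_key>:<uri-encoded-secret-access-key>@<region>

schema_config:
  configs:
    - from: 2020-07-01       # illustrative start date
      store: aws             # assumed: index is written to DynamoDB
      object_store: aws      # chunks are written to S3
      schema: v11
      index:
        prefix: index_
        period: 168h
```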

#### Rate Limiting

DynamoDB is susceptible to rate limiting, particularly when a table consumes more than its [provisioned capacity](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ReadWriteCapacityMode.html). This can be controlled via the [provisioning](#provisioning) configs in the table manager.

### BoltDB

BoltDB is an embedded database on disk. It is not replicated and thus cannot be used for high availability or clustered Loki deployments, but it is commonly paired with a `filesystem` chunk store for proof of concept deployments, trying out Loki, and development. The [boltdb-shipper](../operations/storage/boltdb-shipper/) aims to support clustered deployments using `boltdb` as an index.

### Azure Storage Account

An Azure storage account contains all of your Azure Storage data objects: blobs, file shares, queues, tables, and disks.

## Schema Configs

Loki aims to be backwards compatible, and over the course of its development it has had many internal changes that facilitate better and more efficient storage and querying. Loki allows incrementally upgrading to these new storage _schemas_ and can query across them transparently, which makes upgrading straightforward. For instance, this is what it looks like when migrating from the v10 to the v11 schema starting 2020-07-01:

```yaml
schema_config:
  configs:
    - from: 2019-07-01
      store: boltdb
      object_store: filesystem
      schema: v10
      index:
        prefix: index_
        period: 168h
    - from: 2020-07-01
      store: boltdb
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 168h
```

For all data ingested before 2020-07-01, Loki used the v10 schema and then switched after that point to the more effective v11. This dramatically simplifies upgrading, ensuring it's simple to take advantage of new storage optimizations. These configs should be immutable for as long as you care about retention.

## Table Manager

One of the subcomponents in Loki is the `table-manager`. It is responsible for pre-creating and expiring index tables. This helps partition the writes and reads in Loki across a set of distinct indices in order to prevent unbounded growth.

```yaml
table_manager:
  # The retention period must be a multiple of the index / chunks
  # table "period" (see period_config).
  retention_deletes_enabled: true
  # This is 15 weeks of retention, based on the 168h (1 week) period durations used in the rest of the examples.
  retention_period: 2520h
```

For more information, see the [table manager](../operations/storage/table-manager/) documentation.

### Provisioning

In the case of AWS DynamoDB, you'll likely want to tune the provisioned throughput for your tables as well. This prevents your tables from being rate limited on one hand and from incurring unnecessary cost on the other. By default Loki uses a [provisioned capacity](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ReadWriteCapacityMode.html) strategy for DynamoDB tables like so:

```
table_manager:
  index_tables_provisioning:
    # Read/write throughput requirements for the current table
    # (the table which would handle writes/reads for data timestamped at the current time)
    provisioned_write_throughput: <int> | default = 3000
    provisioned_read_throughput: <int> | default = 300

    # Read/write throughput requirements for non-current tables
    inactive_write_throughput: <int> | default = 1
    inactive_read_throughput: <int> | default = 300
```

Note that there are a few other DynamoDB provisioning options, including DynamoDB autoscaling and on-demand capacity. See the [provisioning configuration](../configuration/#provision_config) documentation for more information.
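
For illustration, switching the index tables to on-demand capacity might look like the fragment below. The two `enable_*` option names are taken from memory of the provisioning configuration reference and should be treated as assumptions to confirm for your Loki version.

```yaml
table_manager:
  index_tables_provisioning:
    # Assumed option: use on-demand capacity for the current (active) table
    # instead of a fixed provisioned throughput.
    enable_ondemand_throughput_mode: true
    # Assumed option: use on-demand capacity for non-current (inactive) tables.
    enable_inactive_throughput_on_demand_mode: true
```

On-demand capacity trades predictable cost for protection against throttling, so it tends to suit workloads whose write volume is hard to forecast.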

## Upgrading Schemas

When a new schema is released and you want to gain the advantages it provides, you can! Loki can transparently query and merge data from across schema boundaries, so there is no disruption of service and upgrading is easy.

First, you'll want to create a new [period_config](../configuration#period_config) entry in your [schema_config](../configuration#schema_config). The important thing to remember here is to set its `from` date at some point in the _future_ and then roll out the config file changes to Loki. This allows the table manager to create the required table in advance of writes and ensures that existing data isn't queried as if it adheres to the new schema.

As an example, let's say it's 2020-07-14 and we want to start using the `v11` schema on the 20th:

```yaml
schema_config:
  configs:
    - from: 2019-07-14
      store: boltdb
      object_store: filesystem
      schema: v10
      index:
        prefix: index_
        period: 168h
    - from: 2020-07-20
      store: boltdb
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 168h
```

It's that easy; we just created a new entry starting on the 20th.

## Retention

With the exception of the `filesystem` chunk store, Loki will not delete old chunks. This is generally handled instead by configuring TTLs (time to live) in the chunk store of your choice (bucket lifecycles in S3/GCS, and TTLs in Cassandra). Loki also does not currently delete old data when your local disk fills up while using the `filesystem` chunk store; deletion is determined only by retention duration.

We're interested in adding targeted deletion in future Loki releases (think tenant or stream level granularity) and may include other strategies as well.

For more information, see the [retention configuration](../operations/storage/retention/) documentation.

## Examples

### Single machine/local development (boltdb+filesystem)

[The repo contains a working example](https://github.com/grafana/loki/blob/master/cmd/loki/loki-local-config.yaml). You may want to check out a tag of the repo to make sure you get a compatible example.
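
For orientation, the storage-related part of such a local setup might look roughly like the sketch below. The directories and the start date are illustrative assumptions; the linked example in the repo remains the authoritative reference.

```yaml
storage_config:
  boltdb:
    directory: /tmp/loki/index     # illustrative path for the BoltDB index files
  filesystem:
    directory: /tmp/loki/chunks    # illustrative path for the chunk files

schema_config:
  configs:
    - from: 2020-07-01             # illustrative start date
      store: boltdb
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 168h
```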

### GCP deployment (GCS Single Store)

```yaml
storage_config:
  boltdb_shipper:
    active_index_directory: /loki/boltdb-shipper-active
    cache_location: /loki/boltdb-shipper-cache
    cache_ttl: 24h         # Can be increased for faster performance over longer query periods, uses more disk space
    shared_store: gcs
  gcs:
    bucket_name: <bucket>

schema_config:
  configs:
    - from: 2020-07-01
      store: boltdb-shipper
      object_store: gcs
      schema: v11
      index:
        prefix: index_
        period: 24h
```

### AWS deployment (S3 Single Store)

```yaml
storage_config:
  boltdb_shipper:
    active_index_directory: /loki/boltdb-shipper-active
    cache_location: /loki/boltdb-shipper-cache
    cache_ttl: 24h         # Can be increased for faster performance over longer query periods, uses more disk space
    shared_store: s3
  aws:
    s3: s3://<access_key>:<uri-encoded-secret-access-key>@<region>
    bucketnames: <bucket1,bucket2>

schema_config:
  configs:
    - from: 2020-07-01
      store: boltdb-shipper
      object_store: aws
      schema: v11
      index:
        prefix: index_
        period: 24h
```

If you don't wish to hard-code S3 credentials, you can instead rely on an EC2
instance role by changing the `storage_config` section:

```yaml
storage_config:
  aws:
    s3: s3://<region>
    bucketnames: <bucket1,bucket2>
    dynamodb:
      dynamodb_url: dynamodb://<region>
```

### On prem deployment (Cassandra+Cassandra)

**Keeping this for posterity, but this is likely not a common config. Cassandra should work and could be faster in some situations but is likely much more expensive.**

```yaml
storage_config:
  cassandra:
    addresses: <comma-separated-IPs-or-hostnames>
    keyspace: <keyspace>
    auth: <true|false>
    username: <username> # only applicable when auth=true
    password: <password> # only applicable when auth=true

schema_config:
  configs:
    - from: 2020-07-01
      store: cassandra
      object_store: cassandra
      schema: v11
      index:
        prefix: index_
        period: 168h
      chunks:
        prefix: chunk_
        period: 168h
```

### On prem deployment (MinIO Single Store)

We configure MinIO by using the AWS config because MinIO implements the S3 API:

```yaml
storage_config:
  aws:
    # Note: use a fully qualified domain name, like localhost.
    # full example: http://loki:supersecret@localhost.:9000
    s3: http<s>://<username>:<secret>@<fqdn>:<port>
    s3forcepathstyle: true
  boltdb_shipper:
    active_index_directory: /loki/boltdb-shipper-active
    cache_location: /loki/boltdb-shipper-cache
    cache_ttl: 24h         # Can be increased for faster performance over longer query periods, uses more disk space
    shared_store: s3

schema_config:
  configs:
    - from: 2020-07-01
      store: boltdb-shipper
      object_store: aws
      schema: v11
      index:
        prefix: index_
        period: 24h
```

### Azure Storage Account

```yaml
schema_config:
  configs:
  - from: "2020-12-11"
    index:
      period: 24h
      prefix: index_
    object_store: azure
    schema: v11
    store: boltdb-shipper
storage_config:
  azure:
    # Your Azure storage account name
    account_name: <account-name>
    # For the account-key, see docs: https://docs.microsoft.com/en-us/azure/storage/common/storage-account-keys-manage?tabs=azure-portal
    account_key: <account-key>
    # See https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction#containers
    container_name: <container-name>
    use_managed_identity: <true|false>
    # Providing a user assigned ID will override use_managed_identity
    user_assigned_id: <user-assigned-identity-id>
    request_timeout: 0
  boltdb_shipper:
    active_index_directory: /data/loki/boltdb-shipper-active
    cache_location: /data/loki/boltdb-shipper-cache
    cache_ttl: 24h
    shared_store: azure
  filesystem:
    directory: /data/loki/chunks
```