---
title: Storage
weight: 1010
---
# Storage

Unlike other logging systems, Grafana Loki is built around the idea of only indexing
metadata about your logs: labels (just like Prometheus labels). Log data itself
is then compressed and stored in chunks in object stores such as S3 or GCS, or
even locally on the filesystem. A small index and highly compressed chunks
simplify operation and significantly lower the cost of running Loki.

Until Loki 2.0, index data was stored in a separate index store.

Loki 2.0 introduced an index mechanism named 'boltdb-shipper', which is what we now call Single Store Loki.
This index type only requires one store, the object store, for both the index and chunks.
More detailed information can be found on the [operations page]({{< relref "../operations/storage/boltdb-shipper.md" >}}).

Some more storage details can also be found in the [operations section]({{< relref "../operations/storage/_index.md" >}}).

## Implementations - Chunks

### Cassandra

Cassandra is a popular database and one of Loki's possible chunk stores. It is production safe.

### GCS

GCS is a hosted object store offered by Google. It is a good candidate for a managed object store, especially when you're already running on GCP, and is production safe.

### File System

The file system is the simplest backend for chunks, although it's also susceptible to data loss as it's unreplicated. It is nonetheless common for single binary deployments, as well as for those trying out Loki or doing local development on the project. It is similar in concept to many Prometheus deployments where a single Prometheus is responsible for monitoring a fleet.

### S3

S3 is AWS's hosted object store.
It is a good candidate for a managed object store, especially when you're already running on AWS, and is production safe.

### Notable Mentions

You may use any substitutable service, such as one that implements the S3 API, like [MinIO](https://min.io/).

## Implementations - Index

### Single-Store

Known as "boltdb-shipper" during development (which is still the schema `store` name). The single store configurations for Loki utilize the chunk store for both chunks and the index, requiring just one store to run Loki.

As of 2.0, this is the recommended index storage type. Performance is comparable to a dedicated index type, while the deployment is much less expensive and less complicated.

### Cassandra

Cassandra can also be utilized for the index store. Aside from the [boltdb-shipper](../operations/storage/boltdb-shipper/), it's the only non-cloud offering usable for the index that is horizontally scalable and has configurable replication. It's a good candidate when you already run Cassandra, are running on-prem, or do not wish to use a managed cloud offering.

### Bigtable

Bigtable is a cloud database offered by Google. It is a good candidate for a managed index store if you're already using it (due to its heavy fixed costs) or wish to run in GCP.

### DynamoDB

DynamoDB is a cloud database offered by AWS. It is a good candidate for a managed index store, especially if you're already running in AWS.

#### Rate Limiting

DynamoDB is susceptible to rate limiting, particularly due to overconsuming what is called [provisioned capacity](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ReadWriteCapacityMode.html). This can be controlled via the [provisioning](#provisioning) configs in the table manager.

### BoltDB

BoltDB is an embedded database on disk.
It is not replicated and thus cannot be used for high availability or clustered Loki deployments, but is commonly paired with a `filesystem` chunk store for proof of concept deployments, trying out Loki, and development. The [boltdb-shipper](../operations/storage/boltdb-shipper/) aims to support clustered deployments using `boltdb` as an index.

### Azure Storage Account

An Azure storage account contains all of your Azure Storage data objects: blobs, file shares, queues, tables, and disks.

## Schema Configs

Loki aims to be backwards compatible and over the course of its development has had many internal changes that facilitate better and more efficient storage/querying. Loki allows incrementally upgrading to these new storage _schemas_ and can query across them transparently. This makes upgrading a breeze. For instance, this is what it looks like when migrating from the v10 -> v11 schemas starting 2020-07-01:

```yaml
schema_config:
  configs:
    - from: 2019-07-01
      store: boltdb
      object_store: filesystem
      schema: v10
      index:
        prefix: index_
        period: 168h
    - from: 2020-07-01
      store: boltdb
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 168h
```

For all data ingested before 2020-07-01, Loki used the v10 schema and then switched after that point to the more effective v11. This dramatically simplifies upgrading, ensuring it's simple to take advantage of new storage optimizations. These configs should be immutable for as long as you care about retention.

## Table Manager

One of the subcomponents in Loki is the `table-manager`. It is responsible for pre-creating and expiring index tables. This helps partition the writes and reads in Loki across a set of distinct indices in order to prevent unbounded growth.

```yaml
table_manager:
  # The retention period must be a multiple of the index / chunks
  # table "period" (see period_config).
  retention_deletes_enabled: true
  # This is 15 weeks retention, based on the 168h (1 week) period durations used in the rest of the examples.
  retention_period: 2520h
```

For more information, see the [table manager](../operations/storage/table-manager/) documentation.

### Provisioning

In the case of AWS DynamoDB, you'll likely want to tune the provisioned throughput for your tables as well. This is to prevent your tables being rate limited on one hand and assuming unnecessary cost on the other. By default Loki uses a [provisioned capacity](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ReadWriteCapacityMode.html) strategy for DynamoDB tables like so:

```
table_manager:
  index_tables_provisioning:
    # Read/write throughput requirements for the current table
    # (the table which would handle writes/reads for data timestamped at the current time)
    provisioned_write_throughput: <int> | default = 3000
    provisioned_read_throughput: <int> | default = 300

    # Read/write throughput requirements for non-current tables
    inactive_write_throughput: <int> | default = 1
    inactive_read_throughput: <int> | default = 300
```

Note that there are a few other DynamoDB provisioning options, including DynamoDB autoscaling and on-demand capacity. See the [provisioning configuration](../configuration/#provision_config) documentation for more information.

## Upgrading Schemas

When a new schema is released and you want to gain the advantages it provides, you can! Loki can transparently query & merge data from across schema boundaries so there is no disruption of service and upgrading is easy.

First, you'll want to create a new [period_config](../configuration#period_config) entry in your [schema_config](../configuration#schema_config).
The important thing to remember here is to set this at some point in the _future_ and then roll out the config file changes to Loki. This allows the table manager to create the required table in advance of writes and ensures that existing data isn't queried as if it adheres to the new schema.

As an example, let's say it's 2020-07-14 and we want to start using the `v11` schema on the 20th:

```yaml
schema_config:
  configs:
    - from: 2019-07-14
      store: boltdb
      object_store: filesystem
      schema: v10
      index:
        prefix: index_
        period: 168h
    - from: 2020-07-20
      store: boltdb
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 168h
```

It's that easy; we just created a new entry starting on the 20th.

## Retention

With the exception of the `filesystem` chunk store, Loki will not delete old chunks. This is generally handled instead by configuring TTLs (time to live) in the chunk store of your choice (bucket lifecycles in S3/GCS, and TTLs in Cassandra). Loki also does not currently delete old data when your local disk fills up when using the `filesystem` chunk store; deletion is determined only by the retention duration.

We're interested in adding targeted deletion in future Loki releases (think tenant or stream level granularity) and may include other strategies as well.

For more information, see the [retention configuration](../operations/storage/retention/) documentation.

## Examples

### Single machine/local development (boltdb+filesystem)

[The repo contains a working example](https://github.com/grafana/loki/blob/master/cmd/loki/loki-local-config.yaml). You may want to check out a tag of the repo to make sure you get a compatible example.
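
For orientation, the storage-related part of such a local setup might look roughly like the sketch below. It reuses the `boltdb` index store and `filesystem` chunk store named elsewhere on this page. The directory paths and the `from` date are placeholder assumptions, not values taken from the linked example, so treat the file in the repo as the authoritative reference.

```yaml
storage_config:
  boltdb:
    # Assumed local path for the BoltDB index files
    directory: /tmp/loki/index
  filesystem:
    # Assumed local path for chunk files
    directory: /tmp/loki/chunks

schema_config:
  configs:
    # Placeholder start date; pick one that suits your data
    - from: 2020-07-01
      store: boltdb
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 168h
```

Because neither store is replicated, a configuration like this is only suitable for a single instance used for evaluation or development.
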
### GCP deployment (GCS Single Store)

```yaml
storage_config:
  boltdb_shipper:
    active_index_directory: /loki/boltdb-shipper-active
    cache_location: /loki/boltdb-shipper-cache
    cache_ttl: 24h         # Can be increased for faster performance over longer query periods, uses more disk space
    shared_store: gcs
  gcs:
    bucket_name: <bucket>

schema_config:
  configs:
    - from: 2020-07-01
      store: boltdb-shipper
      object_store: gcs
      schema: v11
      index:
        prefix: index_
        period: 24h
```

### AWS deployment (S3 Single Store)

```yaml
storage_config:
  boltdb_shipper:
    active_index_directory: /loki/boltdb-shipper-active
    cache_location: /loki/boltdb-shipper-cache
    cache_ttl: 24h         # Can be increased for faster performance over longer query periods, uses more disk space
    shared_store: s3
  aws:
    s3: s3://<access_key>:<uri-encoded-secret-access-key>@<region>
    bucketnames: <bucket1,bucket2>

schema_config:
  configs:
    - from: 2020-07-01
      store: boltdb-shipper
      object_store: aws
      schema: v11
      index:
        prefix: index_
        period: 24h
```

If you don't wish to hard-code S3 credentials, you can also configure an EC2
instance role by changing the `storage_config` section:

```yaml
storage_config:
  aws:
    s3: s3://region
    bucketnames: <bucket1,bucket2>
    dynamodb:
      dynamodb_url: dynamodb://region
```

### On prem deployment (Cassandra+Cassandra)

**Keeping this for posterity, but this is likely not a common config.
Cassandra should work and could be faster in some situations but is likely much more expensive.**

```yaml
storage_config:
  cassandra:
    addresses: <comma-separated-IPs-or-hostnames>
    keyspace: <keyspace>
    auth: <true|false>
    username: <username> # only applicable when auth=true
    password: <password> # only applicable when auth=true

schema_config:
  configs:
    - from: 2020-07-01
      store: cassandra
      object_store: cassandra
      schema: v11
      index:
        prefix: index_
        period: 168h
      chunks:
        prefix: chunk_
        period: 168h
```

### On prem deployment (MinIO Single Store)

We configure MinIO by using the AWS config because MinIO implements the S3 API:

```yaml
storage_config:
  aws:
    # Note: use a fully qualified domain name, like localhost.
    # full example: http://loki:supersecret@localhost.:9000
    s3: http<s>://<username>:<secret>@<fqdn>:<port>
    s3forcepathstyle: true
  boltdb_shipper:
    active_index_directory: /loki/boltdb-shipper-active
    cache_location: /loki/boltdb-shipper-cache
    cache_ttl: 24h         # Can be increased for faster performance over longer query periods, uses more disk space
    shared_store: s3

schema_config:
  configs:
    - from: 2020-07-01
      store: boltdb-shipper
      object_store: aws
      schema: v11
      index:
        prefix: index_
        period: 24h
```

### Azure Storage Account

```yaml
schema_config:
  configs:
  - from: "2020-12-11"
    index:
      period: 24h
      prefix: index_
    object_store: azure
    schema: v11
    store: boltdb-shipper
storage_config:
  azure:
    # Your Azure storage account name
    account_name: <account-name>
    # For the account-key, see docs: https://docs.microsoft.com/en-us/azure/storage/common/storage-account-keys-manage?tabs=azure-portal
    account_key: <account-key>
    # See https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction#containers
    container_name: <container-name>
    use_managed_identity: <true|false>
    # Providing a user assigned ID will override use_managed_identity
    user_assigned_id: <user-assigned-identity-id>
    request_timeout: 0
  boltdb_shipper:
    active_index_directory: /data/loki/boltdb-shipper-active
    cache_location: /data/loki/boltdb-shipper-cache
    cache_ttl: 24h
    shared_store: azure
  filesystem:
    directory: /data/loki/chunks
```