---
title: Lambda Promtail
weight: 20
---
# Lambda Promtail

Grafana Loki includes [Terraform](https://www.terraform.io/) and [CloudFormation](https://aws.amazon.com/cloudformation/) files for shipping Cloudwatch and loadbalancer logs to Loki via a [lambda function](https://aws.amazon.com/lambda/). This is done via [lambda-promtail](https://github.com/grafana/loki/tree/master/tools/lambda-promtail), which processes Cloudwatch events and propagates them to Loki (or a Promtail instance) via the push-api [scrape config](../promtail/configuration#loki_push_api_config).

## Deployment

lambda-promtail can easily be deployed via the provided [Terraform](https://github.com/grafana/loki/blob/main/tools/lambda-promtail/main.tf) and [CloudFormation](https://github.com/grafana/loki/blob/main/tools/lambda-promtail/template.yaml) files. The Terraform deployment also pulls variable values defined in [variables.tf](https://github.com/grafana/loki/blob/main/tools/lambda-promtail/variables.tf).

For both deployment types there are a few values that must be defined:
- the write address, a Loki Write API compatible endpoint (Loki or Promtail)
- basic auth username/password if the write address is a Loki endpoint and has authentication
- the lambda-promtail image, as a full ECR repo path:tag

The Terraform deployment also takes in an array of log group and bucket names, and can take arrays for VPC subnets and security groups.

There's also a flag to keep the log stream label when propagating the logs from Cloudwatch, which defaults to false. This can be helpful when the cardinality is too large, such as the case of a log stream per lambda invocation.

Additionally, an environment variable can be configured to add extra labels to the logs streamed by lambda-promtail.
These extra labels will take the form `__extra_<name>=<value>`.

An optional environment variable can be configured to add the tenant ID to the logs streamed by lambda-promtail.

In an effort to make deployment of lambda-promtail as simple as possible, we've created a [public ECR repo](https://gallery.ecr.aws/grafana/lambda-promtail) to publish our builds of lambda-promtail. Users may clone this repo, make their own modifications to the Go code, and upload their own image to their own ECR repo.

### Examples

Terraform:
```
terraform apply -var "lambda_promtail_image=<repo:tag>" -var "write_address=https://logs-prod-us-central1.grafana.net/loki/api/v1/push" -var "password=<password>" -var "username=<user>" -var 'log_group_names=["/aws/lambda/log-group-1", "/aws/lambda/log-group-2"]' -var 'bucket_names=["bucket-a", "bucket-b"]' -var 'batch_size=131072'
```

The first few lines of `main.tf` define the AWS region to deploy to.
Modify as desired, or remove the `region` argument to let the AWS provider pick up the region configured in your environment:
```
provider "aws" {
  region = "us-east-2"
}
```

To keep the log stream label, add `-var "keep_stream=true"`.

To add extra labels, add `-var 'extra_labels="name1,value1,name2,value2"'`.

To add a tenant ID, add `-var "tenant_id=value"`.

Note that the creation of a subscription filter on Cloudwatch in the provided Terraform file only accepts an array of log group names.
It does **not** accept strings for regex filtering on the log contents via the subscription filters. We suggest extending the Terraform file to do so, or having lambda-promtail write to Promtail and filtering there with [pipeline stages](https://grafana.com/docs/loki/latest/clients/promtail/stages/drop/), as sketched below.
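For example, here is a minimal sketch of the Promtail-side filtering approach, assuming lambda-promtail writes to a Promtail instance exposing the push-api scrape config (a fuller config appears later in this document); the job name and the `DEBUG` expression are purely illustrative:

```yaml
scrape_configs:
  - job_name: lambda-promtail-push   # illustrative job name
    loki_push_api:
      server:
        http_listen_port: 3500
    pipeline_stages:
      # Drop any incoming line containing "DEBUG" before it is pushed to Loki.
      - drop:
          expression: ".*DEBUG.*"
```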
CloudFormation:
```
aws cloudformation create-stack --stack-name lambda-promtail --template-body file://template.yaml --capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM --region us-east-2 --parameters ParameterKey=WriteAddress,ParameterValue=https://logs-prod-us-central1.grafana.net/loki/api/v1/push ParameterKey=Username,ParameterValue=<user> ParameterKey=Password,ParameterValue=<password> ParameterKey=LambdaPromtailImage,ParameterValue=<repo:tag>
```

Within the CloudFormation template file, copy, paste, and modify the subscription filter section as needed for each log group:
```
  MainLambdaPromtailSubscriptionFilter:
    Type: AWS::Logs::SubscriptionFilter
    Properties:
      DestinationArn: !GetAtt LambdaPromtailFunction.Arn
      FilterPattern: ""
      LogGroupName: "/aws/lambda/some-lambda-log-group"
```

To keep the log stream label, add `ParameterKey=KeepStream,ParameterValue=true`.

To add extra labels, include `ParameterKey=ExtraLabels,ParameterValue="name1,value1,name2,value2"`.

To add a tenant ID, add `ParameterKey=TenantID,ParameterValue=value`.

To modify an existing CloudFormation stack, use [update-stack](https://docs.aws.amazon.com/cli/latest/reference/cloudformation/update-stack.html).

## Uses

### Ephemeral Jobs

This workflow is intended to be an effective approach for monitoring ephemeral jobs such as those run on AWS Lambda, which are otherwise hard or impossible to monitor via one of the other Loki [clients](../).

Ephemeral jobs can quite easily run afoul of cardinality best practices. During high request load, an AWS lambda function might balloon in concurrency, creating many log streams in Cloudwatch. For this reason lambda-promtail defaults to **not** keeping the log stream value as a label when propagating the logs to Loki. This is only possible because new versions of Loki no longer have an ingestion ordering constraint on logs within a single stream.

### Proof of concept Loki deployments

For those using Cloudwatch and wishing to test out Loki in a low-risk way, this workflow allows piping Cloudwatch logs to Loki regardless of the event source (EC2, Kubernetes, Lambda, ECS, etc.) without setting up a set of Promtail daemons across their infrastructure. However, running Promtail as a daemon on your infrastructure is the best-practice deployment strategy in the long term for flexibility, reliability, performance, and cost.

Note: Propagating logs from Cloudwatch to Loki means you'll still need to _pay_ for Cloudwatch.

### Loadbalancer logs

This workflow allows ingesting AWS loadbalancer logs stored on S3 into Loki.

## Propagated Labels

Incoming logs can have six special labels assigned to them which can be used in [relabeling](../promtail/configuration/#relabel_config) or later stages in a Promtail [pipeline](../promtail/pipelines/) (a small relabeling sketch follows this list):

- `__aws_log_type`: Where this log came from (Cloudwatch or S3).
- `__aws_cloudwatch_log_group`: The associated Cloudwatch Log Group for this log.
- `__aws_cloudwatch_log_stream`: The associated Cloudwatch Log Stream for this log (if `KEEP_STREAM=true`).
- `__aws_cloudwatch_owner`: The AWS ID of the owner of this event.
- `__aws_s3_log_lb`: The name of the loadbalancer.
- `__aws_s3_log_lb_owner`: The Account ID of the loadbalancer owner.
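As an illustration, this hedged relabeling fragment promotes two of these internal labels to queryable Loki labels; the target label names `owner` and `lb_account` are arbitrary choices, and a fuller config appears in the Example Promtail Config section below:

```yaml
relabel_configs:
  # Promote the Cloudwatch owner account ID to a queryable label.
  - source_labels: ['__aws_cloudwatch_owner']
    target_label: 'owner'
  # Promote the loadbalancer owner account ID as well.
  - source_labels: ['__aws_s3_log_lb_owner']
    target_label: 'lb_account'
```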
## Limitations

### Promtail labels

Note: This section is relevant if running Promtail between lambda-promtail and the end Loki deployment, and was used to circumvent `out of order` problems prior to the v2.4 Loki release which removed the ordering constraint.

As stated earlier, this workflow moves the worst case stream cardinality from `number_of_log_streams` -> `number_of_log_groups` * `number_of_promtails`. For this reason, each Promtail must have a unique label attached to logs it processes (ideally via something like `--client.external-labels=promtail=${HOSTNAME}`) and it's advised to run a small number of Promtails behind a load balancer according to your throughput and redundancy needs.

This trade-off is very effective when you have a large number of log streams but want to aggregate them by the log group. This is very common in AWS Lambda, where log groups are the "application" and log streams are the individual application containers which are spun up and down at a whim, possibly just for a single function invocation.

### Data Persistence

#### Availability

For availability concerns, run a set of Promtails behind a load balancer.

#### Batching

This is relevant if lambda-promtail is configured to write to Promtail. Since Promtail batches writes to Loki for performance, it's possible that Promtail will receive a log, issue a successful `204` http status code for the write, then be killed at a later time before it writes upstream to Loki. This should be rare, but is a downside this workflow has.

The lambda will flush logs when the batch reaches the default size of `131072` bytes (128KB). This can be changed with the `BATCH_SIZE` environment variable, which is set to the number of bytes to use.

### Templating/Deployment

The current CloudFormation template is rudimentary. If you need to add VPC configs, extra log groups to monitor, subnet declarations, etc., you'll need to edit the template manually. If you need to subscribe to more than one Cloudwatch Log Group you'll also need to copy and paste that section of the template for each group.

The Terraform file is a bit more fleshed out, and can be configured to take in an array of log group and bucket names, as well as VPC configuration.

The provided Terraform and CloudFormation files are meant to cover the default use case, and more complex deployments will likely require some modification and extension of the provided files.
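For instance, attaching the lambda to a VPC in the CloudFormation template could look roughly like the following sketch. `LambdaPromtailFunction` matches the resource name referenced in the subscription filter example above; the security group and subnet IDs are placeholders you would supply yourself, and the function's existing properties from the template are elided:

```
  LambdaPromtailFunction:
    Type: AWS::Lambda::Function
    Properties:
      # ...existing properties from the provided template...
      VpcConfig:
        SecurityGroupIds:
          - sg-0123456789abcdef0      # placeholder security group ID
        SubnetIds:
          - subnet-0123456789abcdef0  # placeholder subnet ID
```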
## Example Promtail Config

Note: this should be run in conjunction with a Promtail-specific label attached, ideally via a flag argument like `--client.external-labels=promtail=${HOSTNAME}`. It will receive writes via the push-api on ports `3500` (http) and `3600` (grpc).

```yaml
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://ip_or_hostname_where_Loki_run:3100/loki/api/v1/push

scrape_configs:
  - job_name: push1
    loki_push_api:
      server:
        http_listen_port: 3500
        grpc_listen_port: 3600
      labels:
        # Adds a label on all streams indicating it was processed by the lambda-promtail workflow.
        promtail: 'lambda-promtail'
    relabel_configs:
      - source_labels: ['__aws_log_type']
        target_label: 'log_type'
      # Maps the cloudwatch log group into a label called `log_group` for use in Loki.
      - source_labels: ['__aws_cloudwatch_log_group']
        target_label: 'log_group'
      # Maps the loadbalancer name into a label called `loadbalancer_name` for use in Loki.
      - source_labels: ['__aws_s3_log_lb']
        target_label: 'loadbalancer_name'
```

## Multiple Promtail Deployment

**Disclaimer: The following section is only relevant for older versions of Loki that cannot accept out of order logs.**

As described above, Cloudwatch creates many short-lived log streams for Lambda functions, and each may only be active for a very short while. This creates a problem for combining these short-lived log streams in Loki because timestamps may not strictly increase across multiple log streams. The other obvious route is creating labels based on log streams, which is also undesirable because it leads to cardinality problems via many low-throughput log streams.

Instead we can pipeline Cloudwatch logs to a set of Promtails, which can mitigate these problems in two ways:

1) Using Promtail's push API along with the `use_incoming_timestamp: false` config, we let Promtail determine the timestamp based on when it ingests the logs, not the timestamp assigned by Cloudwatch. Obviously, this means that we lose the origin timestamp because Promtail now assigns it, but this is a relatively small difference in a real-time ingestion system like this.
2) In conjunction with (1), Promtail can coalesce logs across Cloudwatch log streams because it's no longer susceptible to out-of-order errors when combining multiple sources (lambda invocations).

One important aspect to keep in mind when running with a set of Promtails behind a load balancer is that we're effectively moving the cardinality problems from the number of log streams to the number of Promtails. If you have not configured Loki to [accept out-of-order writes](../../configuration#accept-out-of-order-writes), you'll need to assign a Promtail-specific label on each Promtail so that you don't run into out-of-order errors when the Promtails send data for the same log groups to Loki. This can easily be done via a configuration like `--client.external-labels=promtail=${HOSTNAME}` passed to Promtail.
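A hedged sketch of how point (1) might look in the push-api scrape config: only the `use_incoming_timestamp` line differs meaningfully from the example config above, the job name is arbitrary, and the Promtail-specific label is still expected to come from `--client.external-labels=promtail=${HOSTNAME}` on the command line:

```yaml
scrape_configs:
  - job_name: lambda-promtail-push   # illustrative job name
    loki_push_api:
      server:
        http_listen_port: 3500
        grpc_listen_port: 3600
      # Ignore the Cloudwatch-assigned timestamp; Promtail assigns its own at ingestion time.
      use_incoming_timestamp: false
      labels:
        promtail: 'lambda-promtail'
```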