# Bacalhau Monitoring Canary
This is a canary that continuously calls several Bacalhau APIs and raises an alarm whenever the correctness or availability of those APIs falls below a threshold.

The canary is serverless and runs on AWS Lambda. Infrastructure is defined using AWS CDK and automatically deployed using AWS CodePipeline.

## Quick Links
- [Public Dashboard](https://cloudwatch.amazonaws.com/dashboard.html?dashboard=BacalhauCanaryProd&context=eyJSIjoidXMtZWFzdC0xIiwiRCI6ImN3LWRiLTI4NDMwNTcxNzgzNSIsIlUiOiJ1cy1lYXN0LTFfUTlPMEVrM3llIiwiQyI6IjExc3NlYW1tZmVmaGdtYTFzMDk1c29jaDltIiwiSSI6InVzLWVhc3QtMTpmNGE5MGFiMi0yZWYwLTRlYTEtOWZkNS1jMmQ3MDkxYTA5OTQiLCJNIjoiUHVibGljIn0=)
- [AWS Account Sign-in](https://284305717835.signin.aws.amazon.com/console/?region=eu-west-1)
- [Canary Prod Logs](https://eu-west-1.console.aws.amazon.com/cloudwatch/home?region=eu-west-1#logsV2:log-groups)
- [Canary Lambda Functions](https://eu-west-1.console.aws.amazon.com/lambda/home?region=eu-west-1#/functions?fo=and&o0=%3A&v0=BacalhauCanary)
- [Deployment Pipeline](https://console.aws.amazon.com/codesuite/codepipeline/pipelines/BacalhauCanaryPipeline-PipelineC660917D-I0DZJY6IFHTO/view?region=eu-west-1)

## Canary Scenarios
The canary is composed of several scenarios, each executed periodically in its own Lambda function. The scenarios are defined in the `lambda/pkg/scenarios` directory, and include:
- `list`: Calls Bacalhau's list API and verifies the response.
- `submit`: Submits a job to Bacalhau and verifies that it completed successfully.
- `submitAndDescribe`: Submits a job to Bacalhau, waits for it to complete, and then calls the describe-related APIs.
- `submitAndGet`: Submits a job to Bacalhau, waits for it to complete, and then downloads the output and verifies its correctness.
- `submitDockerIPFSJobAndGet`: Submits a job to Bacalhau with an IPFS input, waits for it to complete, and then downloads the output and verifies its correctness.
- `submitWithConcurrency`: Submits a job to Bacalhau with a concurrency of 3, and waits for it to complete.
- `submitWithConcurrencyOwnedNodes`: Submits a job to Bacalhau owned nodes with a concurrency of 3, and waits for it to complete.

### Local Testing
You can run the scenarios locally before deploying to Lambda by using the following command:
```bash
# Assuming you are in the ops/aws/canary directory
go run ./lambda/cmd/scenario_local_runner --action list # or any other scenario

# If you get a `no packages loaded from` error, just cd into the /ops/aws/canary/lambda/cmd/scenario_local_runner directory
go run . --action list
```

## Releasing a New Version
Follow these steps when a new version of Bacalhau is released and deployed to prod, so that the canary client is also updated to a compatible version and deployed:
1. Update the `go.mod` in the [ops/aws/canary/lambda directory](ops/aws/canary/lambda/go.mod) to point to the new version of Bacalhau.
2. Update the `go.sum` file by running `(cd ops/aws/canary/lambda && go mod tidy)`.
3. Fix any breakages caused by changes in the Bacalhau client API.
4. Verify that the canary compiles locally by running `(cd ops/aws/canary/lambda && go build -o /dev/null ./cmd/scenario_lambda_runner)`.
5. Push the changes to main, and the canary pipeline will automatically deploy the new version.

This is a [sample commit](https://github.com/filecoin-project/bacalhau/commit/958630dbe4ad9ba35b0715be2f82c66c60797ba4) updating the canary to Bacalhau v0.2.6.

## Infrastructure Stacks
There are two types of stacks in this project:
- Canary stack(s): one stack per environment (e.g. prod, dev), containing the Lambda function and the CloudWatch alarm.
- Pipeline stack: contains the CodePipeline and CodeBuild resources.

### Deploying Canary Stacks Changes
Changes to the canary stacks are automatically deployed as soon as a new commit is pushed to the main branch. You *should not* deploy these stacks manually.

**Note:** Currently only the prod stack is deployed.

### Deploying Pipeline Stack Changes
Changes to the pipeline, such as adding a new stage or modifying the build scripts, need to be deployed manually. To do so, run the following command:
```bash
# Assuming you have the AWS CLI installed and configured with a profile named "bacalhau"
# Assuming you are in the ops/aws/canary directory
cdk --profile bacalhau deploy BacalhauCanaryPipeline -c config=prod
```
Note that there is only a single pipeline stack, deployed using the prod environment configuration, but it deploys all canary stacks.

### Manual Resources
These are the resources that had to be created or updated manually outside of CDK:
1. GitHub Connection
2. CloudWatch public dashboard link
3. Secrets Manager secret holding the Slack webhook URL

## Useful CDK commands
Keep in mind that you might need to pass your AWS profile and the stack name in some of these commands:
* `npm run build`   compile TypeScript to JS
* `npm run postinstall`   delete the CDK Go templates, whose invalid file naming pattern can break `go` commands
* `npm run watch`   watch for changes and compile
* `npm run test`    run the Jest unit tests
* `cdk deploy`      deploy this stack to your default AWS account/region
* `cdk diff`        compare the deployed stack with the current state
* `cdk synth`       emit the synthesized CloudFormation template
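
As an illustration of passing the profile and stack name to the CDK commands, here is a minimal sketch of a shell wrapper. It assumes that `cdk diff` and `cdk synth` accept the same `--profile` and `-c config=...` arguments as the pipeline deploy command shown earlier. The function `canary_cdk` and the variables `CANARY_CDK_PROFILE`, `CANARY_CDK_CONFIG` and `DRY_RUN` are hypothetical and do not exist in this repository.

```shell
# Hypothetical helper, not part of the repository.
# Usage: canary_cdk <subcommand> [stack-name ...]
canary_cdk() {
  subcommand="$1"
  shift
  # Rebuild the argument list as a full cdk invocation.
  # Defaults mirror the pipeline deploy command: profile "bacalhau", config "prod".
  set -- cdk --profile "${CANARY_CDK_PROFILE:-bacalhau}" "$subcommand" "$@" \
    -c "config=${CANARY_CDK_CONFIG:-prod}"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Only print the command that would be run.
    echo "$*"
  else
    "$@"
  fi
}
```

For example, `DRY_RUN=1 canary_cdk diff BacalhauCanaryPipeline` prints the full `cdk` command without running it, which is a cheap way to check the arguments before touching a deployed stack.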