# 18. Jackal Hooks

Date: 2023-09-20

## Status

Accepted

## Context

The idea of `hooks` is to provide a way for cluster maintainers to register functionality that runs during the deployment lifecycle. Jackal packages already have the concept of `actions` that can execute commands on the host machine's shell during certain package lifecycle events. As `actions` gain more adoption, the team has noticed they are being used to add functionality to Jackal in unexpected ways. We want `actions` to be a tool that extends the functionality of Jackal and its packages, not a tool that works around missing or clunky functionality.

We want package creators to be able to create system-agnostic packages by leveraging core Jackal functionality. The following is one such scenario:

- _IF_ ECR is chosen as the external registry during `jackal init` / cluster creation, _THEN_ Jackal will seamlessly leverage ECR without requiring advanced user effort.

Using ECR as a remote registry creates 2 problems that Jackal will need to solve:

1. ECR authentication tokens expire after 12 hours and need to be refreshed. This means the cluster will need to refresh its tokens regularly, and the user deploying packages will need to make sure they have a valid token.
2. ECR Image Repositories do not support 'push-to-create'. This means we will need to explicitly create an image repository for every image that is being pushed within the Jackal package.

Packages that get deployed onto a cluster initialized with ECR as its remote registry will need to solve these 2 problems.

Currently there are 2 solutions:

1. The package deployer solves the problem pre-deployment (creating needed repos, secrets, etc...)
2. The package itself solves these problems with `actions` that are custom built for ECR clusters.


Neither of these solutions is ideal. We don't want to require overly complex external, prior actions for Jackal package deployments, and we don't want package creators to have to create and distribute packages that are specific to ECR.

Potential considerations:

### Internal Jackal Implementation

Clusters that have hooks will have `jackal-hook-*` secret(s) in the 'jackal' namespace. Each secret will contain the hook's configuration and any other required metadata. As part of the package deployment process, Jackal will check if the cluster has any hooks and run them if they exist. Given the scenario above, there is no longer a need for an ECR-specific Jackal package to be created. An ECR hook would perform the proper configuration for any package deployed onto that cluster, thereby requiring no extra manual intervention from the package deployer.

Jackal HookConfig state information struct:

```go
type HookConfig struct {
	HookName     string                 `json:"hookName" jsonschema:"description=Name of the hook"`
	Internal     bool                   `json:"internal" jsonschema:"description=Internal hooks are run by Jackal itself, not by a plugin"`
	Lifecycle    HookLifecycle          `json:"lifecycle" jsonschema:"description=Lifecycle of the hook"`
	HookData     map[string]interface{} `json:"hookData" jsonschema:"description=Generic data map used for the hook. The data is obtained from a secret in the Jackal namespace"`
	OCIReference string                 `json:"ociReference" jsonschema:"description=Optional OCI reference to the hook image to run"`
}
```

Example Secret Data:

```yaml
hookName: ecr-repository
internal: true
lifecycle: before-component
hookData:
  registryURL: public.ecr.aws/abcdefg/jackal-ecr-registry
  region: us-east-1
  repositoryPrefix: ecr-jackal-registry
```

For this solution, hooks have to be 'installed' onto a cluster before they are used.
When Jackal is deploying a package onto a cluster, it will look for any secrets with the `jackal-hook` label in the `jackal` namespace. If hooks are found, Jackal will run any 'package' level hooks before deploying a component and run any 'component' level hook for each component that is getting deployed. The hook lifecycle options will be:

1. Before a package deployment
2. After a package deployment
3. Before a component deployment
4. After a component deployment

NOTE: The order of hook execution is not guaranteed. If there are multiple hooks for a lifecycle, they may be executed in any order.

NOTE: The `package` lifecycle might be changed to a `run-once` lifecycle. This would benefit packages that don't have kube context information when the deployment starts.

Jackal hooks will have two forms of execution via `Internal` and `External` hooks:

Internal Hooks:

Internal hooks will be hooks that are built into the Jackal CLI and run internal code when executed. The logic for these hooks would be built into the Jackal CLI and would be updated with new releases of the CLI.

External Hooks:

There are a few approaches for external hooks:

1. Have the hook metadata reference an OCI image that is downloaded and run.

   - The hook metadata can reference the shasum of the image to ensure the image is not tampered with.
   - We can pass metadata from the secret to the image.

1. Have the hook metadata reference an image/endpoint that we call via a gRPC call.

   - This would require careful attention to security since we would be executing code from an external source.

1. Have the hook metadata contain a script or list of shell commands that can get run.

   - This would be the simplest solution but would require the most work from the hook creator. This also has the most potential security issues.
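
As a rough illustration of the discovery and lifecycle flow described in this section, the sketch below decodes hook configurations and selects the ones registered for a single lifecycle. It is not Jackal's implementation. It assumes the secret payloads are JSON-encoded (the ADR's example shows YAML), it omits the `jsonschema` tags, and every lifecycle string other than `before-component` is a guess. The `ecr-token` hook name is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// HookLifecycle names the point in a deployment at which a hook runs.
type HookLifecycle string

// Only "before-component" appears in the ADR's example secret; the other
// three string values are assumptions made for this sketch.
const (
	BeforePackage   HookLifecycle = "before-package"
	AfterPackage    HookLifecycle = "after-package"
	BeforeComponent HookLifecycle = "before-component"
	AfterComponent  HookLifecycle = "after-component"
)

// HookConfig mirrors the struct shown earlier, without the jsonschema tags.
type HookConfig struct {
	HookName     string                 `json:"hookName"`
	Internal     bool                   `json:"internal"`
	Lifecycle    HookLifecycle          `json:"lifecycle"`
	HookData     map[string]interface{} `json:"hookData"`
	OCIReference string                 `json:"ociReference"`
}

// parseHooks decodes one HookConfig per secret payload.
// JSON encoding of the payload is an assumption of this sketch.
func parseHooks(payloads [][]byte) ([]HookConfig, error) {
	hooks := make([]HookConfig, 0, len(payloads))
	for _, p := range payloads {
		var h HookConfig
		if err := json.Unmarshal(p, &h); err != nil {
			return nil, fmt.Errorf("decoding hook config: %w", err)
		}
		hooks = append(hooks, h)
	}
	return hooks, nil
}

// hooksFor returns the hooks registered for a single lifecycle.
// Input order is preserved, but callers should not rely on any ordering.
func hooksFor(hooks []HookConfig, lc HookLifecycle) []HookConfig {
	var out []HookConfig
	for _, h := range hooks {
		if h.Lifecycle == lc {
			out = append(out, h)
		}
	}
	return out
}

func main() {
	payloads := [][]byte{
		[]byte(`{"hookName":"ecr-repository","internal":true,"lifecycle":"before-component","hookData":{"region":"us-east-1"}}`),
		[]byte(`{"hookName":"ecr-token","internal":true,"lifecycle":"before-package"}`),
	}
	hooks, err := parseHooks(payloads)
	if err != nil {
		panic(err)
	}
	// Before each component is deployed, only the matching hooks would run.
	for _, h := range hooksFor(hooks, BeforeComponent) {
		fmt.Println("would run hook:", h.HookName)
	}
}
```

A deployer would call something like `hooksFor` once per lifecycle event, which keeps the package itself unaware of which hooks are installed on the cluster.
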

Pros:

- Implementing Hooks internally means we don't have to deal with any bootstrapping issues.
- Internally managed hooks can leverage Jackal internal code.

Cons:

- Since 'Internal' hooks are built into the CLI, the only way to get updates for the hook is to update the CLI.
- External hooks will have a few security concerns that we will have to work through.
- Implementing hooks internally adds more complexity to the Jackal CLI. This is especially true if we end up using WASM as the execution engine for hooks.

### Webhooks

Webhooks, such as Pepr, can act as a K8s controller that enables Kubernetes mutations. We are (or will be) considering using Pepr to replace the `Jackal Agent`. Pepr is capable of accomplishing most of what Jackal wants to do with the concept of Hooks. Jackal hook configuration could be saved as secrets that Jackal will be able to use. As Jackal is deploying packages onto a cluster, it can check for secrets that represent hooks (as it would if hook execution were handled internally, as described above) and get information on how to run the webhook from the secret. This would likely mean that the secret that describes the hook would have a `URL` instead of an `OCIReference`, as well as config information that it would pass through to the hook. With the webhook approach, lifecycle management is a lot more flexible, as the webhook can operate on native Kubernetes events such as a secret getting created / updated.

Pros:

- Pepr as a solution would be more flexible than the internal Jackal implementation of Hooks since the webhook could be anywhere.
- Using Pepr would reduce the complexity of Jackal's codebase.
- It will be easier to secure third party hooks when Pepr is the one running them.
- Lifecycle management would be a lot easier with a webhook solution like Pepr.

Cons:

- Pepr is a new project that hasn't been stress tested in production yet (but neither has Hooks).
- The Pepr image needs to be pushed to an image registry before it is deployed. This will require a new bootstrapping solution to solve the ECR problem we identified above.

## Decision

[Pepr](https://github.com/defenseunicorns/pepr) will be used to enable custom, or environment-specific, automation tasks to be integrated in the Jackal package deployment lifecycle. Pepr also allows the Jackal codebase to remain agnostic to any third-party APIs or dependencies that may be used.

A `--skip-webhooks` flag has been added to `jackal package deploy` to allow users to opt out of Jackal checking and waiting for any webhooks to complete during package deployments.

## Consequences

While hooks don't introduce raw schema changes to Jackal, they do add complexity: side effects happen during package deployments that might not be obvious to the package deployer. This is especially the case if the person who deployed the hooks is different from the person who is deploying the subsequent packages.