---
title: Actions and Hooks
description: Overview of lakeFS Actions and Hooks
has_children: true
has_toc: false
parent: How-To
redirect_from:
  - /reference/hooks.html
  - /hooks.html
  - /hooks/overview.html
  - /hooks/index.html
  - /hooks/
  - /setup/hooks.html
---

# Actions and Hooks in lakeFS

{% include toc.html %}

Like other version control systems, lakeFS allows you to configure _Actions_ to trigger when [predefined events](#supported-events) occur. There are numerous uses for Actions, including:

1. Format Validator:
   A webhook that checks new files to ensure they are in one of a set of allowed data formats.
1. Schema Validator:
   A webhook that reads new Parquet and ORC files to ensure they don't contain any column names (or name prefixes) from a block list.
   This is useful for avoiding accidental PII exposure.
1. Integration with external systems:
   Post-merge and post-commit hooks could be used to export metadata about the change to another system. A common example is exporting `symlink.txt` files that allow e.g. [AWS Athena]({% link integrations/athena.md %}) to read data from lakeFS.
1. Notifying downstream consumers:
   Running a post-merge hook to trigger an Airflow DAG or to send a Webhook to an API, notifying it of the change that happened.

For step-by-step examples of hooks in action, check out the [lakeFS Quickstart]({% link quickstart/actions-and-hooks.md %}) and the [lakeFS samples repository](https://github.com/treeverse/lakeFS-samples/).

## Overview

An _action_ defines one or more _hooks_ to execute. lakeFS supports three types of hooks:

1. [Lua](./lua.html) - uses an embedded Lua VM
1. [Webhook](./webhooks.html) - makes a REST call to an external URL
1. [Airflow](./airflow.html) - triggers a DAG in Airflow

"Before" hooks (those attached to `pre-*` events) must run successfully before the operation they guard can complete. If such a hook fails, the operation is aborted. Lua hooks and Webhooks are synchronous, and lakeFS waits for them to run to completion. Airflow hooks are asynchronous: lakeFS stops waiting as soon as Airflow accepts triggering the DAG.

## Configuration

There are two parts to configuring an Action:

1. Create an Action file and upload it to the lakeFS repository.
2. Configure the hook(s) that you specified in the Action file. How these are configured depends on the type of hook.

### Action files

An **Action** is a list of Hooks with the same trigger configuration, i.e. an event will trigger all Hooks under an Action or none at all.

The Hooks under an Action are ordered, and they execute in that order.

Before each hook executes, its `if` boolean expression is evaluated. The expression can use the functions `success()` and `failure()`, which return true if the preceding hooks in the Action succeeded or if one of them failed, respectively.

By default, when `if` is empty or omitted, the hook runs only if no error has occurred (the same as `success()`).

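The ordering and `if` rules above can be modelled in a few lines of Python. This is only an illustrative sketch, not how lakeFS itself evaluates expressions: `run_action` and `execute` are hypothetical names, and only the literal expressions `success()`, `failure()` and `true` are recognised.

```python
from typing import Callable, Dict, List


def run_action(hooks: List[dict], execute: Callable[[dict], bool]) -> Dict[str, str]:
    """Run hooks in order; report 'passed', 'failed' or 'skipped' per hook id."""
    any_failed = False  # becomes True once any executed hook fails
    results: Dict[str, str] = {}

    def should_run(condition) -> bool:
        # A YAML parser turns `if: true` into a boolean, so accept bools directly.
        if isinstance(condition, bool):
            return condition
        # An empty or omitted `if` behaves like success().
        if condition in (None, "", "success()"):
            return not any_failed
        if condition == "failure()":
            return any_failed
        if condition == "true":
            return True
        raise ValueError(f"expression not supported by this sketch: {condition!r}")

    for hook in hooks:
        if not should_run(hook.get("if")):
            results[hook["id"]] = "skipped"
            continue
        ok = execute(hook)  # hypothetical callback that performs the hook
        results[hook["id"]] = "passed" if ok else "failed"
        any_failed = any_failed or not ok
    return results
```

With hooks `a`, `b` (no `if`), and `alert` (`if: failure()`), a failure in `a` skips `b` and runs `alert`. If every hook passes, `alert` is skipped.
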
#### Action File schema

| Property               | Description                                                 | Data Type  | Required | Default Value                                                           |
|------------------------|-------------------------------------------------------------|------------|----------|-------------------------------------------------------------------------|
| `name`                 | Identifies the Action file                                  | String     | no       | Action filename                                                         |
| `on`                   | List of events that will trigger the hooks                  | List       | yes      |                                                                         |
| `on.<event>.branches`  | Glob pattern list of branches that trigger the hooks        | List       | no       | **Not applicable to Tag events.** If empty, Action runs on all branches |
| `hooks`                | List of hooks to be executed                                | List       | yes      |                                                                         |
| `hook.id`              | ID of the hook, must be unique within the action            | String     | yes      |                                                                         |
| `hook.type`            | Type of the hook ([types](#hook-types))                     | String     | yes      |                                                                         |
| `hook.description`     | Description for the hook                                    | String     | no       |                                                                         |
| `hook.if`              | Expression that will be evaluated before executing the hook | String     | no       | No value is the same as evaluating `success()`                          |
| `hook.properties`      | Hook-specific configuration, see [Lua](./lua.md#action-file-lua-hook-properties), [WebHook](./webhooks.md#action-file-webhook-properties), and [Airflow](./airflow.md#action-file-airflow-hook-properties) for details | Dictionary | yes      |                                                                         |

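As a rough illustration of the required fields in the table above, the following Python sketch checks an Action that has already been parsed into a dictionary. It is not the logic behind `lakectl actions validate`. The function name `check_action` is hypothetical, and the lowercase type identifiers `lua` and `airflow` are assumptions (only `webhook` appears in this page's example).

```python
from typing import Any, Dict, List

# Assumed identifiers for the three hook types; only "webhook" is shown in the example file.
HOOK_TYPES = {"lua", "webhook", "airflow"}


def check_action(action: Dict[str, Any]) -> List[str]:
    """Return a list of schema problems; an empty list means none were found."""
    problems: List[str] = []
    if not action.get("on"):
        problems.append("'on' is required and must list at least one event")
    hooks = action.get("hooks")
    if not hooks:
        problems.append("'hooks' is required and must list at least one hook")
        return problems
    seen_ids = set()
    for index, hook in enumerate(hooks):
        # id, type and properties are marked as required in the table.
        for key in ("id", "type", "properties"):
            if key not in hook:
                problems.append(f"hooks[{index}]: missing required '{key}'")
        hook_id = hook.get("id")
        if hook_id is not None:
            if hook_id in seen_ids:
                problems.append(f"hooks[{index}]: duplicate id '{hook_id}'")
            seen_ids.add(hook_id)
        if "type" in hook and hook["type"] not in HOOK_TYPES:
            problems.append(f"hooks[{index}]: unknown type '{hook['type']}'")
    return problems
```

An empty result only means this sketch found nothing wrong; the authoritative check remains lakeFS itself.
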
#### Example Action File

```yaml
name: Good files check
description: set of checks to verify that branch is good
on:
  pre-commit:            # no branch filter: runs on all branches
  pre-merge:
    branches:            # glob patterns
      - main
hooks:
  - id: no_temp
    type: webhook
    description: checking no temporary files found
    properties:
      url: "https://example.com/webhook?notmp=true&t=1za2PbkZK1bd4prMuTDr6BeEQwWYcX2R"
  - id: no_freeze
    type: webhook
    description: check production is not in dev freeze
    properties:
      url: "https://example.com/webhook?nofreeze=true&t=1za2PbkZK1bd4prMuTDr6BeEQwWYcX2R"
  - id: alert
    type: webhook
    if: failure()        # runs only if a previous hook failed
    description: notify alert system when check failed
    properties:
      url: "https://example.com/alert"
      query_params:
        title: good files webhook failed
  - id: notification
    type: webhook
    if: true             # always runs
    description: notify that will always run - no matter if one of the previous steps failed
    properties:
      url: "https://example.com/notification"
      query_params:
        title: good files completed
```

**Note:** lakeFS will validate action files only when an **Event** has occurred. <br/>
Use `lakectl actions validate <path>` to validate your action files locally.
{: .note }


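To make the `on` section of the example concrete, here is a small Python sketch that decides whether an event on a given branch would trigger an Action. It rests on assumptions: `is_triggered` is a hypothetical helper, Python's `fnmatch` stands in for the glob matching (the dialect lakeFS uses may differ), and the caller chooses which branch name to compare.

```python
from fnmatch import fnmatchcase
from typing import Any, Dict, Optional


def is_triggered(on: Dict[str, Any], event: str, branch: Optional[str] = None) -> bool:
    """Return True if `event` (optionally on `branch`) matches the `on` section."""
    if event not in on:
        return False
    # A key such as `pre-commit:` with no value parses as None.
    spec = on[event] or {}
    patterns = spec.get("branches") or []
    # An empty branch list means all branches; branch filters do not apply to tag events.
    if not patterns or event.endswith("-tag"):
        return True
    return any(fnmatchcase(branch or "", pattern) for pattern in patterns)


# The `on` section of the example Action file, as a parsed dictionary.
example_on = {"pre-commit": None, "pre-merge": {"branches": ["main"]}}
```

Under these assumptions, `is_triggered(example_on, "pre-commit", "feature-1")` is true because no branch filter is set, while `is_triggered(example_on, "pre-merge", "dev")` is false.
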
### Uploading Action files

Action files should be uploaded with the prefix `_lakefs_actions/` to the lakeFS repository.
When an actionable event (see [Supported Events](#supported-events)) takes place, lakeFS will read all files with the prefix `_lakefs_actions/`
in the repository branch where the action occurred.
A failure to parse an Action file will result in a failed Run.

For example, lakeFS will search for and execute all the matching Action files with the prefix `lakefs://example-repo/feature-1/_lakefs_actions/` on:
1. Commit to `feature-1` branch on `example-repo` repository.
1. Merge to `main` branch from `feature-1` branch on `example-repo` repository.


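The prefix rule above can be sketched in Python as follows. The helper names `actions_uri` and `select_action_files` are hypothetical and only illustrate how the location is composed. They do not call lakeFS.

```python
from typing import Iterable, List

ACTIONS_PREFIX = "_lakefs_actions/"


def actions_uri(repository: str, branch: str) -> str:
    """Build the lakefs:// URI under which Action files are read for a branch."""
    return f"lakefs://{repository}/{branch}/{ACTIONS_PREFIX}"


def select_action_files(paths: Iterable[str]) -> List[str]:
    """Keep only the object paths (relative to the branch root) under the prefix."""
    return sorted(path for path in paths if path.startswith(ACTIONS_PREFIX))


print(actions_uri("example-repo", "feature-1"))
# lakefs://example-repo/feature-1/_lakefs_actions/
```
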
## Supported Events

| Event                | Description                                                                    |
|----------------------|--------------------------------------------------------------------------------|
| `pre-commit`         | Runs when the commit occurs, before the commit is finalized                    |
| `post-commit`        | Runs after the commit is finalized                                             |
| `pre-merge`          | Runs on the source branch when the merge occurs, before the merge is finalized |
| `post-merge`         | Runs on the merge result, after the merge is finalized                         |
| `pre-create-branch`  | Runs on the source branch prior to creating a new branch                       |
| `post-create-branch` | Runs on the new branch after the branch was created                            |
| `pre-delete-branch`  | Runs prior to deleting a branch                                                |
| `post-delete-branch` | Runs after the branch was deleted                                              |
| `pre-create-tag`     | Runs prior to creating a new tag                                               |
| `post-create-tag`    | Runs after the tag was created                                                 |
| `pre-delete-tag`     | Runs prior to deleting a tag                                                   |
| `post-delete-tag`    | Runs after the tag was deleted                                                 |

lakeFS Actions are handled per repository and cannot be shared between repositories.
A failure of any Hook under any Action of a `pre-*` event will result in aborting the lakeFS operation that is taking place.
Hook failures under any Action of a `post-*` event will not revert the operation.

Hooks are managed by Action files that are written to a prefix in the lakeFS repository.
This allows configuration-as-code inside lakeFS, where Action files are declarative and written in YAML.

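The difference between `pre-*` and `post-*` failures described in this section can be summarised with a tiny Python sketch. The function `operation_proceeds` is a hypothetical name and models only the rule stated here.

```python
from typing import Iterable


def operation_proceeds(event: str, hook_successes: Iterable[bool]) -> bool:
    """Return True if the lakeFS operation stands, given each hook's outcome."""
    all_ok = all(hook_successes)
    if event.startswith("pre-"):
        # Any failing hook on a pre-* event aborts the operation.
        return all_ok
    # post-* hooks run after the fact; a failure does not revert the operation.
    return True
```

For instance, a failed hook on `pre-merge` gives `False` (the merge is aborted), while the same failure on `post-merge` gives `True`.
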
## Runs API & CLI

A **Run** is an instantiation of the repository's Action files when the triggering event occurs.
For example, if your repository contains a pre-commit hook, every commit would generate a Run for that specific commit.

lakeFS will fetch, parse and filter the repository Action files and start to execute the Hooks under each Action.
All executed Hooks (each with `hook_run_id`) exist in the context of that Run (`run_id`).

The [lakeFS API]({% link reference/api.md %}) and [lakectl][lakectl-actions] expose the results of executions per repository, branch, commit, and specific Action.
The API also lets you download the execution log of any executed Hook under each Run, for observability.


## Result Files

For each Run, the metadata section of a lakeFS repository contains two types of files:
1. `_lakefs/actions/log/<runID>/<hookRunID>.log` - Execution log of the specific Hook run.
1. `_lakefs/actions/log/<runID>/run.manifest` - Manifest of all Hook executions for the Run, with their results and additional metadata.

**Note:** The metadata section of a lakeFS repository is where lakeFS keeps its metadata, like commits and metaranges.
Metadata files stored in the metadata section aren't accessible like user-stored files.
{: .note }

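As a small illustration of the layout described above, this Python sketch composes the two result-file paths from a Run ID and a Hook Run ID. The helper names are hypothetical, and, as the note says, these files live in the metadata section, so the paths are informational rather than something to read like user objects.

```python
LOG_ROOT = "_lakefs/actions/log"


def hook_log_path(run_id: str, hook_run_id: str) -> str:
    """Path of the execution log for one Hook run within a Run."""
    return f"{LOG_ROOT}/{run_id}/{hook_run_id}.log"


def run_manifest_path(run_id: str) -> str:
    """Path of the manifest that summarises all Hook executions of a Run."""
    return f"{LOG_ROOT}/{run_id}/run.manifest"


# "run-1" and "hook-1" are placeholder IDs, not real lakeFS identifiers.
print(hook_log_path("run-1", "hook-1"))  # _lakefs/actions/log/run-1/hook-1.log
print(run_manifest_path("run-1"))        # _lakefs/actions/log/run-1/run.manifest
```
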
[lakectl-actions]:  {% link reference/cli.md %}#lakectl-actions