---
title: 6️⃣ Using Actions and Hooks in lakeFS
description: lakeFS quickstart / Use Actions and Hooks to enforce conditions when committing and merging changes
parent: ⭐ Quickstart
nav_order: 30
next: ["Work with lakeFS data on your local environment", "./work-with-data-locally.html"]
previous: ["Rollback the changes", "./rollback.html"]
---

# Actions and Hooks in lakeFS

When we interact with lakeFS it can be useful to have certain checks performed at stages along the way. Let's see how [actions in lakeFS]({% link howto/hooks/index.md %}) can be of benefit here.

We're going to enforce a rule that when a commit is made to any branch that begins with `etl`:

* the commit message must not be blank
* there must be `job_name` and `version` metadata
* the `version` must be numeric

To do this we'll create an _action_. In lakeFS, an action specifies one or more events that will trigger it, and references one or more _hooks_ to run when triggered. Actions are YAML files written to lakeFS under the `_lakefs_actions/` folder of the lakeFS repository.

_Hooks_ can be either a [Lua]({% link howto/hooks/lua.md %}) script that lakeFS will execute itself, an external [web hook]({% link howto/hooks/webhooks.md %}), or an [Airflow DAG]({% link howto/hooks/airflow.md %}). In this example, we're using a Lua hook.

## Configuring the Action

1. In lakeFS create a new branch called `add_action`. You can do this through the UI or with `lakectl`:

    ```bash
    docker exec lakefs \
        lakectl branch create \
            lakefs://quickstart/add_action \
            --source lakefs://quickstart/main
    ```

2.
    Open up your favorite text editor (or emacs), and paste the following YAML:

    ```yaml
    name: Check Commit Message and Metadata
    on:
      pre-commit:
        branches:
        - etl**
    hooks:
    - id: check_metadata
      type: lua
      properties:
        script: |
          commit_message=action.commit.message
          if commit_message and #commit_message>0 then
              print("✅ The commit message exists and is not empty: " .. commit_message)
          else
              error("\n\n❌ A commit message must be provided")
          end

          job_name=action.commit.metadata["job_name"]
          if job_name == nil then
              error("\n❌ Commit metadata must include job_name")
          else
              print("✅ Commit metadata includes job_name: " .. job_name)
          end

          version=action.commit.metadata["version"]
          if version == nil then
              error("\n❌ Commit metadata must include version")
          else
              print("✅ Commit metadata includes version: " .. version)
              if tonumber(version) then
                  print("✅ Commit metadata version is numeric")
              else
                  error("\n❌ Version metadata must be numeric: " .. version)
              end
          end
    ```

3. Save this file as `/tmp/check_commit_metadata.yml`

    * You can save it elsewhere, but make sure you change the path below when uploading

4. Upload the `check_commit_metadata.yml` file to the `add_action` branch under `_lakefs_actions/`. As above, you can use the UI (make sure you select the correct branch when you do) or `lakectl`:

    ```bash
    docker exec lakefs \
        lakectl fs upload \
            lakefs://quickstart/add_action/_lakefs_actions/check_commit_metadata.yml \
            --source /tmp/check_commit_metadata.yml
    ```

5.
    Go to the **Uncommitted Changes** tab in the UI, and make sure that you see the new file in the path shown:

    <img width="75%" src="{{ site.baseurl }}/assets/img/quickstart/hooks-00.png" alt="lakeFS Uncommitted Changes view showing a file called `check_commit_metadata.yml` under the path `_lakefs_actions/`" class="quickstart"/>

    Click **Commit Changes** and enter a suitable message to commit this new file to the branch.

6. Now we'll merge this new branch into `main`. From the **Compare** tab in the UI compare the `main` branch with `add_action` and click **Merge**

    <img width="75%" src="{{ site.baseurl }}/assets/img/quickstart/hooks-01.png" alt="lakeFS Compare view showing the difference between `main` and `add_action` branches" class="quickstart"/>

## Testing the Action

Let's remind ourselves what the rules are that the action is going to enforce.

> When a commit is made to any branch that begins with `etl`:
>
> * the commit message must not be blank
> * there must be `job_name` and `version` metadata
> * the `version` must be numeric

We'll start by creating a branch that's going to match the `etl` pattern, and then go ahead and commit a change and see how the action works.

1. Create a new branch (see above instructions on how to do this if necessary) called `etl_20230504`. Make sure you use `main` as the source branch.

    In your new branch you should see the action that you created and merged above:

    <img width="75%" src="{{ site.baseurl }}/assets/img/quickstart/hooks-02.png" alt="lakeFS branch etl_20230504 with object /_lakefs_actions/check_commit_metadata.yml" class="quickstart"/>

1. To simulate an ETL job we'll use the built-in DuckDB editor to run some SQL and write the result back to the lakeFS branch.

    Open the `lakes.parquet` file on the `etl_20230504` branch from the **Objects** tab.
    Replace the SQL statement with the following:

    ```sql
    COPY (
        WITH src AS (
            SELECT lake_name, country, depth_m,
                RANK() OVER ( ORDER BY depth_m DESC) AS lake_rank
            FROM READ_PARQUET('lakefs://quickstart/etl_20230504/lakes.parquet'))
        SELECT * FROM src WHERE lake_rank <= 10
    ) TO 'lakefs://quickstart/etl_20230504/top10_lakes.parquet'
    ```

1. Head to the **Uncommitted Changes** tab in the UI and notice that there is now a file called `top10_lakes.parquet` waiting to be committed.

    <img width="75%" src="{{ site.baseurl }}/assets/img/quickstart/hooks-03.png" alt="lakeFS branch etl_20230504 with uncommitted file top10_lakes.parquet" class="quickstart"/>

    Now we're ready to start trying out the commit rules, and seeing what happens if we violate them.

1. Click on **Commit Changes**, leave the _Commit message_ blank, and click **Commit Changes** to confirm.

    Note that the commit fails because the hook did not succeed:

    `pre-commit hook aborted`

    The output from the hook's code is displayed as well:

    `❌ A commit message must be provided`

    <img width="75%" src="{{ site.baseurl }}/assets/img/quickstart/hooks-04.png" alt="lakeFS blocking an attempt to commit with no commit message" class="quickstart"/>

1. Do the same as the previous step, but provide a message this time:

    <img width="75%" src="{{ site.baseurl }}/assets/img/quickstart/hooks-05.png" alt="A commit to lakeFS with commit message in place" class="quickstart"/>

    The commit still fails as we need to include metadata too, which is what the error tells us:

    `❌ Commit metadata must include job_name`

1.
    Repeat the **Commit Changes** dialog and use the **Add Metadata field** to add the required metadata:

    <img width="75%" src="{{ site.baseurl }}/assets/img/quickstart/hooks-06.png" alt="A commit to lakeFS with commit message and metadata in place" class="quickstart"/>

    We're almost there, but this still fails (as it should), since the version is not entirely numeric but includes `v` and `ß`:

    `❌ Version metadata must be numeric: v1.00ß`

    Repeat the commit attempt, specifying the version as `1.00` this time, and rejoice as the commit succeeds:

    <img width="75%" src="{{ site.baseurl }}/assets/img/quickstart/hooks-07.png" alt="Commit history in lakeFS showing that the commit met the rules set by the action and completed successfully." class="quickstart"/>

---

You can view the history of all action runs from the **Actions** tab:

<img width="75%" src="{{ site.baseurl }}/assets/img/quickstart/hooks-08.png" alt="Action run history in lakeFS" class="quickstart"/>
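
The successful commit above was made through the UI, but the same pre-commit hook runs for commits made with `lakectl`. As a minimal sketch, the helper below wraps `lakectl commit` so that the message and both metadata fields are always supplied, using `--message` and one `--meta key=value` pair per field. The function name `commit_with_metadata`, the `LAKECTL` override variable, and the example values are assumptions made for illustration; they are not part of the quickstart itself.

```bash
# Hypothetical helper (not part of lakeFS): supplies the three values the hook checks.
# LAKECTL defaults to the containerized CLI used earlier on this page;
# set it yourself if lakectl is installed locally (e.g. LAKECTL=lakectl).
LAKECTL="${LAKECTL:-docker exec lakefs lakectl}"

commit_with_metadata() {
    local branch_uri="$1" message="$2" job_name="$3" version="$4"
    # $LAKECTL is intentionally unquoted so that a multi-word command
    # such as "docker exec lakefs lakectl" is split into separate words.
    $LAKECTL commit "$branch_uri" \
        --message "$message" \
        --meta "job_name=$job_name" \
        --meta "version=$version"
}
```

For example, `commit_with_metadata lakefs://quickstart/etl_20230504 "Add top 10 lakes" top10_lakes 1.00` should satisfy all three rules, whereas passing `v1.00ß` as the version would be expected to fail the hook's numeric check, just as it did in the UI.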