+++
title = "Workflow Optimisation"
weight = 10
+++

One of the first things that a user of umoci may notice is that certain
operations can be quite expensive. Notably, unpack and repack operations require
either scanning through each layer archive of an image, or scanning through the
filesystem. Both operations require quite a bit of disk I/O, and can take a
while. Fedora images are known to be quite large, and can take several seconds
to operate on.

```text
% time umoci unpack --image fedora:26 bundle
umoci unpack --image fedora:26 bundle 8.43s user 1.68s system 105% cpu 9.562 total
% time umoci repack --image fedora:26-old bundle
umoci repack --image fedora:26-old bundle 3.62s user 0.43s system 115% cpu 3.520 total
% find bundle/rootfs -type f -exec touch {} \;
% time umoci repack --image fedora:26-new bundle
umoci repack --image fedora:26-new bundle 32.03s user 4.50s system 112% cpu 32.559 total
```

While it is not currently possible to optimise or parallelise the above
operations individually (due to the structure of the layer archives), it is
possible to optimise your workflows in certain situations. These workflow tips
effectively revolve around reducing the number of extractions that are
performed.

### `--refresh-bundle` ###

A very common workflow when building a series of layers in an image is that,
since you want to place different files in different layers of the image, you
have to do something like the following:

```text
% umoci unpack --image image_build_XYZ:wip bundle_a
% ./some_build_process_1 ./bundle_a
% umoci repack --image image_build_XYZ:wip bundle_a
% umoci unpack --image image_build_XYZ:wip bundle_b
% ./some_build_process_2 ./bundle_b
% umoci repack --image image_build_XYZ:wip bundle_b
% umoci unpack --image image_build_XYZ:wip bundle_c
% ./some_build_process_3 ./bundle_c
% umoci repack --image image_build_XYZ:wip bundle_c
% umoci tag --image image_build_XYZ:wip final
```

The above usage, while correct, is not very efficient. Each layer that is
created requires us to do an unpack of the entire `image_build_XYZ:wip`
image before we can do anything. By noting that the root filesystem contained
in `bundle_a` after we've made our changes is effectively the same as the root
filesystem that we extract into `bundle_b` (and since we already have
`bundle_a` we don't have to extract it), we can conclude that reusing `bundle_a`
is probably going to be more efficient. However, you cannot just do this the
"intuitive way":

```text
% umoci unpack --image image_build_XYZ:wip bundle_a
% ./some_build_process_1 ./bundle_a
% umoci repack --image image_build_XYZ:wip bundle_a
% ./some_build_process_2 ./bundle_a
% umoci repack --image image_build_XYZ:wip bundle_a
% ./some_build_process_3 ./bundle_a
% umoci repack --image image_build_XYZ:wip bundle_a
% umoci tag --image image_build_XYZ:wip final
```

This does not work because the metadata stored in `bundle_a` includes
information about what image the bundle was based on (this is used when
creating the modified image metadata).
Thus, the above usage will *not* result in multiple layers being
created, and the usage is roughly identical to the following:

```text
% umoci unpack --image image_build_XYZ:wip bundle_a
% ./some_build_process_1 ./bundle_a
% ./some_build_process_2 ./bundle_a
% ./some_build_process_3 ./bundle_a
% umoci repack --image image_build_XYZ:wip bundle_a
% umoci tag --image image_build_XYZ:wip final
```

Do not despair, however: there is a flag just for you! With `--refresh-bundle`
it is possible to perform the above operations without needing to do any extra
unpack operations.

```text
% umoci unpack --image image_build_XYZ:wip bundle_a
% ./some_build_process_1 ./bundle_a
% umoci repack --refresh-bundle --image image_build_XYZ:wip bundle_a
% ./some_build_process_2 ./bundle_a
% umoci repack --refresh-bundle --image image_build_XYZ:wip bundle_a
% ./some_build_process_3 ./bundle_a
% umoci repack --refresh-bundle --image image_build_XYZ:wip bundle_a
% umoci tag --image image_build_XYZ:wip final
```

Internally, `--refresh-bundle` modifies the few metadata files inside
`bundle_a` so that future repack invocations modify the new image created by
the previous repack operation rather than basing it on the original unpacked
image. Therefore, the cost of `--refresh-bundle` is constant, and is actually
**much** smaller than the cost of doing additional unpack operations.

### `umoci insert` ###

Sometimes all you want to do is to add some files to an image (or remove some
files) and nothing else, and in those cases doing an `umoci unpack`-`umoci repack`
cycle is also quite expensive. This is especially true when you
consider that OCIv1 images are backed by `tar` archives -- and the delta layer
being generated is just going to be a `tar` archive of the files you are
adding.
The most basic usage of `umoci insert` is to just specify what files
you want added, and what you want them to be called in the image (we don't have
any magical `rsync` semantics -- we just copy the root to whatever path you
tell us).

{{% notice info %}}
Note that unlike most other `umoci` commands, `umoci insert` **will overwrite
the image you give it**. As a counter-example, the `--image` flag of `umoci
repack` refers to the *target* image, not the *source* image (the source image
is already known, because `umoci unpack` saves that information).

This behaviour may change in the future, but it's not clear what would be an
obvious interface for this change (older versions of `umoci` had separate
`--src` and `--dst` flags, but they were unwieldy and so were removed in
favour of the `--image` style).

Also note that each `umoci insert` creates a separate layer.
{{% /notice %}}

```text
% umoci insert --image myimg:foo mybinary /usr/bin/release-binary
% umoci insert --image myimg:foo myconfigdir /etc/binary.d
```

If the target file already exists in previous layers, the new layer will
overwrite any older versions of the files inserted (when extracted).

You can also remove a file (or directory) from an image by using the
`--whiteout` option, which creates a new layer with a "whiteout" entry for the
path you give it. If the file doesn't already exist, the behaviour depends on
the extraction tool used -- `umoci unpack` will ignore whiteouts for
non-existent files when extracting.

{{% notice warning %}}
**Do not use this to remove secrets from an image.** Since `umoci insert`
operates by creating a new layer, older layers will still contain a copy of the
secret you are trying to remove.
If you want to prevent things from being
included in an image in the first place, take a look at `umoci repack --mask-path`
(which causes changes to the given paths to not be included in the
new layer) or `umoci config --config.volumes` (which is automatically treated
as a masked path by `umoci repack`).
{{% /notice %}}

```text
% umoci insert --image myimg:foo --whiteout /usr/bin/old-binary
% umoci insert --image myimg:foo --whiteout /etc/old-config.d
```

Finally, there is one more important thing to know about `umoci insert` -- how
directory insertion is handled. By default, `umoci insert` just creates a new
layer with the contents of the directory. When unpacked, this results in any
existing contents in that directory (from older layers) being merged with the
new layer's contents. You can imagine this as though you extracted your new
directory on top of the previous layers' cumulative directory state.

But what if you want to entirely replace the contents of a directory? That's the
reason why we have `--opaque` -- it allows you to effectively blank out any
pre-existing contents of the directory and replace it entirely with the new
directory. If the target was not a directory in previous layers, or the source
is not a directory, then the behaviour will depend on the tool used for
extraction -- `umoci unpack` will just ignore the meaningless opaque whiteout
entry.

```text
% umoci insert --image myimg:foo --opaque myetcdir /etc
```

The same caveat about `umoci insert --whiteout` applies here, as older layers
will still contain the files that were removed by the opaque whiteout.

{{% notice info %}}
It should be noted that this is the only way that umoci will currently create
an "opaque whiteout".
This means that if you need to replace an entire
directory wholesale, the layer created by `umoci insert --opaque` is far more
efficient than the one produced by an `umoci unpack`-`umoci repack` cycle
(even if you ignore the CPU-time benefits).

Currently, though, `umoci insert` only allows one operation per layer, which is
mostly a UX restriction. This may change in the future, which would make
`umoci insert` *far* more generally usable and efficient in terms of the number
of layers generated.
{{% /notice %}}
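
Because each `umoci insert` invocation creates its own layer, inserting several
paths means running the command several times. The following is a minimal
sketch of a wrapper for that, not something shipped with umoci: the function
name `insert_all`, the `UMOCI` variable and the `source:target` argument format
are all assumptions made for illustration. Only the `umoci insert --image`
invocation itself is taken from the examples above.

```shell
#!/bin/sh
# Hypothetical helper (not part of umoci): runs one `umoci insert` per
# "source:target" pair, so each pair ends up in its own layer.
# UMOCI is an assumed override hook; set UMOCI=echo to preview the commands.
UMOCI="${UMOCI:-umoci}"

insert_all() {
    image="$1"
    shift
    for pair in "$@"; do
        src="${pair%%:*}"   # everything before the first ':'
        dst="${pair#*:}"    # everything after the first ':'
        # Stop at the first failure so a broken insert is not silently skipped.
        "$UMOCI" insert --image "$image" "$src" "$dst" || return 1
    done
}
```

Calling `insert_all myimg:foo mybinary:/usr/bin/release-binary myconfigdir:/etc/binary.d`
would run the two insertions shown earlier, one after the other. Since the
first `:` is used as the separator, this sketch does not handle source paths
that themselves contain a colon.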