+++
title = "Workflow Optimisation"
weight = 10
+++

One of the first things that a user of umoci may notice is that certain
operations can be quite expensive. Notably, unpack and repack operations
require either scanning through each layer archive of an image, or scanning
through the filesystem. Both operations require quite a bit of disk IO, and can
take a while. Fedora images are known to be quite large, and can take several
seconds to operate on.

```text
% time umoci unpack --image fedora:26 bundle
umoci unpack --image fedora:26 bundle  8.43s user 1.68s system 105% cpu 9.562 total
% time umoci repack --image fedora:26-old bundle
umoci repack --image fedora:26-old bundle  3.62s user 0.43s system 115% cpu 3.520 total
% find bundle/rootfs -type f -exec touch {} \;
% time umoci repack --image fedora:26-new bundle
umoci repack --image fedora:26-new bundle  32.03s user 4.50s system 112% cpu 32.559 total
```

While it is not currently possible to optimise or parallelise the above
operations individually (due to the structure of the layer archives), it is
possible to optimise your workflows in certain situations. These workflow tips
effectively revolve around reducing the number of extractions that are
performed.

### `--refresh-bundle` ###

A very common workflow when building a series of layers in an image is that,
since you want to place different files in different layers of the image, you
have to do something like the following:

```text
% umoci unpack --image image_build_XYZ:wip bundle_a
% ./some_build_process_1 ./bundle_a
% umoci repack --image image_build_XYZ:wip bundle_a
% umoci unpack --image image_build_XYZ:wip bundle_b
% ./some_build_process_2 ./bundle_b
% umoci repack --image image_build_XYZ:wip bundle_b
% umoci unpack --image image_build_XYZ:wip bundle_c
% ./some_build_process_3 ./bundle_c
% umoci repack --image image_build_XYZ:wip bundle_c
% umoci tag --image image_build_XYZ:wip final
```

The above usage, while correct, is not very efficient. Each layer that is
created requires us to do an unpack of the entire `image_build_XYZ:wip`
image before we can do anything. By noting that the root filesystem contained
in `bundle_a` after we've made our changes is effectively the same as the root
filesystem that we extract into `bundle_b` (and since we already have
`bundle_a` we don't have to extract it), we can conclude that using `bundle_a`
is probably going to be more efficient. However, you cannot just do this the
"intuitive way":

```text
% umoci unpack --image image_build_XYZ:wip bundle_a
% ./some_build_process_1 ./bundle_a
% umoci repack --image image_build_XYZ:wip bundle_a
% ./some_build_process_2 ./bundle_a
% umoci repack --image image_build_XYZ:wip bundle_a
% ./some_build_process_3 ./bundle_a
% umoci repack --image image_build_XYZ:wip bundle_a
% umoci tag --image image_build_XYZ:wip final
```

This does not work because the metadata stored in `bundle_a` includes
information about what image the bundle was based on (this is used when
creating the modified image metadata), so every repack is based on the
originally unpacked image. Thus, the above usage will *not* result in multiple
layers being created, and the usage is roughly identical to the following:

```text
% umoci unpack --image image_build_XYZ:wip bundle_a
% ./some_build_process_1 ./bundle_a
% ./some_build_process_2 ./bundle_a
% ./some_build_process_3 ./bundle_a
% umoci repack --image image_build_XYZ:wip bundle_a
% umoci tag --image image_build_XYZ:wip final
```

Do not despair, however: there is a flag just for you! With `--refresh-bundle`
it is possible to perform the above operations without needing to do any extra
unpack operations.

```text
% umoci unpack --image image_build_XYZ:wip bundle_a
% ./some_build_process_1 ./bundle_a
% umoci repack --refresh-bundle --image image_build_XYZ:wip bundle_a
% ./some_build_process_2 ./bundle_a
% umoci repack --refresh-bundle --image image_build_XYZ:wip bundle_a
% ./some_build_process_3 ./bundle_a
% umoci repack --refresh-bundle --image image_build_XYZ:wip bundle_a
% umoci tag --image image_build_XYZ:wip final
```

Internally, `--refresh-bundle` modifies the few metadata files inside
`bundle_a` so that future repack invocations build on the new image created by
the previous repack operation rather than on the originally unpacked image.
Therefore the cost of `--refresh-bundle` is constant, and is actually **much**
smaller than the cost of doing additional unpack operations.
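
If you have many build steps, the `--refresh-bundle` pattern can be wrapped in
a small shell function. The following is only a minimal sketch, not something
shipped with umoci: the `build_layers` function and the `UMOCI` variable are
hypothetical names introduced here, and the step scripts are the placeholder
`./some_build_process_N` commands from the examples above. It uses only the
`unpack`, `repack --refresh-bundle` and `--image` invocations already shown.

```shell
#!/bin/sh
# Sketch only: build_layers and UMOCI are illustrative names, not part of umoci.
# UMOCI lets you point at a specific umoci binary (defaults to "umoci" in PATH).
UMOCI="${UMOCI:-umoci}"

# build_layers IMAGE BUNDLE STEP...
# Unpacks IMAGE into BUNDLE once, then runs each STEP with BUNDLE as its only
# argument. After every step the bundle is repacked with --refresh-bundle, so
# each step ends up in its own layer without any further unpack operations.
build_layers() {
    image="$1"
    bundle="$2"
    shift 2

    "$UMOCI" unpack --image "$image" "$bundle" || return 1
    for step in "$@"; do
        "$step" "$bundle" || return 1
        "$UMOCI" repack --refresh-bundle --image "$image" "$bundle" || return 1
    done
}

# Example usage (assumes the placeholder build scripts exist):
#   build_layers image_build_XYZ:wip bundle_a \
#       ./some_build_process_1 ./some_build_process_2 ./some_build_process_3 \
#     && umoci tag --image image_build_XYZ:wip final
```

The function stops at the first failing command, so a broken build step does
not get repacked into a layer.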