# OCI Support

* Author(s): Louis Dejardin, @loudej
* Approver: \<kpt-maintainer\>

>    Every feature will need design sign off and PR approval from a core
>    maintainer.  If you have not got in touch with anyone yet, you can leave
>    this blank and we will try to line someone up for you.

## Why

Systems that deal with packaging software or bundling configuration often have
an atomic, versionable artifact. This artifact can exist as a source of
truth independent from the source-controlled content from which it was built.

It is also very common for those packages to have an associated feed or repository
which can receive those packages as they are published, and make them available for
download as needed. In many companies, using `git` as the repository and source
of truth for production configuration comes with challenges.

This design document proposes to add `OCI` as an alternative to `git` for publishing and
distributing Kpt config packages. As a packaging format, it is well understood and documented.
As a repository format, it leverages existing container registries for pushing and pulling
config as image content. For security and production configuration management, if a
company has practices for managing Docker container images in private registries, then
the same practices and security model can be applied to config package images in private
registries.

https://github.com/GoogleContainerTools/kpt/issues/2300

## Design

### Design Assumptions

The first stage of OCI support comes from adding support for `oci` in the places
where `git` appears today.

An image tag is used in the same way a git branch or tag would be used.

An image digest is used in the same way a git commit would be used.

The scope of a single image is one root package, with any number of optional sub-packages.

The structure of the image is a single tar layer. The root `Kptfile` is in the base directory from the tar layer's point of view. The layer contains only the Kpt package files, with no entrypoint or executables.

A package image should not be confused with a container image. Container images are executable by software like `Docker`, whereas package images are purely configuration data.
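
The single-layer layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_package_layer` is hypothetical, the layer is left uncompressed, and nothing here reflects how `kpt` itself builds the layer. The point it shows is that archive names are relative to the package root, so the root `Kptfile` sits at the base of the layer.

```python
import io
import os
import tarfile


def build_package_layer(package_dir: str) -> bytes:
    """Pack a package directory into one uncompressed tar layer (illustrative)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for root, dirs, files in os.walk(package_dir):
            dirs.sort()  # visit sub-packages in a stable order
            for name in sorted(files):
                path = os.path.join(root, name)
                # Store names relative to the package root, so the root
                # Kptfile is archived simply as "Kptfile".
                arcname = os.path.relpath(path, package_dir).replace(os.sep, "/")
                tar.add(path, arcname=arcname, recursive=False)
    return buf.getvalue()
```

Only regular package files are added, which matches the assumption that a package image carries no entrypoint or executables.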

### Config changes

The `Kptfile` structures for `upstream` and `upstreamLock` have `oci` in addition to `git` properties. The `type` property also has the string `oci` added as an accepted value.

```yaml
upstream:
  type: oci
  oci:
    image: 'IMAGE:TAG'
upstreamLock:
  type: oci
  oci:
    image: 'IMAGE@DIGEST'
```

New versions of `kpt` will support an existing `Kptfile`. The file structure and `git` functionality are unchanged.

Existing versions of `kpt` will support a `Kptfile` with `upstream` based on `git` for the same reason. The structure and meaning of existing fields are not changed.

Existing versions of `kpt` will not support a `Kptfile` with `upstream` based on `oci`. The `oci` value for `type` and the missing `git` information will fail validation, so the `kpt` binary will need to be upgraded.
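
The compatibility behavior above can be illustrated with a small validation sketch. It is not the actual `kpt` validation code; `validate_upstream` and its `supported_types` parameter are hypothetical names used to show why a binary that only knows `git` rejects an `oci` upstream while a newer binary accepts both.

```python
def validate_upstream(upstream, supported_types):
    """Return a list of validation errors for a Kptfile `upstream` mapping."""
    errors = []
    kind = upstream.get("type")
    if kind not in supported_types:
        # An older binary that only knows "git" ends up here for "oci".
        errors.append(f"unsupported upstream type: {kind!r}")
    elif not isinstance(upstream.get(kind), dict):
        # The block named by `type` must be present.
        errors.append(f"upstream type {kind!r} requires a {kind!r} block")
    return errors
```

For example, `validate_upstream({"type": "oci", "oci": {"image": "IMAGE:TAG"}}, {"git"})` reports an unsupported type, while passing `{"git", "oci"}` returns no errors.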

### Command changes

### `kpt pkg get`

The argument that determines upstream today is parsed into `repo`, `ref`, and `path`, and is implicitly a `git` location.

To support `oci`, it will be necessary to extract different values in a way that's unambiguous. Unfortunately, OCI image names have no URI prefix, and are indistinguishable from a valid path or file name.

To solve this, using [Helm](https://helm.sh/docs/topics/registries/#other-subcommands) as an example, the prefix `oci://` can be used. This ensures that selecting the `oci` protocol isn't accidental, and it won't collide with other location formats that may be added.

```shell
# clone package as new folder
kpt pkg get oci://us-docker.pkg.dev/the-project-id/the-repo-name/the-package:v3 my-package
```

Because OCI image references already have a convention for `image:tag`, `:v3` should be used instead of `@v3` for the version. This makes it more intuitive how the value relates to the registry, and easier to cut and paste values.

### `kpt pkg get` sub-packages

It is possible to use `kpt pkg get` to add sub-packages to a target location.

Syntax for a sub-package target location is unchanged; it's a normal filesystem path.

Syntax for an OCI sub-package source location requires the ability to tell where an image name ends and a sub-package path inside that image begins. In `git` this requires an explicit `.git` extension at the transition, and in `oci` this requires a double slash.

```shell
# clone sub-package as new sub-folder
kpt pkg get oci://us-docker.pkg.dev/the-project-id/the-repo-name/the-package//simple/example:v3 my-package/simple/my-example
```
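
As a rough sketch of how such an argument could be split, the Python below separates the image name, the optional sub-package path after the double slash, and the trailing tag or digest. The names `OciLocation` and `parse_oci_location` are hypothetical and the rules are an assumption drawn from the examples above, not the parser `kpt` uses.

```python
from typing import NamedTuple


class OciLocation(NamedTuple):
    image: str    # registry and repository name, without tag or digest
    subpath: str  # sub-package directory inside the image, "" for the root
    ref: str      # ":TAG" or "@sha256:HEX", "" when omitted


def parse_oci_location(arg: str) -> OciLocation:
    prefix = "oci://"
    if not arg.startswith(prefix):
        raise ValueError("not an oci:// location: " + arg)
    rest = arg[len(prefix):]
    ref = ""
    if "@" in rest:
        # A digest is introduced by '@'.
        rest, digest = rest.split("@", 1)
        ref = "@" + digest
    elif ":" in rest.rsplit("/", 1)[-1]:
        # A tag is a ':' in the last path segment; an earlier ':' would be
        # a registry port such as localhost:5000.
        rest, tag = rest.rsplit(":", 1)
        ref = ":" + tag
    # The first double slash marks where the image name ends.
    image, _, subpath = rest.partition("//")
    return OciLocation(image, subpath, ref)
```

With the sub-package example above, this yields the image `us-docker.pkg.dev/the-project-id/the-repo-name/the-package`, the sub-path `simple/example`, and the reference `:v3`.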

### `kpt pkg update`

The command for update is not changed, but when the `upstream` is `oci`, the `@VERSION` is used to change the `upstream` image's `tag` or `digest` value.

To update to an image tag, `kpt pkg update @v14` and `kpt pkg update DIR@v14` will assign the `:v14` tag onto the upstream image.

```yaml
upstream:
  type: oci
  oci:
    image: us-docker.pkg.dev/the-project-id/the-repo-name/the-package:v14
```

To update to an upstream digest, `kpt pkg update @sha256:{SHA256_HEX}` and `kpt pkg update DIR@sha256:{SHA256_HEX}` will assign `@sha256:{SHA256_HEX}` as the new upstream image digest.

```yaml
upstream:
  type: oci
  oci:
    image: us-docker.pkg.dev/the-project-id/the-repo-name/the-package@sha256:8815143a333cb9d2cb341f10b984b22f3b8a99fe
```
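
The mapping from `@VERSION` onto the upstream image can be sketched as a plain string transformation. The helper name `apply_version` is hypothetical, and treating any version that starts with `sha256:` as a digest is an assumption based on the two cases described above.

```python
def apply_version(image: str, version: str) -> str:
    """Return `image` with its tag or digest replaced by `version`."""
    # Drop an existing digest, then an existing tag. A ':' only counts as a
    # tag separator when it appears after the last '/'.
    name = image.split("@", 1)[0]
    head, sep, last = name.rpartition("/")
    name = head + sep + last.split(":", 1)[0]
    if version.startswith("sha256:"):
        return name + "@" + version  # digests attach with '@'
    return name + ":" + version      # tags attach with ':'
```

Applied to `the-package:v3`, the version `v14` produces `the-package:v14`, and a `sha256:` version produces `the-package@sha256:...`.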

Calling `kpt pkg update` and `kpt pkg update DIR` will perform an update without changing the upstream image name.

At that point, if the `upstream` is an `image:tag`, the registry is queried to discover the current `image@digest` for that tag; otherwise the `upstream` value for `image@digest` is used. In either case, the `upstreamLock` is changed to point at that new `image@digest`.

The package contents of the old and new `upstreamLock` image digests are fetched to temp folders, and are the basis of the 3-way merge to update the target package.
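
The lock resolution step can be sketched as follows. This is a minimal illustration, not the `kpt` implementation: `plan_update` is a hypothetical name, and `resolve` stands in for whatever registry lookup turns an `image:tag` reference into the `image@digest` it currently points at.

```python
def plan_update(upstream_image, old_lock_image, resolve):
    """Pick the two upstream snapshots used as inputs to the 3-way merge.

    `resolve` is a hypothetical callable mapping "image:tag" to the
    "image@sha256:..." reference the tag currently points at.
    """
    if "@" in upstream_image:
        new_lock_image = upstream_image           # already pinned to a digest
    else:
        new_lock_image = resolve(upstream_image)  # look the tag up in the registry
    return {
        "base": old_lock_image,        # contents the local copy was last synced with
        "upstream": new_lock_image,    # contents to merge in; becomes the new upstreamLock
        "changed": old_lock_image != new_lock_image,
    }
```

The third side of the merge is the local package itself; when `changed` is false there is nothing new to merge.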

### `kpt pkg diff`

The `kpt pkg diff` command is identical to `kpt pkg update` in the way that the `[PKG_PATH@VERSION]` argument is mapped to OCI concepts.

### Command additions

Although it is possible to create and push an OCI image using a combination of commands like `tar` and `gcrane`, that
doesn't provide a very complete end-to-end experience. Because kpt would already be built with the same OCI Go module used
by `gcrane`, it is not difficult to support additional commands to pull and push package contents between local folders
and remote images.

### `kpt pkg pull`

```
Usage: kpt pkg pull oci://{IMAGE[:TAG|@sha256:DIGEST]} [DIR]
  DIR                         Destination folder for image contents. Default folder name is the last part of the IMAGE path.
  IMAGE[:TAG|@sha256:DIGEST]  Name of image to pull contents from, with optional TAG or DIGEST. Default TAG is `latest`
```

This command is the reverse of push. An image can be pulled from a repository to a local folder, modified, and pushed
back to the same location, the same location with a different TAG, or an entirely different location.

The target DIR is optional, following the conventions of `kpt pkg get`, and will default to the final image/path segment.

`kpt pkg pull` works on git URIs as well. This may be used, for example, to mirror a set of known blueprints into a private
OCI registry.
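
The default DIR rule for OCI locations can be sketched like this. The helper name `default_pull_dir` is hypothetical, and the handling of tags and digests is an assumption based on the usage text above.

```python
def default_pull_dir(location: str) -> str:
    """Derive the default DIR from the last segment of the IMAGE path."""
    prefix = "oci://"
    rest = location[len(prefix):] if location.startswith(prefix) else location
    rest = rest.split("@", 1)[0]                # drop a digest, if any
    last = rest.rstrip("/").rsplit("/", 1)[-1]  # final path segment
    return last.split(":", 1)[0]                # drop a tag, if any
```

For `oci://us-docker.pkg.dev/the-project-id/the-repo-name/the-package:v14` this gives `the-package`.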

### `kpt pkg push`

```
Usage: kpt pkg push [DIR@VERSION] [--origin oci://{IMAGE[:TAG]}] [--increment]
  DIR@VERSION     Folder containing package root Kptfile. Default is current directory.
                  Optional @VERSION changes tag or branch to push onto. Default is most recently pulled or pushed tag.
  --origin        Name of image to push contents onto, with optional TAG to assign to resulting commit.
                  Default is to use most recently pulled/pushed image. Required if Kptfile does not have an origin.
  --increment     Increase the version by 1 while pushing. Default is to leave the origin's TAG or DIR@VERSION unchanged.
                  The Kptfile's image TAG is also updated to the new value.
```

This command will `tar` the contents of the package into a single image layer, and push it into the OCI repository. For
Google Artifact Registry and Google Container Registry, the current `gcloud auth` SSO credentials are used.

The simplest form of the command is `kpt pkg push` or `kpt pkg push DIR`, which will push the current contents back to
the IMAGE:TAG location that was saved when `kpt pkg pull` was run.

The syntax `kpt pkg push @VERSION` or `kpt pkg push DIR@VERSION` will push back to the image location it came from, but with a new TAG name or version. Examples are `kpt pkg push @draft` or `kpt pkg push @v4`.

If the Kptfile was not obtained by `kpt pkg pull` (for example, it's a new package from `kpt pkg init` or `kpt pkg get`), then
the first `kpt pkg push` will require an `--origin oci://IMAGE:TAG` option to provide the target location. It is only necessary on the first
call.

Finally, if the IMAGE's TAG value is a valid version number, the `--increment` switch can be used to add 1 to the current value before pushing.

In the simplest case `v1` is changed to `v2`, and `1` is changed to `2`, but any TAG that is a valid semver (with optional leading 'v') will have the smallest part of the number incremented. So `v1.0` becomes `v1.1`, `v1.0.0` becomes `v1.0.1`, and `v4.1.9-alpha` becomes `v4.1.10-alpha`.
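
A minimal sketch of the increment rule, assuming the tag is an optional `v`, a dotted run of numbers, and an optional suffix that starts with `-` or `+`. The function name `increment_tag` is hypothetical; the expected results are the ones listed above.

```python
import re

# optional "v", dotted numbers, optional pre-release or build suffix
_TAG = re.compile(r"^(v?)(\d+(?:\.\d+)*)((?:[-+].*)?)$")


def increment_tag(tag: str) -> str:
    """Add 1 to the smallest numeric part of a version-like tag."""
    match = _TAG.match(tag)
    if match is None:
        raise ValueError("TAG is not a version number: " + tag)
    prefix, numbers, suffix = match.groups()
    parts = numbers.split(".")
    parts[-1] = str(int(parts[-1]) + 1)  # bump only the last component
    return prefix + ".".join(parts) + suffix
```

Tags such as `draft` or `latest` do not match the pattern, so this sketch rejects them rather than guessing at a next value.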

#### Comparison of `pkg get` and `pkg pull`

Starting with a simple root package, and an orange variant with root as the upstream:

```
-- root
  \-- orange {upstream: root}
```

The purpose of `kpt pkg get` is to create a new leaf node. This is done by creating the initial copy of the new leaf package in a
local folder. This has the side effects of altering the Kptfile name, pointing the upstream values at the source, and making appropriate
changes to sub-package metadata.

As an example, after running `kpt pkg get scheme://repo/root green` and `kpt pkg get scheme://repo/orange blue` the `green` and
`blue` local folder packages are appended to the inheritance tree like this:

```
-- root
  \-- orange {upstream: root}
  | `-- blue {upstream: orange}  ** working copy in ./blue **
  \-- green {upstream: root}     ** working copy in ./green **
```

By comparison, `kpt pkg pull` does not create a new package node or identity; it only extracts a copy of existing package
contents to a working directory. In this example, if the user additionally ran `kpt pkg pull scheme://repo/root root` and
`kpt pkg pull scheme://repo/orange orange` the overall state would be this:

```
-- root                          ** working copy in ./root **
  \-- orange {upstream: root}    ** working copy in ./orange **
  | `-- blue {upstream: orange}  ** working copy in ./blue **
  \-- green {upstream: root}     ** working copy in ./green **
```

### Alternatives to push/pull

There are several ways that pull and push could appear as commands. Those two names are very conventional, but
alternatives to consider could be:

### `kpt pkg copy`

```
Usage: kpt pkg copy {SOURCE} {DEST}
  SOURCE  Package source location: a local DIR, or `oci://` image, or git repo and path
  DEST    Package destination: a local DIR, or `oci://` image.
```

Puts a copy of the SOURCE package at the DEST location. The package contents would be entirely unchanged by this operation (unlike `kpt pkg get`).

To pull from remote image to local folder:

```
kpt pkg copy \
  oci://us-docker.pkg.dev/the-project-id/the-repo-name/the-package:v14 \
  the-package
```

To push from local folder to new remote image tag:

```
kpt pkg copy \
  the-package \
  oci://us-docker.pkg.dev/the-project-id/the-repo-name/the-package:v15
```

To copy a package image from one OCI repo to another:

```
kpt pkg copy \
  oci://us-docker.pkg.dev/the-project-id/dev-blueprints/the-package:v25 \
  oci://us-docker.pkg.dev/the-project-id/prod-blueprints/the-package:v25
```

To copy a package from a git location to an OCI repo:

```
kpt pkg copy \
  https://github.com/GoogleCloudPlatform/blueprints.git/catalog/gke@main \
  oci://us-docker.pkg.dev/the-project-id/gcp-catalog/gke:latest
```
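
One way to picture how a `copy` command might tell its arguments apart is the sketch below. Both function names are hypothetical, and recognizing git locations by their explicit `.git` extension is an assumption taken from the `kpt pkg get` discussion, not a settled rule.

```python
import re


def classify_location(arg: str) -> str:
    """Classify a SOURCE or DEST argument as 'oci', 'git', or 'dir'."""
    if arg.startswith("oci://"):
        return "oci"
    # Assumption: a git location carries an explicit `.git` extension
    # between the repo and the package path.
    if re.search(r"\.git(/|@|$)", arg):
        return "git"
    return "dir"


def validate_copy(source: str, dest: str) -> tuple:
    """Reject destinations the usage text does not list (git)."""
    kinds = (classify_location(source), classify_location(dest))
    if kinds[1] == "git":
        raise ValueError("DEST must be a local DIR or an oci:// image")
    return kinds
```

Under these assumptions, the four examples above classify as `oci` to `dir`, `dir` to `oci`, `oci` to `oci`, and `git` to `oci`.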

## User Guide

### Creating a package repository

Before kpt packages can be pushed and pulled as OCI images, a suitable repository
must be created. Google Artifact Registry and Google Container Registry are both
excellent choices.

```shell
# Choose names and locations
LOCATION="us"
PROJECT_ID="kpt-demo-73823"
REPOSITORY_NAME="blueprints"

# Base name for any images in this repository
REPOSITORY="${LOCATION}-docker.pkg.dev/${PROJECT_ID}/${REPOSITORY_NAME}"

# Create the repository
gcloud artifacts repositories create --location="${LOCATION}" --repository-format=docker --project="${PROJECT_ID}" "${REPOSITORY_NAME}"
```

### Creating and pushing a new package

Creating a new package is no different from before. When the package is ready to publish, however, the `kpt pkg push` command
is used instead of source control operations.

```shell
# A package in a new directory
mkdir hello-world
kpt pkg init hello-world --description="A simple blueprint"

# Store the contents in the repository, tagged as v1
kpt pkg push hello-world --origin oci://${REPOSITORY}/hello-world:v1

# The local files are not needed any more, pushing has stored them all
rm -r hello-world
```

### Pulling and updating a package

Because the package folder was discarded earlier, the `kpt pkg pull` command
is used to place the contents of a particular version at a location. This set of
commands may be run in `Cloud Build` steps as well, if you are automating the
publication of packages as part of a CI/CD process.

```shell
# Recreate the folder and extract the pulled image
kpt pkg pull oci://${REPOSITORY}/hello-world:v1 hello-world

# Add a sub-package from a git repo
kpt pkg get https://github.com/GoogleCloudPlatform/blueprints.git/catalog/bucket hello-world/my-bucket

# Render to be sure contents are hydrated, and push to a new version tag
kpt fn render hello-world
kpt pkg push hello-world --origin oci://${REPOSITORY}/hello-world:v2
```

Similar to container image tags, package image tags like `:v1` and `:v2` above may be used in any
number of ways based on your preferred workflows. The tag `:latest` is used by default, which suits workflows where all
pulls and pushes read from and overwrite the same location. Semantic tags like `:draft` and
environmental tags like `:dev`, `:qa`, and `:prod` may also be used.

No matter what tags are used, the image repository and `kpt` CLI will treat them as alphanumeric
labels.

### Using OCI repository as an upstream

In addition to providing storage for packages, an OCI registry may also
be used as a source of upstream images to clone. The `oci://` prefix on this
command is required to ensure the argument is read as an OCI image rather than a git location or file path.

```shell
# Clone the hello-world v1 blueprint into a new folder
kpt pkg get oci://${REPOSITORY}/hello-world:v1 greetings-planet

# Push the results to the repository, using default `latest` tag in this example
kpt pkg push greetings-planet --origin oci://${REPOSITORY}/greetings-planet
```

Looking in the `greetings-planet/Kptfile` at this point will show that
the `hello-world:v1` image is the `upstream`, and the `upstreamLock` will show
exactly the digest that this clone is up-to-date with.

```yaml
apiVersion: kpt.dev/v1
kind: Kptfile
metadata:
  name: greetings-planet
upstream:
  type: oci
  oci:
    image: us-docker.pkg.dev/kpt-demo-73823/blueprints/hello-world:v1
  updateStrategy: resource-merge
upstreamLock:
  type: oci
  oci:
    image: us-docker.pkg.dev/kpt-demo-73823/blueprints/hello-world@sha256:1632e00af3fe858c5e3b3f9e75c16e6327449155
```

### Adding a subpackage from an OCI upstream subfolder

Often a folder inside a package is meant to be used as a way to create "more of the same".

To use an OCI image subfolder as the source of a subpackage, the path is added in a
way that's distinct from the image itself.

```shell
# Clone the my-bucket subfolder of the hello-world image as a new subpackage
kpt pkg get oci://${REPOSITORY}/hello-world//my-bucket:v1 greetings-planet/another-bucket
```

The `greetings-planet` package will now contain both a `greetings-planet/my-bucket` and a
`greetings-planet/another-bucket` folder. The contents in both locations will now receive changes
when the upstream `hello-world/my-bucket` is updated.

### Updating package with upstream changes

The value of the upstream image tag is used to `kpt pkg update` to a specific version.
This works whether or not the tag looks like a version number.

```shell
# Update the greetings-planet by applying any differences between the upstreamLock digest and the `v2` tag
kpt pkg update greetings-planet@v2

# Overwrite the `greetings-planet:latest` image with the folder contents
kpt pkg push greetings-planet --origin oci://${REPOSITORY}/greetings-planet
```

The Kptfile will now show that the `upstream` and `upstreamLock` have both been changed.

```yaml
apiVersion: kpt.dev/v1
kind: Kptfile
metadata:
  name: greetings-planet
upstream:
  type: oci
  oci:
    image: us-docker.pkg.dev/kpt-demo-73823/blueprints/hello-world:v2
  updateStrategy: resource-merge
upstreamLock:
  type: oci
  oci:
    image: us-docker.pkg.dev/kpt-demo-73823/blueprints/hello-world@sha256:a6f1ed69c6ab51e2a148f6d4926bccb24c843887
```

Just like with a git upstream, it is also possible to `kpt pkg update` without providing a different
tag or version value. This is similar to pulling from a remote branch where the name of the branch does
not change but the latest commit on that branch is a different hash.

In that case the Kptfile `upstream` tag will not change, but if that tag has been overwritten with
new contents then the differences between `upstreamLock` and current contents will be applied to the local copy,
and the `upstreamLock` will be changed to the current digest.

### Updating package to a specific upstream push

Much like you can `kpt pkg update` to a specific commit hash in git, you can update to an
exact image digest with OCI. This is an even more precise reference than a version number because,
like git commits, the image digest is derived from the package file contents and cannot be forged or altered.

```shell
# Update instead to an exact image digest of the upstream location
kpt pkg update greetings-planet@sha256:3b42daa41102fa83bce07bd82a72edcd691868d6
```

The resulting Kptfile in the local folder will look like this.

```yaml
apiVersion: kpt.dev/v1
kind: Kptfile
metadata:
  name: greetings-planet
upstream:
  type: oci
  oci:
    image: us-docker.pkg.dev/kpt-demo-73823/blueprints/hello-world@sha256:3b42daa41102fa83bce07bd82a72edcd691868d6
  updateStrategy: resource-merge
upstreamLock:
  type: oci
  oci:
    image: us-docker.pkg.dev/kpt-demo-73823/blueprints/hello-world@sha256:3b42daa41102fa83bce07bd82a72edcd691868d6
```
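
The content-addressed nature of a digest can be illustrated with a short sketch. The helper name `blob_digest` is hypothetical. In a registry the image digest is computed over the image manifest, which in turn records the digest of the layer, so any change to the package files changes the result. A real `sha256` digest has 64 hex characters, so the shorter values shown in this document should be read as illustrative placeholders.

```python
import hashlib


def blob_digest(data: bytes) -> str:
    """Return the OCI-style digest string for a blob of bytes."""
    return "sha256:" + hashlib.sha256(data).hexdigest()
```

The same bytes always produce the same digest, and two different blobs produce different digests, which is what makes a digest a more precise reference than a tag.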

## Open Issues/Questions

> Please list any open questions here in the following format:
>
> ### \<Question\>
>
> Resolution: Please list the resolution if resolved during the design process or
> specify __Not Yet Resolved__

### What additional container registries should be supported?

The protocol and information are the same for any registry. It would mainly be a question
of how the credentials for the call are provided.

### What commands on existing kpt binaries will work on a Kptfile with `oci`?

It is possible that `kpt` commands that do not process `upstream` structures
will not require an update to work correctly. `kpt fn` and `kpt live` commands
should be tested to see how they behave.

## Alternatives Considered

If there is an industry precedent or alternative approaches please list them
here as well as citing *why* you decided not to pursue those paths.

### \<Approach\>

Links and description of the approach, the pros and cons identified during the
design.
   473  Links and description of the approach, the pros and cons identified during the 
   474  design.