# Archeio

αρχείο (archeío) is roughly Greek for "registry".

This binary is a custom redirect/alias server for the Kubernetes project's
OCI artifact ("docker image") hosting.

Design details will be documented here as they mature.

For more current details see also: https://github.com/kubernetes/k8s.io/wiki/New-Registry-url-for-Kubernetes-(registry.k8s.io)

**NOTE**: The code in this binary is **not** intended to be fully reusable;
it is the most efficient and expedient implementation of
Kubernetes SIG K8s-Infra's needs. However, some of the packages under
[`pkg/`](./../../pkg/) may be useful if you have a similar problem,
and they should be fairly general and reusable.

Please also see the main repo README, in particular [the "stability" note](../../README.md#stability).

-----

For a rough TL;DR of the current design:

- Images are hosted primarily in [Artifact Registry][artifact-registry] instances as the source of truth
  - Why AR?
    - Kubernetes has non-trivial tooling for managing, securing, monitoring, etc. our registries using GCP APIs that fill gaps in the OCI distribution spec and otherwise allow synchronization and discovery of content, notably https://github.com/opencontainers/distribution-spec/issues/222
    - We have directly migrated all of this infrastructure from GCR (previously k8s.gcr.io) to AR with ~no code changes
    - Until recently our infrastructure funding has primarily been GCP (now matched by AWS) and GCP remains a significant source of funding
- Mirrors of *content-addressed* layers are hosted in S3 buckets in AWS
  - Why mirror only [Content-Addressed][content-addressed] Image APIs?
    - Image layers (which are content-addressed) in particular account for the vast majority of bandwidth, and therefore cost. Serving them from one cloud to another is expensive.
    - Content-addressed APIs are relatively safe to serve from untrusted or less-secured hosts, since all major clients confirm that the result matches the requested digest
- We detect the client IP address and match it against the cloud providers we use, in order to serve content-addressed API calls from the most local and cost-effective copy
- Other API calls (getting and listing tags, etc.) are redirected to the regional upstream source-of-truth Artifact Registries

This allows us to offload substantial bandwidth securely without having to fully
implement a registry from scratch, while maintaining the project's existing security
controls around the GCP source registries (implemented elsewhere in the Kubernetes project).
We only re-route some content-addressed storage requests to additional hosts.

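As a rough sketch of that split, the routing decision could look like the following. The host names, the `redirectTarget` function, and the mirror's URL layout are assumptions made for illustration only; the blob path shape (`/v2/<name>/blobs/<digest>`) comes from the OCI distribution spec. Only blob requests from clients judged to be near the mirror are sent there, and everything else falls through to the upstream registry.

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical hosts, used only for illustration.
const (
	upstreamRegistry = "https://upstream-registry.example.com"
	blobMirror       = "https://blob-mirror.example.com"
)

// blobPath matches OCI distribution blob requests of the form
// /v2/<name>/blobs/sha256:<64 hex characters>.
var blobPath = regexp.MustCompile(`^/v2/(.+)/blobs/(sha256:[a-f0-9]{64})$`)

// redirectTarget returns the URL a request path would be redirected to.
// clientNearMirror stands in for the result of the client IP match.
func redirectTarget(path string, clientNearMirror bool) string {
	if m := blobPath.FindStringSubmatch(path); m != nil && clientNearMirror {
		// Assumed mirror layout: blobs are keyed by digest alone.
		return blobMirror + "/blobs/" + m[2]
	}
	// Tags, manifests, and non-local blob requests go to the source of truth.
	return upstreamRegistry + path
}

func main() {
	const digest = "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
	fmt.Println(redirectTarget("/v2/pause/blobs/"+digest, true))
	fmt.Println(redirectTarget("/v2/pause/blobs/"+digest, false))
	fmt.Println(redirectTarget("/v2/pause/tags/list", true))
}
```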
Clients still need to either pull by digest (`registry.k8s.io/foo@sha256:...`),
verify sigstore signatures, or else trust that the redirector instance is secure.
They do not need to trust the S3 buckets or any additional future content-addressed storage hosts.

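The reason the storage hosts need not be trusted is that a client can check any content it receives against the digest it asked for. A minimal sketch of that check, assuming a `sha256:<hex>` digest string and a hypothetical `verifyDigest` helper (real clients perform this internally and may support other algorithms):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// verifyDigest returns nil when content hashes to the expected digest,
// which must be given in "sha256:<hex>" form.
func verifyDigest(content []byte, expected string) error {
	const prefix = "sha256:"
	if !strings.HasPrefix(expected, prefix) {
		return fmt.Errorf("unsupported digest %q", expected)
	}
	sum := sha256.Sum256(content)
	got := hex.EncodeToString(sum[:])
	if got != strings.TrimPrefix(expected, prefix) {
		return fmt.Errorf("digest mismatch: got sha256:%s, want %s", got, expected)
	}
	return nil
}

func main() {
	content := []byte("example layer bytes")
	sum := sha256.Sum256(content)
	digest := "sha256:" + hex.EncodeToString(sum[:])

	// Untampered content passes the check.
	fmt.Println(verifyDigest(content, digest))

	// Content altered by an untrusted host fails it.
	fmt.Println(verifyDigest([]byte("tampered bytes"), digest))
}
```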
We maintain relatively tight control over the production redirector instance and
the source registries. Very few contributors have access to this infrastructure.

We have a development instance at https://registry-sandbox.k8s.io which is
*not* supported for any usage outside of the development of this project and
may or may not be working at any given time.
Changes will be deployed there before we deploy to production, and will be exercised
by a subset of Kubernetes' own CI.

Mirroring content-addressed content to object storage is currently handled by [`cmd/geranos`](./../geranos).

For more detail see:
- How requests are handled: [docs/request-handling.md](./docs/request-handling.md)
- How we test registry.k8s.io changes: [docs/testing.md](./docs/testing.md)
- For IP matching info for both AWS and GCP ranges: [`pkg/net/cloudcidrs`](./../../pkg/net/cloudcidrs)

----

Historical Context:

**You must join one of the open community mailing lists below to access the original design doc.**

The original design doc is shared with members of
[dev@kubernetes.io](https://groups.google.com/a/kubernetes.io/g/dev).
Anyone can join this list and gain access to read
[the document](https://docs.google.com/document/d/1yNQ7DaDE5LbDJf9ku82YtlKZK0tcg5Wpk9L72-x2S2k/).
It is not accessible to accounts that are not members of the Kubernetes mailing list
due to organization constraints, and joining the list is the **only** way to gain
access. See https://git.k8s.io/community/community-membership.md

It does not fully reflect the current design anyhow, but some may find it
interesting.

Originally the project primarily needed to take advantage of an offer from Amazon
to begin paying for AWS user traffic, which was the majority of our traffic and
was costly due to the high volume of egress traffic between GCP and AWS.

In addition, in order to get the registry.k8s.io domain in place, we initially
served only a trivial redirect to the existing registry
(https://k8s.gcr.io), so we could safely start to move users / clients to the new domain
that would eventually serve the more complex version.

Since then we've redesigned a bit to make populating content into AWS async and
not blocked on the image promoter, as well as extending our geo-routing approach
to detect and route users on dimensions other than "is a known AWS IP in a known AWS region".

More changes will come in the future, and these implementation details, while documented,
**CANNOT** be depended on.

[artifact-registry]: https://cloud.google.com/artifact-registry
[content-addressed]: https://en.wikipedia.org/wiki/Content-addressable_storage