k8s.io/registry.k8s.io@v0.3.1/docs/debugging.md

     1  # Debugging issues with registry.k8s.io
     2  
     3  registry.k8s.io is a Kubernetes container image registry that generally behaves like an [OCI](https://github.com/opencontainers/distribution-spec)-compliant registry. Because registry.k8s.io is a proxy that routes traffic to the closest available source, you need connectivity to several domains to download images. For best performance, consider running your own registry mirror.
     4  
     5  When you are debugging issues, make sure you run these commands on the node that is attempting to run images. Things may be working fine on your laptop, but not on the Kubernetes node.
     6  
     7  <!--TODO: identify what this looks like on s3 etc.-->
     8  > **Note**
     9  >
    10  > If you see a [403 error][http-403] like `Your client does not have permission to get URL`,
    11  > this error is not specific to the Kubernetes project / registry.k8s.io and
    12  > you need to work with your cloud vendor / service provider to get unblocked
    13  > by GCP.
    14  >
    15  > Please file an issue with your provider; the Kubernetes project does not
    16  > control this, and it is not specific to us.
    17  
    18  ## Verify DNS resolution
    19  
    20  You may use the `dig` or `nslookup` command to validate DNS resolution of the registry.k8s.io domain or any domain it references. For example, running `dig registry.k8s.io` should return an answer that contains:
    21  
    22  ```log
    23  ;; ANSWER SECTION:
    24  registry.k8s.io.	3600	IN	A	34.107.244.51
    25  ```
    26  
    27  If you cannot resolve a domain, check your DNS configuration, which is often set in your `/etc/resolv.conf` file.
    28  
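When several domains need checking, a small loop can save some typing. The following is a minimal sketch, assuming `getent` is available on the node (it ships with most glibc-based Linux distributions); the function name `check_dns` is hypothetical and not part of any Kubernetes tooling.

```sh
#!/bin/sh
# check_dns (hypothetical helper): report whether each given domain resolves.
# Returns non-zero if at least one domain fails to resolve.
check_dns() {
  failed=0
  for domain in "$@"; do
    # getent resolves through the system's name service switch,
    # which on typical setups means the hosts file and then DNS.
    if getent hosts "$domain" >/dev/null 2>&1; then
      echo "OK    $domain"
    else
      echo "FAIL  $domain"
      failed=1
    fi
  done
  return "$failed"
}

# Example: check_dns registry.k8s.io
```

Pass every domain your node needs as an argument. A `FAIL` line points back to the DNS configuration rather than to the registry itself.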
    29  ## Verify HTTP connectivity
    30  
    31  You may use `curl` or `wget` to validate HTTP connectivity. For example, running `curl -v https://registry.k8s.io/v2/` should return a response that contains:
    32  
    33  ```log
    34  < HTTP/2 200 
    35  < docker-distribution-api-version: registry/2.0
    36  < x-cloud-trace-context: ca200d1c5a504b919e999b0cf80e3b71
    37  < date: Fri, 17 Mar 2023 09:13:18 GMT
    38  < content-type: text/html
    39  < server: Google Frontend
    40  < content-length: 0
    41  < via: 1.1 google
    42  < alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
    43  < 
    44  ```
    45  
    46  If you do not have HTTP connectivity, check your firewall or HTTP proxy settings.
    47  
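For scripted checks, the status code alone is often enough. The sketch below assumes `curl` is installed; `http_status` and `explain_status` are hypothetical helper names, and the suggested next steps simply restate the guidance given in this document.

```sh
#!/bin/sh
# http_status (hypothetical helper): print only the HTTP status code for a URL.
# curl prints 000 when no HTTP response was received (DNS, TCP or TLS failure).
http_status() {
  curl --silent --output /dev/null --max-time 10 --write-out '%{http_code}' "$1"
}

# explain_status (hypothetical helper): map a status code to a next step.
explain_status() {
  case "$1" in
    200) echo "reachable" ;;
    403) echo "blocked: contact your cloud vendor or service provider" ;;
    000) echo "no HTTP response: check DNS, firewall and proxy settings" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Example: explain_status "$(http_status https://registry.k8s.io/v2/)"
```

Keeping the network call and the interpretation in separate functions lets you reuse `explain_status` on codes collected elsewhere, for example from a proxy log.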
    48  ## Verify image repositories and tags
    49  
    50  You may use `crane` or `oras` to validate the available tags in the registry. You may also use [https://explore.ggcr.dev/?repo=registry.k8s.io](https://explore.ggcr.dev/?repo=registry.k8s.io) to verify that an image repository and tag exist, but only running these commands on your node verifies that the node can access them. For example, `crane ls registry.k8s.io/pause` or `oras repo tags registry.k8s.io/pause` will return:
    51  
    52  ```log
    53  0.8.0
    54  1.0
    55  2.0
    56  3.0
    57  3.1
    58  3.2
    59  3.3
    60  3.4.1
    61  3.5
    62  3.6
    63  3.7
    64  3.8
    65  3.9
    66  go
    67  latest
    68  sha256-7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097.sig
    69  sha256-9001185023633d17a2f98ff69b6ff2615b8ea02a825adffa40422f51dfdcde9d.sig
    70  test
    71  test2
    72  ```
    73  
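Tag lists can be long, so it may help to test for one exact tag instead of reading the output by eye. This is a minimal sketch; `tag_exists` is a hypothetical helper that reads the tag list on standard input.

```sh
#!/bin/sh
# tag_exists (hypothetical helper): succeed if the tag in $1 appears in the
# newline-separated tag list on stdin, as printed by `crane ls`.
tag_exists() {
  # -F: fixed string (dots are literal), -x: match the whole line, -q: quiet.
  grep -Fxq -- "$1"
}

# Example: crane ls registry.k8s.io/pause | tag_exists 3.9 && echo found
```

Matching the whole line matters here: a plain substring search for `3.1` would also match `3.10`.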
    74  ## Verify image pulls
    75  
    76  Since registry.k8s.io proxies image components to the nearest source, you should validate the ability to pull images. Test this on the machine that will run the image, which is often a node in your Kubernetes cluster. The location you pull image components from depends on the source IP address of the node.
    77  
    78  You may use commands such as `crane`, `oras`, `crictl` or `docker` to verify the ability to pull an image. For example, if you run `crane pull --verbose registry.k8s.io/pause:3.9 pause.tgz`, you will see it query registry.k8s.io first and then at least two other domains to download the image. If things are working correctly, running `crane pull --verbose registry.k8s.io/pause:3.9 pause.tgz 2>&1 | grep 'GET https'` prints output like the following (captured from Colorado):
    79  
    80  ```log
    81  2023/03/17 04:45:48 --> GET https://registry.k8s.io/v2/
    82  2023/03/17 04:45:48 --> GET https://registry.k8s.io/v2/pause/manifests/3.9
    83  2023/03/17 04:45:48 --> GET https://us-west1-docker.pkg.dev/v2/k8s-artifacts-prod/images/pause/manifests/3.9
    84  2023/03/17 04:45:48 --> GET https://registry.k8s.io/v2/pause/manifests/sha256:8d4106c88ec0bd28001e34c975d65175d994072d65341f62a8ab0754b0fafe10
    85  2023/03/17 04:45:48 --> GET https://us-west1-docker.pkg.dev/v2/k8s-artifacts-prod/images/pause/manifests/sha256:8d4106c88ec0bd28001e34c975d65175d994072d65341f62a8ab0754b0fafe10
    86  2023/03/17 04:45:49 --> GET https://registry.k8s.io/v2/pause/blobs/sha256:e6f1816883972d4be47bd48879a08919b96afcd344132622e4d444987919323c
    87  2023/03/17 04:45:49 --> GET https://prod-registry-k8s-io-us-west-2.s3.dualstack.us-west-2.amazonaws.com/containers/images/sha256%3Ae6f1816883972d4be47bd48879a08919b96afcd344132622e4d444987919323c
    88  2023/03/17 04:45:49 --> GET https://registry.k8s.io/v2/pause/blobs/sha256:61fec91190a0bab34406027bbec43d562218df6e80d22d4735029756f23c7007 [body redacted: omitting binary blobs from logs]
    89  2023/03/17 04:45:49 --> GET https://prod-registry-k8s-io-us-west-2.s3.dualstack.us-west-2.amazonaws.com/containers/images/sha256%3A61fec91190a0bab34406027bbec43d562218df6e80d22d4735029756f23c7007 [body redacted: omitting binary blobs from logs]
    90  ```
    91  
    92  From this location, the pull command accesses registry.k8s.io, us-west1-docker.pkg.dev and prod-registry-k8s-io-us-west-2.s3.dualstack.us-west-2.amazonaws.com. Your node needs DNS and HTTP access to these domains (or their equivalents for your location) to pull images.
    93  
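Because the domains differ by location, it can be useful to reduce the verbose pull output to the distinct hosts contacted from your own node. The sketch below assumes the log format shown above (`GET https://...`); `extract_hosts` is a hypothetical helper name.

```sh
#!/bin/sh
# extract_hosts (hypothetical helper): read verbose pull output on stdin and
# print each distinct HTTPS host that was requested, one per line.
extract_hosts() {
  grep -o 'GET https://[^/ ]*' | sed 's|^GET https://||' | sort -u
}

# Example:
#   crane pull --verbose registry.k8s.io/pause:3.9 pause.tgz 2>&1 | extract_hosts
```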
    94  If you don't have SSH access to your node, you can also run these commands on it with `kubectl run`:
    95  
    96  ```sh
    97  kubectl run --rm -it crane --restart=Never --image=gcr.io/go-containerregistry/crane --overrides='{"spec": {"hostNetwork":true}}' -- pull --verbose registry.k8s.io/pause:3.9 /dev/null
    98  ```
    99  
   100  ## Example Logs
   101  
   102  If there are problems accessing registry.k8s.io, pods are likely to fail to start with an `ErrImagePull` status. The `kubectl describe pod` command may give you more details:
   103  
   104  ```log
   105    Warning  Failed     2s (x2 over 16s)  kubelet            Failed to pull image "registry.k8s.io/pause:3.10": rpc error: code = NotFound desc = failed to pull and unpack image "registry.k8s.io/pause:3.10": failed to resolve reference "registry.k8s.io/pause:3.10": registry.k8s.io/pause:3.10: not found
   106    Warning  Failed     2s (x2 over 16s)  kubelet            Error: ErrImagePull
   107  ```
   108  
   109  If you check your kubelet log, for example with `journalctl -xeu kubelet`, you might see:
   110  
   111  ```log
   112  Mar 17 11:33:05 kind-control-plane kubelet[804]: E0317 11:33:05.192844     804 kuberuntime_manager.go:862] container &Container{Name:my-puase-container,Image:registry.k8s.io/pause:3.10,Command:[],Args:[],WorkingDir:,Ports:[]ContainerPort{},Env:[]EnvVar{},Resources:ResourceRequirements{Limits:ResourceList{},Requests:ResourceList{},},VolumeMounts:[]VolumeMount{VolumeMount{Name:kube-api-access-4bv66,ReadOnly:true,MountPath:/var/run/secrets/kubernetes.io/serviceaccount,SubPath:,MountPropagation:nil,SubPathExpr:,},},LivenessProbe:nil,ReadinessProbe:nil,Lifecycle:nil,TerminationMessagePath:/dev/termination-log,ImagePullPolicy:IfNotPresent,SecurityContext:nil,Stdin:false,StdinOnce:false,TTY:false,EnvFrom:[]EnvFromSource{},TerminationMessagePolicy:File,VolumeDevices:[]VolumeDevice{},StartupProbe:nil,} start failed in pod my-pause_default(4b642716-1dba-44d4-833b-1eccd6b6ca7a): ErrImagePull: rpc error: code = NotFound desc = failed to pull and unpack image "registry.k8s.io/pause:3.10": failed to resolve reference "registry.k8s.io/pause:3.10": registry.k8s.io/pause:3.10: not found
   113  ```
   114  
   115  You may see similar errors in the containerd log (with something like `journalctl -xeu containerd`):
   116  
   117  ```log
   118  Mar 17 11:33:04 kind-control-plane containerd[224]: time="2023-03-17T11:33:04.658642300Z" level=info msg="PullImage \"registry.k8s.io/pause:3.10\""
   119  Mar 17 11:33:05 kind-control-plane containerd[224]: time="2023-03-17T11:33:05.189169600Z" level=info msg="trying next host - response was http.StatusNotFound" host=registry.k8s.io
   120  Mar 17 11:33:05 kind-control-plane containerd[224]: time="2023-03-17T11:33:05.191777300Z" level=error msg="PullImage \"registry.k8s.io/pause:3.10\" failed" error="rpc error: code = NotFound desc = failed to pull and unpack image \"registry.k8s.io/pause:3.10\": failed to resolve reference \"registry.k8s.io/pause:3.10\": registry.k8s.io/pause:3.10: not found"
   121  ```
   122  
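When a pod references several images, it may help to pull just the failing image references out of the event text. This is a rough sketch that assumes the kubelet event wording shown above (`Failed to pull image "..."`); `failed_images` is a hypothetical helper, and the wording may differ across Kubernetes versions.

```sh
#!/bin/sh
# failed_images (hypothetical helper): read `kubectl describe pod` output on
# stdin and print each distinct image reference that failed to pull.
failed_images() {
  grep -o 'Failed to pull image "[^"]*"' | cut -d'"' -f2 | sort -u
}

# Example: kubectl describe pod my-pause | failed_images
```

Each reference it prints can then be checked against the tag listing and pull commands from the earlier sections.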
   123  ## Example issues
   124  
   125  - https://github.com/kubernetes/registry.k8s.io/issues/137#issuecomment-1376574499
   126  - https://github.com/kubernetes/registry.k8s.io/issues/174#issuecomment-1467646821
   127  - https://github.com/kubernetes-sigs/kind/issues/1895#issuecomment-1468991168
   129  - https://github.com/kubernetes/registry.k8s.io/issues/154#issuecomment-1435028502
   130  
   131  [http-403]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/403