# Installer Troubleshooting

Unfortunately, there will always be some cases where OpenShift fails to install properly. In these events, it is helpful to understand the likely failure modes as well as how to troubleshoot the failure.

If you have a Red Hat subscription for OpenShift, see [here][access-article] for support.

## Common Failures

### No Worker Nodes Created

The installer doesn't provision worker nodes directly, like it does with master nodes. Instead, the cluster relies on the Machine API Operator, which is able to scale up and down nodes on supported platforms. If more than fifteen to twenty minutes (depending on the speed of the cluster's Internet connection) have elapsed without any workers, the Machine API Operator needs to be investigated.

The status of the Machine API Operator can be checked by running the following command from the machine used to install the cluster:

```sh
oc --kubeconfig=${INSTALL_DIR}/auth/kubeconfig --namespace=openshift-machine-api get deployments
```

If the API is unavailable, that will need to be [investigated first](#kubernetes-api-is-unavailable).

The previous command should yield output similar to the following:

```
NAME                          READY   UP-TO-DATE   AVAILABLE   AGE
cluster-autoscaler-operator   1/1     1            1           86m
machine-api-controllers       1/1     1            1           85m
machine-api-operator          1/1     1            1           86m
```

Check the machine controller logs with the following command:

```sh
oc --kubeconfig=${INSTALL_DIR}/auth/kubeconfig --namespace=openshift-machine-api logs deployments/machine-api-controllers --container=machine-controller
```

### Kubernetes API is Unavailable

When the Kubernetes API is unavailable, the master nodes will need to be checked to ensure that they are running the correct components. This requires SSH access, so it is necessary to include an administrator's SSH key during the installation.

If SSH access to the master nodes isn't available, that will need to be [investigated next](#unable-to-ssh-into-master-nodes).

The first thing to check is that etcd is running on each of the masters. The etcd logs can be viewed by running the following on each master node:

```sh
sudo crictl logs $(sudo crictl ps --pod=$(sudo crictl pods --name=etcd-member --quiet) --quiet)
```

If the previous command fails, ensure that the etcd pods have been created by the Kubelet:

```sh
sudo crictl pods --name=etcd-member
```

If no pods are shown, etcd will need to be [investigated](#etcd-is-not-running).

### Unable to SSH into Master Nodes

For added security, SSH isn't available from the Internet by default. There are several options for enabling this functionality:

- Create a bastion host that is accessible from the Internet and has access to the cluster. If the bootstrap machine hasn't been automatically destroyed yet, it can double as a temporary bastion host since it is given a public IP address.
- Configure network peering or a VPN to allow remote access to the private network.

In order to SSH into the master nodes as user `core`, it is necessary to include an administrator's SSH key during the installation.
The selected key, if any, will be added to the `core` user's `~/.ssh/authorized_keys` via [Ignition](https://github.com/coreos/ignition) and is not configured via platform-specific approaches like [AWS key pairs][aws-key-pairs].
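If you go through the bootstrap machine, a jump-host invocation along the following lines can work. This is only a sketch: the key path and the `BOOTSTRAP_PUBLIC_IP`/`MASTER_PRIVATE_IP` placeholders are not set by the installer and must be replaced with values from your environment.

```sh
# Hop through the bootstrap host (public address) to reach a master node (private address).
ssh -i ~/.ssh/id_rsa -J core@${BOOTSTRAP_PUBLIC_IP} core@${MASTER_PRIVATE_IP}
```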
See [here][machine-config-daemon-ssh-keys] for information about managing SSH keys via the machine-config daemon.

If SSH isn't able to connect to the nodes, they may be waiting on the bootstrap node before they can boot. The initial set of master nodes fetch their boot configuration (the Ignition Config) from the bootstrap node and will not complete until they successfully do so. Check the console output of the nodes to determine if they have successfully booted or if they are waiting for Ignition to fetch the remote config.

Master nodes waiting for Ignition is indicative of problems on the bootstrap node. SSH into the bootstrap node to [investigate further](#troubleshooting-the-bootstrap-node).

### Troubleshooting the Bootstrap Node

If the bootstrap node isn't available, first double check that it hasn't been automatically removed by the installer. If it's not being created in the first place, the installer will need to be [troubleshot](#installer-fails-to-create-resources).

The most important thing to look at on the bootstrap node is `bootkube.service`. The logs can be viewed in two different ways:

1. If SSH is available, the following command can be run on the bootstrap node: `journalctl --unit=bootkube.service`
2. Regardless of whether or not SSH is available, the following command can be run: `curl --insecure --cert ${INSTALL_DIR}/tls/journal-gatewayd.crt --key ${INSTALL_DIR}/tls/journal-gatewayd.key "https://${BOOTSTRAP_IP}:19531/entries?follow&_SYSTEMD_UNIT=bootkube.service"`

The installer can also gather a log bundle from the bootstrap host using SSH, as described in the [troubleshooting bootstrap](./troubleshootingbootstrap.md) document.

### etcd Is Not Running

During the bootstrap process, the Kubelet may emit errors like the following:

```
Error signing CSR provided in request from agent: error parsing profile: invalid organization
```

This is safe to ignore and merely indicates that the etcd bootstrapping is still in progress. etcd makes use of the CSR APIs provided by Kubernetes to issue and rotate its TLS assets, but these facilities aren't available before etcd has formed quorum. In order to break this dependency loop, a CSR service is run on the bootstrap node which only signs CSRs for etcd. When the Kubelet attempts to go through its TLS bootstrap, it is initially denied because the service it is communicating with only respects CSRs from etcd. After etcd starts and the control plane begins bootstrapping, an approver is scheduled and the Kubelet CSR requests will succeed.

### Installer Fails to Create Resources

The easiest way to get more debugging information from the installer is to check the log file (`.openshift_install.log`) in the install directory. Regardless of the logging level specified, the installer will write its logs in case they need to be inspected retroactively.

### Installer Fails to Initialize the Cluster

The installer uses the [cluster-version-operator] to create all the components of an OpenShift cluster. When the installer fails to initialize the cluster, the most important information can be fetched by looking at the [ClusterVersion][clusterversion] and [ClusterOperator][clusteroperator] objects:

1. Inspecting the `ClusterVersion` object.
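Before dumping the full object as shown next, a quick summary view is often enough to spot a problem. This is a minimal check; its output is omitted here:

```sh
oc --kubeconfig=${INSTALL_DIR}/auth/kubeconfig get clusterversion
```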
```console
$ oc --kubeconfig=${INSTALL_DIR}/auth/kubeconfig get clusterversion -oyaml
apiVersion: config.openshift.io/v1
kind: ClusterVersion
metadata:
  creationTimestamp: 2019-02-27T22:24:21Z
  generation: 1
  name: version
  resourceVersion: "19927"
  selfLink: /apis/config.openshift.io/v1/clusterversions/version
  uid: 6e0f4cf8-3ade-11e9-9034-0a923b47ded4
spec:
  channel: stable-4.1
  clusterID: 5ec312f9-f729-429d-a454-61d4906896ca
status:
  availableUpdates: null
  conditions:
  - lastTransitionTime: 2019-02-27T22:50:30Z
    message: Done applying 4.1.1
    status: "True"
    type: Available
  - lastTransitionTime: 2019-02-27T22:50:30Z
    status: "False"
    type: Failing
  - lastTransitionTime: 2019-02-27T22:50:30Z
    message: Cluster version is 4.1.1
    status: "False"
    type: Progressing
  - lastTransitionTime: 2019-02-27T22:24:31Z
    message: 'Unable to retrieve available updates: unknown version 4.1.1'
    reason: RemoteFailed
    status: "False"
    type: RetrievedUpdates
  desired:
    image: registry.svc.ci.openshift.org/openshift/origin-release@sha256:91e6f754975963e7db1a9958075eb609ad226968623939d262d1cf45e9dbc39a
    version: 4.1.1
  history:
  - completionTime: 2019-02-27T22:50:30Z
    image: registry.svc.ci.openshift.org/openshift/origin-release@sha256:91e6f754975963e7db1a9958075eb609ad226968623939d262d1cf45e9dbc39a
    startedTime: 2019-02-27T22:24:31Z
    state: Completed
    version: 4.1.1
  observedGeneration: 1
  versionHash: Wa7as_ik1qE=
```

Some of the most important [conditions][cluster-operator-conditions] to take note of are `Failing`, `Available` and `Progressing`. You can look at the conditions using:

```console
$ oc --kubeconfig=${INSTALL_DIR}/auth/kubeconfig get clusterversion version -o=jsonpath='{range .status.conditions[*]}{.type}{" "}{.status}{" "}{.message}{"\n"}{end}'
Available True Done applying 4.1.1
Failing False
Progressing False Cluster version is 4.0.0-0.alpha-2019-02-26-194020
RetrievedUpdates False Unable to retrieve available updates: unknown version 4.1.1
```

2. Inspecting the `ClusterOperator` object.
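Cluster operators publish the same style of conditions. If you simply want to block until every operator reports `Available`, a one-liner along these lines can be used; this is a sketch and assumes your `oc` client supports `wait`:

```sh
oc --kubeconfig=${INSTALL_DIR}/auth/kubeconfig wait clusteroperator --all --for=condition=Available --timeout=30m
```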
You can get the status of all the cluster operators:

```console
$ oc --kubeconfig=${INSTALL_DIR}/auth/kubeconfig get clusteroperator
NAME                                   VERSION   AVAILABLE   PROGRESSING   FAILING   SINCE
cluster-autoscaler                               True        False         False     17m
cluster-storage-operator                         True        False         False     10m
console                                          True        False         False     7m21s
dns                                              True        False         False     31m
image-registry                                   True        False         False     9m58s
ingress                                          True        False         False     10m
kube-apiserver                                   True        False         False     28m
kube-controller-manager                          True        False         False     21m
kube-scheduler                                   True        False         False     25m
machine-api                                      True        False         False     17m
machine-config                                   True        False         False     17m
marketplace-operator                             True        False         False     10m
monitoring                                       True        False         False     8m23s
network                                          True        False         False     13m
node-tuning                                      True        False         False     11m
openshift-apiserver                              True        False         False     15m
openshift-authentication                         True        False         False     20m
openshift-cloud-credential-operator              True        False         False     18m
openshift-controller-manager                     True        False         False     10m
openshift-samples                                True        False         False     8m42s
operator-lifecycle-manager                       True        False         False     17m
service-ca                                       True        False         False     30m
```

To get detailed information on why an individual cluster operator is `Failing` or not yet `Available`, you can check the status of that individual operator, for example `monitoring`:

```console
$ oc --kubeconfig=${INSTALL_DIR}/auth/kubeconfig get clusteroperator monitoring -oyaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  creationTimestamp: 2019-02-27T22:47:04Z
  generation: 1
  name: monitoring
  resourceVersion: "24677"
  selfLink: /apis/config.openshift.io/v1/clusteroperators/monitoring
  uid: 9a6a5ef9-3ae1-11e9-bad4-0a97b6ba9358
spec: {}
status:
  conditions:
  - lastTransitionTime: 2019-02-27T22:49:10Z
    message: Successfully rolled out the stack.
    status: "True"
    type: Available
  - lastTransitionTime: 2019-02-27T22:49:10Z
    status: "False"
    type: Progressing
  - lastTransitionTime: 2019-02-27T22:49:10Z
    status: "False"
    type: Failing
  extension: null
  relatedObjects: null
  version: ""
```

Again, the cluster operators publish [conditions][cluster-operator-conditions] like `Failing`, `Available` and `Progressing` that describe the current state of the operator:

```console
$ oc --kubeconfig=${INSTALL_DIR}/auth/kubeconfig get clusteroperator monitoring -o=jsonpath='{range .status.conditions[*]}{.type}{" "}{.status}{" "}{.message}{"\n"}{end}'
Available True Successfully rolled out the stack
Progressing False
Failing False
```

Each cluster operator also publishes the list of objects that it owns.
To get that information:

```console
$ oc --kubeconfig=${INSTALL_DIR}/auth/kubeconfig get clusteroperator kube-apiserver -o=jsonpath='{.status.relatedObjects}'
[map[resource:kubeapiservers group:operator.openshift.io name:cluster] map[group: name:openshift-config resource:namespaces] map[group: name:openshift-config-managed resource:namespaces] map[group: name:openshift-kube-apiserver-operator resource:namespaces] map[group: name:openshift-kube-apiserver resource:namespaces]]
```

**NOTE:** Failing to initialize the cluster is usually not fatal to cluster creation: the `ClusterOperator` conditions shown above can be used to debug the failing operator and take corrective action, which often allows the `cluster-version-operator` to resume making progress.

### Installer Fails to Fetch Console URL

The installer fetches the URL for the OpenShift console from the `console` route in the `openshift-console` namespace. If the installer fails to fetch the URL for the console:

1. Check whether the `console` cluster operator is `Available` or `Failing`

```console
$ oc --kubeconfig=${INSTALL_DIR}/auth/kubeconfig get clusteroperator console -oyaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  creationTimestamp: 2019-02-27T22:46:57Z
  generation: 1
  name: console
  resourceVersion: "19682"
  selfLink: /apis/config.openshift.io/v1/clusteroperators/console
  uid: 960364aa-3ae1-11e9-bad4-0a97b6ba9358
spec: {}
status:
  conditions:
  - lastTransitionTime: 2019-02-27T22:46:58Z
    status: "False"
    type: Failing
  - lastTransitionTime: 2019-02-27T22:50:12Z
    status: "False"
    type: Progressing
  - lastTransitionTime: 2019-02-27T22:50:12Z
    status: "True"
    type: Available
  - lastTransitionTime: 2019-02-27T22:46:57Z
    status: "True"
    type: Upgradeable
  extension: null
  relatedObjects:
  - group: operator.openshift.io
    name: cluster
    resource: consoles
  - group: config.openshift.io
    name: cluster
    resource: consoles
  - group: oauth.openshift.io
    name: console
    resource: oauthclients
  - group: ""
    name: openshift-console-operator
    resource: namespaces
  - group: ""
    name: openshift-console
    resource: namespaces
  versions: null
```

2. Manually get the URL for `console`

```console
$ oc --kubeconfig=${INSTALL_DIR}/auth/kubeconfig get route console -n openshift-console -o=jsonpath='{.spec.host}'
console-openshift-console.apps.adahiya-1.devcluster.openshift.com
```

### Installer Fails to Add Default Ingress Certificate to Kubeconfig

The installer adds the default ingress certificate to the list of trusted certificate authorities in `${INSTALL_DIR}/auth/kubeconfig`.
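One quick way to see what the kubeconfig currently trusts is to decode its `certificate-authority-data` bundle and count the certificates in it. This is a sketch that assumes the usual single-line kubeconfig layout and a `base64` that accepts `-d`:

```sh
# Count the CA certificates embedded in the kubeconfig.
grep certificate-authority-data ${INSTALL_DIR}/auth/kubeconfig | awk '{print $2}' | base64 -d | grep -c 'BEGIN CERTIFICATE'
```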
If the installer fails to add the ingress certificate to `kubeconfig`, you can fetch the certificate from the cluster using the following command:

```console
$ oc --kubeconfig=${INSTALL_DIR}/auth/kubeconfig get configmaps default-ingress-cert -n openshift-config-managed -o=jsonpath='{.data.ca-bundle\.crt}'
-----BEGIN CERTIFICATE-----
MIIC/TCCAeWgAwIBAgIBATANBgkqhkiG9w0BAQsFADAuMSwwKgYDVQQDDCNjbHVz
dGVyLWluZ3Jlc3Mtb3BlcmF0b3JAMTU1MTMwNzU4OTAeFw0xOTAyMjcyMjQ2Mjha
Fw0yMTAyMjYyMjQ2MjlaMC4xLDAqBgNVBAMMI2NsdXN0ZXItaW5ncmVzcy1vcGVy
YXRvckAxNTUxMzA3NTg5MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA
uCA4fQ+2YXoXSUL4h/mcvJfrgpBfKBW5hfB8NcgXeCYiQPnCKblH1sEQnI3VC5Pk
2OfNCF3PUlfm4i8CHC95a7nCkRjmJNg1gVrWCvS/ohLgnO0BvszSiRLxIpuo3C4S
EVqqvxValHcbdAXWgZLQoYZXV7RMz8yZjl5CfhDaaItyBFj3GtIJkXgUwp/5sUfI
LDXW8MM6AXfuG+kweLdLCMm3g8WLLfLBLvVBKB+4IhIH7ll0buOz04RKhnYN+Ebw
tcvFi55vwuUCWMnGhWHGEQ8sWm/wLnNlOwsUz7S1/sW8nj87GFHzgkaVM9EOnoNI
gKhMBK9ItNzjrP6dgiKBCQIDAQABoyYwJDAOBgNVHQ8BAf8EBAMCAqQwEgYDVR0T
AQH/BAgwBgEB/wIBADANBgkqhkiG9w0BAQsFAAOCAQEAq+vi0sFKudaZ9aUQMMha
CeWx9CZvZBblnAWT/61UdpZKpFi4eJ2d33lGcfKwHOi2NP/iSKQBebfG0iNLVVPz
vwLbSG1i9R9GLdAbnHpPT9UG6fLaDIoKpnKiBfGENfxeiq5vTln2bAgivxrVlyiq
+MdDXFAWb6V4u2xh6RChI7akNsS3oU9PZ9YOs5e8vJp2YAEphht05X0swA+X8V8T
C278FFifpo0h3Q0Dbv8Rfn4UpBEtN4KkLeS+JeT+0o2XOsFZp7Uhr9yFIodRsnNo
H/Uwmab28ocNrGNiEVaVH6eTTQeeZuOdoQzUbClElpVmkrNGY0M42K0PvOQ/e7+y
AQ==
-----END CERTIFICATE-----
```

You can then **prepend** that certificate to the `certificate-authority-data` field in your `${INSTALL_DIR}/auth/kubeconfig`.

## Generic Troubleshooting

Here are some ideas if none of the [common failures](#common-failures) match your symptoms.
For other generic troubleshooting, see [the Kubernetes documentation][kubernetes-debug].

### Check for Pending or Crashing Pods

This is the generic version of the [*No Worker Nodes Created*](#no-worker-nodes-created) troubleshooting procedure.

```console
$ oc --kubeconfig=${INSTALL_DIR}/auth/kubeconfig get pods --all-namespaces
NAMESPACE                        NAME                                    READY   STATUS    RESTARTS   AGE
kube-system                      etcd-member-wking-master-0              1/1     Running   0          46s
openshift-machine-api            machine-api-operator-586bd5b6b9-bxq9s   0/1     Pending   0          1m
openshift-cluster-dns-operator   cluster-dns-operator-7f4f6866b9-kzth5   0/1     Pending   0          2m
...
```

You can investigate any pods listed as `Pending` with:

```sh
oc --kubeconfig=${INSTALL_DIR}/auth/kubeconfig describe -n openshift-machine-api pod/machine-api-operator-586bd5b6b9-bxq9s
```

which may show events with warnings like:

```
Warning  FailedScheduling  1m (x10 over 1m)  default-scheduler  0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
```
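If the warning points at taints, as in the example above, listing each node's taints can show which one the pod doesn't tolerate. A small sketch using the same jsonpath style as the rest of this guide:

```sh
oc --kubeconfig=${INSTALL_DIR}/auth/kubeconfig get nodes -o=jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints}{"\n"}{end}'
```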
You can get the image used for a crashing pod with:

```console
$ oc --kubeconfig=${INSTALL_DIR}/auth/kubeconfig get pod -o "jsonpath={range .status.containerStatuses[*]}{.name}{'\t'}{.state}{'\t'}{.image}{'\n'}{end}" -n openshift-machine-api machine-api-operator-586bd5b6b9-bxq9s
machine-api-operator	map[running:map[startedAt:2018-11-13T19:04:50Z]]	registry.svc.ci.openshift.org/openshift/origin-v4.0-20181113175638@sha256:c97d0b53b98d07053090f3c9563cfd8277587ce94f8c2400b33e246aa08332c7
```

And you can see where that image comes from with:

```console
$ oc adm release info registry.svc.ci.openshift.org/openshift/origin-release:v4.0-20181113175638 --commits
Name:      v4.0-20181113175638
Digest:    sha256:58196e73cc7bbc16346483d824fb694bf1a73d517fe13f6b5e589a7e0e1ccb5b
Created:   2018-11-13 09:56:46 -0800 PST
OS/Arch:   linux/amd64
Manifests: 121

Images:
  NAME                  REPO                                                COMMIT
  ...
  machine-api-operator  https://github.com/openshift/machine-api-operator  e681e121e15d2243739ad68978113a07aa35c6ae
  ...
```

### One or more nodes are never Ready (Network / CNI issues)

You might see that one or more nodes are never ready, e.g.:

```console
$ kubectl get nodes
NAME                        STATUS     ROLES    AGE   VERSION
ip-10-0-27-9.ec2.internal   NotReady   master   29m   v1.11.0+d4cacc0
...
```

This usually means that, for whatever reason, networking is not available on the node. You can confirm this by looking at the detailed output of the node:

```console
$ kubectl describe node ip-10-0-27-9.ec2.internal
... (lots of output skipped)
'runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni config uninitialized'
```

The first thing to determine is the status of the SDN. The SDN deploys three daemonsets:

- *sdn-controller*, a control-plane component
- *sdn*, the node-level networking daemon
- *ovs*, the Open vSwitch management daemon

All three must be healthy (though only a single `sdn-controller` needs to be running). `sdn` and `ovs` must be running on every node, and DESIRED should equal AVAILABLE. On a healthy two-node cluster you would see:

```console
$ kubectl -n openshift-sdn get daemonsets
NAME             DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                     AGE
ovs              2         2         2       2            2           beta.kubernetes.io/os=linux       2h
sdn              2         2         2       2            2           beta.kubernetes.io/os=linux       2h
sdn-controller   1         1         1       1            1           node-role.kubernetes.io/master=   2h
```

If, instead, you get a different error message:

```console
$ kubectl -n openshift-sdn get daemonsets
No resources found.
```

This means the network operator didn't run. Skip ahead [to that section](#debugging-the-cluster-network-operator). Otherwise, let's debug the SDN.

#### Debugging the openshift-sdn

On the NotReady node, you need to find out which pods, if any, are in a bad state. Be sure to substitute in the correct `spec.nodeName` (or just remove it).
```console
$ kubectl -n openshift-sdn get pod --field-selector "spec.nodeName=ip-10-0-27-9.ec2.internal"
NAME        READY   STATUS             RESTARTS   AGE
ovs-dk8bh   1/1     Running            1          52m
sdn-8nl47   1/1     CrashLoopBackOff   3          52m
```

Then, retrieve the logs for the SDN (and the OVS pod, if that one has failed):

```sh
kubectl -n openshift-sdn logs sdn-8nl47
```

Some common error messages:

- `Cannot fetch default cluster network`: This means the `sdn-controller` has failed to run to completion. Retrieve its logs with `kubectl -n openshift-sdn logs -l app=sdn-controller`.
- `warning: Another process is currently listening on the CNI socket, waiting 15s`: Something has gone wrong, and multiple SDN processes are running. SSH to the node in question and capture the output of `ps -faux`. If you just need the cluster up, reboot the node.
- Error messages about ovs or Open vSwitch: Check that the `ovs-*` pod on the same node is healthy. Retrieve its logs with `kubectl -n openshift-sdn logs ovs-<name>`. Rebooting the node should fix it.
- Any indication that the control plane is unavailable: Check to make sure the apiserver is reachable from the node. You may be able to find useful information via `journalctl -f -u kubelet`.

If you think it's a misconfiguration, file a [network operator](https://github.com/openshift/cluster-network-operator) issue. RH employees can also try #forum-sdn.

#### Debugging the cluster-network-operator

The cluster network operator is responsible for deploying the networking components. It does this in response to a special object created by the installer.

From a deployment perspective, the network operator is often the "canary in the coal mine." It runs very early in the installation process, after the master nodes have come up but before the bootstrap control plane has been torn down. It can be indicative of more subtle installer issues, such as long delays in bringing up master nodes or apiserver communication issues. Nevertheless, it can have other bugs.

First, determine that the network configuration exists:

```console
$ kubectl get network.config.openshift.io cluster -oyaml
apiVersion: config.openshift.io/v1
kind: Network
metadata:
  name: cluster
spec:
  serviceNetwork:
  - 172.30.0.0/16
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  networkType: OVNKubernetes
```

If it doesn't exist, the installer didn't create it. You'll have to run `openshift-install create manifests` to determine why.

Next, check that the network-operator is running:

```sh
kubectl -n openshift-network-operator get pods
```

And retrieve the logs. Note that, on multi-master systems, the operator performs leader election, so all non-leader replicas will simply sleep:

```sh
kubectl -n openshift-network-operator logs -l "name=network-operator"
```

If appropriate, file a [network operator](https://github.com/openshift/cluster-network-operator) issue. RH employees can also try #forum-sdn.
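When filing an issue, it also helps to attach the operator's reported conditions; one way to capture them, reusing the jsonpath pattern shown earlier in this guide, is:

```sh
kubectl get clusteroperator network -o=jsonpath='{range .status.conditions[*]}{.type}{" "}{.status}{" "}{.message}{"\n"}{end}'
```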
[access-article]: https://access.redhat.com/articles/3780981#debugging-an-install-1
[aws-key-pairs]: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html
[kubernetes-debug]: https://kubernetes.io/docs/tasks/debug-application-cluster/
[machine-config-daemon-ssh-keys]: https://github.com/openshift/machine-config-operator/blob/master/docs/Update-SSHKeys.md
[cluster-version-operator]: https://github.com/openshift/cluster-version-operator/blob/master/README.md
[clusterversion]: https://github.com/openshift/cluster-version-operator/blob/master/docs/dev/clusterversion.md
[clusteroperator]: https://github.com/openshift/cluster-version-operator/blob/master/docs/dev/clusteroperator.md
[cluster-operator-conditions]: https://github.com/openshift/cluster-version-operator/blob/master/docs/dev/clusteroperator.md#conditions