.. only:: not (epub or latex or html)

    WARNING: You are looking at unreleased Cilium documentation.
    Please use the official rendered version released here:
    http://docs.cilium.io

.. _k8s_install_etcd_operator:

******************************
Installation with managed etcd
******************************

The standard :ref:`k8s_quick_install` guide will set up Cilium to use
Kubernetes CRDs to store and propagate state between agents. Use of CRDs can
impose scale limitations depending on the size of your environment. Use of etcd
optimizes the propagation of state between agents. This guide explains the
steps required to set up Cilium with a managed etcd, where etcd is managed by an
operator which maintains an etcd cluster as part of the Kubernetes cluster.

Identity allocation remains CRD-based, which means that etcd remains
an optional component to improve scalability. Failures in providing etcd will
not be critical to the availability of Cilium but will reduce the efficacy of
state propagation. This allows the managed etcd to recover while depending on
Cilium itself to provide connectivity and security.

Should you encounter any issues during the installation, please refer to the
:ref:`troubleshooting_k8s` section and / or seek help on the `Slack channel`.

.. include:: requirements_intro.rst

Deploy Cilium + cilium-etcd-operator
====================================

.. include:: k8s-install-download-release.rst

Generate the required YAML file and deploy it:

.. code:: bash

    helm template cilium \
      --namespace kube-system \
      --set global.etcd.enabled=true \
      --set global.etcd.managed=true \
      > cilium.yaml
    kubectl create -f cilium.yaml


Validate the Installation
=========================

You can monitor the pods as Cilium and all required components are being
installed:

.. parsed-literal::

    kubectl -n kube-system get pods --watch
    NAME                                    READY   STATUS              RESTARTS   AGE
    cilium-etcd-operator-6ffbd46df9-pn6cf   1/1     Running             0          7s
    cilium-operator-cb4578bc5-q52qk         0/1     Pending             0          8s
    cilium-s8w5m                            0/1     PodInitializing     0          7s
    coredns-86c58d9df4-4g7dd                0/1     ContainerCreating   0          8m57s
    coredns-86c58d9df4-4l6b2                0/1     ContainerCreating   0          8m57s

It may take a couple of minutes for the etcd-operator to bring up the necessary
number of etcd pods to achieve quorum. Once it reaches quorum, all components
should be healthy and ready:

.. parsed-literal::

    cilium-etcd-8d95ggpjmw                  1/1     Running   0          78s
    cilium-etcd-operator-6ffbd46df9-pn6cf   1/1     Running   0          4m12s
    cilium-etcd-t695lgxf4x                  1/1     Running   0          118s
    cilium-etcd-zw285m6t9g                  1/1     Running   0          2m41s
    cilium-operator-cb4578bc5-q52qk         1/1     Running   0          4m13s
    cilium-s8w5m                            1/1     Running   0          4m12s
    coredns-86c58d9df4-4g7dd                1/1     Running   0          13m
    coredns-86c58d9df4-4l6b2                1/1     Running   0          13m
    etcd-operator-5cf67779fd-hd9j7          1/1     Running   0          2m42s


Troubleshooting
===============

* Make sure that ``kube-dns`` or ``coredns`` is running and healthy in the
  ``kube-system`` namespace. A functioning Kubernetes DNS is strictly required
  in order for Cilium to resolve the ClusterIP of the etcd cluster. If either
  ``kube-dns`` or ``coredns`` was already running before Cilium was deployed,
  the pods may be managed by a former CNI plugin. ``cilium-operator`` will
  automatically restart the pods to ensure that they are being managed by the
  Cilium CNI plugin. You can manually restart the pods as well if required and
  validate that Cilium is managing ``kube-dns`` or ``coredns`` by running:

  .. code:: bash

      kubectl -n kube-system get cep

  You should see ``kube-dns-xxx`` or ``coredns-xxx`` pods.

* In order for the entire system to come up, the following components have to
  be running at the same time:

  * ``kube-dns`` or ``coredns``
  * ``cilium-xxx``
  * ``cilium-etcd-operator``
  * ``etcd-operator``
  * ``etcd-xxx``

  All timeouts are configured so that this will typically work out smoothly
  even if some of the pods restart once or twice. In case any of the above pods
  get into a long ``CrashLoopBackoff``, bootstrapping can be expedited by
  restarting the pods to reset the ``CrashLoopBackoff`` time.

CoreDNS: Enable reverse lookups
-------------------------------

In order for the TLS certificates between etcd peers to work correctly, a DNS
reverse lookup on a pod IP must map back to the pod name. If you are using
CoreDNS, check the CoreDNS ConfigMap and validate that ``in-addr.arpa`` and
``ip6.arpa`` are listed as wildcards for the kubernetes block like this:

::

    kubectl -n kube-system edit cm coredns
    [...]
    apiVersion: v1
    data:
      Corefile: |
        .:53 {
            errors
            health
            kubernetes cluster.local in-addr.arpa ip6.arpa {
              pods insecure
              upstream
              fallthrough in-addr.arpa ip6.arpa
            }
            prometheus :9153
            proxy . /etc/resolv.conf
            cache 30
        }

The contents can look different from the above. The specific configuration that
matters is to make sure that ``in-addr.arpa`` and ``ip6.arpa`` are listed as
wildcards next to ``cluster.local``.

You can validate this by looking up a pod IP with the ``host`` utility from any
pod:

::

    host 10.60.20.86
    86.20.60.10.in-addr.arpa domain name pointer cilium-etcd-972nprv9dp.cilium-etcd.kube-system.svc.cluster.local.

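
The two checks in this section can also be scripted. The following is a minimal
sketch, not part of Cilium: the function names ``corefile_has_reverse_zones``
and ``ipv4_ptr_name`` are hypothetical helpers introduced here for illustration.
The first reads a Corefile on standard input and succeeds only if the
``kubernetes`` plugin line lists both ``in-addr.arpa`` and ``ip6.arpa``. The
second builds the reverse-lookup name for an IPv4 address so that it can be
compared with the output of ``host``.

.. code:: bash

    #!/bin/sh
    # Hypothetical helper: exit status 0 when the "kubernetes" plugin line of
    # the Corefile read from stdin lists both reverse-lookup zones.
    corefile_has_reverse_zones() {
        # Take the first line whose first word is "kubernetes".
        line=$(grep -E '^[[:space:]]*kubernetes[[:space:]]' | head -n 1)
        [ -n "$line" ] || return 1
        case "$line" in
            *" in-addr.arpa"*) ;;
            *) return 1 ;;
        esac
        case "$line" in
            *" ip6.arpa"*) return 0 ;;
            *) return 1 ;;
        esac
    }

    # Hypothetical helper: print the PTR query name for an IPv4 address by
    # reversing its four octets and appending the in-addr.arpa suffix.
    ipv4_ptr_name() {
        printf '%s\n' "$1" |
            awk -F. '{ printf "%s.%s.%s.%s.in-addr.arpa\n", $4, $3, $2, $1 }'
    }

    # Example use against a live cluster (assumes the ConfigMap is named
    # "coredns" as in the excerpt above):
    #   kubectl -n kube-system get cm coredns -o jsonpath='{.data.Corefile}' \
    #     | corefile_has_reverse_zones && echo "reverse zones configured"

The Corefile check only inspects the ``kubernetes`` line itself, so a
``fallthrough in-addr.arpa ip6.arpa`` entry alone does not count as configured.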

.. _k8s_what_is_the_cilium_etcd_operator:

What is the cilium-etcd-operator?
=================================

The cilium-etcd-operator uses and extends the etcd-operator to guarantee quorum,
auto-create certificates, and manage compaction:

* Automatic re-creation of the etcd cluster when the cluster loses quorum. The
  standard etcd-operator will refuse to bring up new etcd nodes and the etcd
  cluster becomes unusable.

* Automatic creation of certificates and keys. This simplifies the
  installation of the operator and makes the certificates and keys required to
  access the etcd cluster available to Cilium using a well known Kubernetes
  secret name.

* Compaction is automatically handled.

.. _k8s_etcd_operator_limitations:

Limitations
===========

Use of the cilium-etcd-operator offers a lot of advantages including simplicity
of installation, automatic management of the etcd cluster including compaction,
restart on quorum loss, and automatic use of TLS. There are several
disadvantages which can become relevant as you scale up your clusters:

* etcd nodes operated by the etcd-operator will not use persistent storage.
  Once the etcd cluster loses quorum, the etcd cluster is automatically
  re-created by the cilium-etcd-operator. Cilium will automatically recover and
  re-create all state in etcd. This operation can take a couple of seconds
  and may cause minor disruptions as ongoing distributed locks are invalidated
  and security identities have to be re-allocated.

* etcd is very sensitive to disk IO latency and requires fast disk access at a
  certain scale. The cilium-etcd-operator will not take any measures to provide
  fast disk access and performance will depend on whatever is provided to the
  pods in your Kubernetes cluster. See `etcd Hardware recommendations
  <https://coreos.com/etcd/docs/latest/op-guide/hardware.html>`_ for more details.
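
Because the etcd cluster can be re-created after quorum loss, it can be useful
to repeat the readiness check from the validation and troubleshooting steps
earlier in this guide. The following is a minimal sketch under stated
assumptions: ``not_ready_pods`` is a hypothetical helper, not a Cilium or
kubectl command, and it assumes the default column layout of
``kubectl get pods`` (NAME, READY, STATUS, ...) on standard input. It prints the
name of every pod that is not ``Running`` or whose ready count is below its
container count.

.. code:: bash

    #!/bin/sh
    # Hypothetical helper: list pods from "kubectl get pods" output (stdin)
    # that are not Running or not fully ready. The header line is skipped.
    not_ready_pods() {
        awk 'NR > 1 {
            split($2, ready, "/")          # READY column, e.g. "0/1"
            if ($3 != "Running" || ready[1] != ready[2]) print $1
        }'
    }

    # Example use against a live cluster:
    #   kubectl -n kube-system get pods | not_ready_pods

An empty result means that every listed pod reports ``Running`` with all of its
containers ready.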