page_title: Network configuration
page_description: Docker networking
page_keywords: network, networking, bridge, docker, documentation

# Network configuration

## TL;DR

When Docker starts, it creates a virtual interface named `docker0` on
the host machine. It randomly chooses an address and subnet from the
private range defined by [RFC 1918](http://tools.ietf.org/html/rfc1918)
that are not in use on the host machine, and assigns it to `docker0`.
Docker made the choice `172.17.42.1/16` when I started it a few minutes
ago, for example — a 16-bit netmask providing 65,534 addresses for the
host machine and its containers. The MAC address is generated using the
IP address allocated to the container to avoid ARP collisions, using a
range from `02:42:ac:11:00:00` to `02:42:ac:11:ff:ff`.

> **Note:**
> This document discusses advanced networking configuration
> and options for Docker. In most cases you won't need this information.
> If you're looking to get started with a simpler explanation of Docker
> networking and an introduction to the concept of container linking see
> the [Docker User Guide](/userguide/dockerlinks/).

But `docker0` is no ordinary interface. It is a virtual *Ethernet
bridge* that automatically forwards packets between any other network
interfaces that are attached to it. This lets containers communicate
both with the host machine and with each other. Every time Docker
creates a container, it creates a pair of "peer" interfaces that are
like opposite ends of a pipe — a packet sent on one will be received on
the other. It gives one of the peers to the container to become its
`eth0` interface and keeps the other peer, with a unique name like
`vethAQI2QT`, out in the namespace of the host machine.
By binding
every `veth*` interface to the `docker0` bridge, Docker creates a
virtual subnet shared between the host machine and every Docker
container.

The remaining sections of this document explain all of the ways that you
can use Docker options and — in advanced cases — raw Linux networking
commands to tweak, supplement, or entirely replace Docker's default
networking configuration.

## Quick guide to the options

Here is a quick list of the networking-related Docker command-line
options, in case it helps you find the section below that you are
looking for.

Some networking command-line options can only be supplied to the Docker
server when it starts up, and cannot be changed once it is running:

* `-b BRIDGE` or `--bridge=BRIDGE` — see
  [Building your own bridge](#bridge-building)

* `--bip=CIDR` — see
  [Customizing docker0](#docker0)

* `--default-gateway=IP_ADDRESS` — see
  [How Docker networks a container](#container-networking)

* `--default-gateway-v6=IP_ADDRESS` — see
  [IPv6](#ipv6)

* `--fixed-cidr` — see
  [Customizing docker0](#docker0)

* `--fixed-cidr-v6` — see
  [IPv6](#ipv6)

* `-H SOCKET...` or `--host=SOCKET...` —
  This might sound like it would affect container networking,
  but it actually faces in the other direction:
  it tells the Docker server over what channels
  it should be willing to receive commands
  like "run container" and "stop container."

* `--icc=true|false` — see
  [Communication between containers](#between-containers)

* `--ip=IP_ADDRESS` — see
  [Binding container ports](#binding-ports)

* `--ipv6=true|false` — see
  [IPv6](#ipv6)

* `--ip-forward=true|false` — see
  [Communication between containers and the wider world](#the-world)

* `--iptables=true|false` — see
  [Communication between containers](#between-containers)

* `--mtu=BYTES` — see
  [Customizing docker0](#docker0)
* `--userland-proxy=true|false` — see
  [Binding container ports](#binding-ports)

There are two networking options that can be supplied either at startup
or when `docker run` is invoked. When provided at startup, they set the
default values that `docker run` will later use if the options are not
specified:

* `--dns=IP_ADDRESS...` — see
  [Configuring DNS](#dns)

* `--dns-search=DOMAIN...` — see
  [Configuring DNS](#dns)

Finally, several networking options can only be provided when calling
`docker run` because they specify something specific to one container:

* `-h HOSTNAME` or `--hostname=HOSTNAME` — see
  [Configuring DNS](#dns) and
  [How Docker networks a container](#container-networking)

* `--link=CONTAINER_NAME_or_ID:ALIAS` — see
  [Configuring DNS](#dns) and
  [Communication between containers](#between-containers)

* `--net=bridge|none|container:NAME_or_ID|host` — see
  [How Docker networks a container](#container-networking)

* `--mac-address=MACADDRESS...` — see
  [How Docker networks a container](#container-networking)

* `-p SPEC` or `--publish=SPEC` — see
  [Binding container ports](#binding-ports)

* `-P` or `--publish-all=true|false` — see
  [Binding container ports](#binding-ports)

To supply networking options to the Docker server at startup, use the
`DOCKER_OPTS` variable in the Docker upstart configuration file. On
Ubuntu, edit the variable in `/etc/default/docker`; on CentOS, edit it
in `/etc/sysconfig/docker`.

The following example illustrates how to configure Docker on Ubuntu to
recognize a newly built bridge.

Edit the `/etc/default/docker` file:

    $ echo 'DOCKER_OPTS="-b=bridge0"' >> /etc/default/docker

Then restart the Docker server:

    $ sudo service docker restart

For additional information on bridges, see [building your own
bridge](#building-your-own-bridge) later on this page.
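The two dual-purpose DNS options mentioned above can be supplied at either level; here is a brief sketch of how a daemon-level default interacts with a per-container override. The resolver addresses are placeholders, and these commands assume a privileged Docker host:

```shell
# Daemon-wide default: containers inherit this resolver unless
# they override it at run time (example address only).
docker -d --dns 8.8.8.8

# Per-container override: this one container uses its own
# resolver instead of the daemon default.
docker run --dns 192.168.1.1 ubuntu cat /etc/resolv.conf
```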
The following sections tackle all of the above topics in an order that
moves roughly from simplest to most complex.

## Configuring DNS

<a name="dns"></a>

How can Docker supply each container with a hostname and DNS
configuration, without having to build a custom image with the hostname
written inside? Its trick is to overlay three crucial `/etc` files
inside the container with virtual files where it can write fresh
information. You can see this by running `mount` inside a container:

    $$ mount
    ...
    /dev/disk/by-uuid/1fec...ebdf on /etc/hostname type ext4 ...
    /dev/disk/by-uuid/1fec...ebdf on /etc/hosts type ext4 ...
    /dev/disk/by-uuid/1fec...ebdf on /etc/resolv.conf type ext4 ...
    ...

This arrangement allows Docker to do clever things like keep
`resolv.conf` up to date across all containers when the host machine
receives new configuration over DHCP later. The exact details of how
Docker maintains these files inside the container can change from one
Docker version to the next, so you should leave the files themselves
alone and use the following Docker options instead.

Four different options affect container domain name services.

* `-h HOSTNAME` or `--hostname=HOSTNAME` — sets the hostname by which
  the container knows itself. This is written into `/etc/hostname`,
  into `/etc/hosts` as the name of the container's host-facing IP
  address, and is the name that `/bin/bash` inside the container will
  display in its prompt. But the hostname is not easy to see from
  outside the container. It will not appear in `docker ps` nor in the
  `/etc/hosts` file of any other container.

* `--link=CONTAINER_NAME_or_ID:ALIAS` — using this option as you `run` a
  container gives the new container's `/etc/hosts` an extra entry
  named `ALIAS` that points to the IP address of the container identified by
  `CONTAINER_NAME_or_ID`.
  This lets processes inside the new container
  connect to the hostname `ALIAS` without having to know its IP. The
  `--link=` option is discussed in more detail below, in the section
  [Communication between containers](#between-containers). Because
  Docker may assign a different IP address to the linked containers
  on restart, Docker updates the `ALIAS` entry in the `/etc/hosts` file
  of the recipient containers.

* `--dns=IP_ADDRESS...` — sets the IP addresses added as `nameserver`
  lines to the container's `/etc/resolv.conf` file. Processes in the
  container, when confronted with a hostname not in `/etc/hosts`, will
  connect to these IP addresses on port 53 looking for name resolution
  services.

* `--dns-search=DOMAIN...` — sets the domain names that are searched
  when a bare unqualified hostname is used inside of the container, by
  writing `search` lines into the container's `/etc/resolv.conf`.
  When a container process attempts to access `host` and the search
  domain `example.com` is set, for instance, the DNS logic will not
  only look up `host` but also `host.example.com`.
  Use `--dns-search=.` if you don't wish to set the search domain.

In the absence of either the `--dns=IP_ADDRESS...` or the
`--dns-search=DOMAIN...` option, Docker makes each container's
`/etc/resolv.conf` look like the `/etc/resolv.conf` of the host machine (where
the `docker` daemon runs). When creating the container's `/etc/resolv.conf`,
the daemon filters out all localhost IP address `nameserver` entries from
the host's original file.

Filtering is necessary because all localhost addresses on the host are
unreachable from the container's network.
After this filtering, if there
are no more `nameserver` entries left in the container's `/etc/resolv.conf`
file, the daemon adds the public Google DNS nameservers
(8.8.8.8 and 8.8.4.4) to the container's DNS configuration. If IPv6 is
enabled on the daemon, the public IPv6 Google DNS nameservers will also
be added (2001:4860:4860::8888 and 2001:4860:4860::8844).

> **Note**:
> If you need access to a host's localhost resolver, you must modify your
> DNS service on the host to listen on a non-localhost address that is
> reachable from within the container.

You might wonder what happens when the host machine's
`/etc/resolv.conf` file changes. The `docker` daemon has a file change
notifier active which will watch for changes to the host DNS configuration.

> **Note**:
> The file change notifier relies on the Linux kernel's inotify feature.
> Because this feature is currently incompatible with the overlay filesystem
> driver, a Docker daemon using "overlay" will not be able to take advantage
> of the `/etc/resolv.conf` auto-update feature.

When the host file changes, all stopped containers whose `resolv.conf`
matches the host's will be updated immediately to the newest host
configuration. Containers which are running when the host configuration
changes will need to be stopped and started to pick up the host changes,
because there is no facility to ensure atomic writes of the `resolv.conf`
file while the container is running. If the container's `resolv.conf` has
been edited since it was started with the default configuration, no
replacement will be attempted, as it would overwrite the changes performed
by the container. If the options (`--dns` or `--dns-search`) have been used
to modify the default host configuration, then the replacement with an
updated host's `/etc/resolv.conf` will also not happen.
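A quick way to see the `/etc/resolv.conf` that Docker builds from these options is to run a throwaway container; a sketch, in which the nameserver address and search domain are examples and a working Docker host is assumed:

```shell
# Print the resolv.conf Docker generates when both DNS options
# are supplied; expect a `search` line and a `nameserver` line.
docker run --rm --dns=8.8.8.8 --dns-search=example.com ubuntu \
    cat /etc/resolv.conf
```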
> **Note**:
> For containers which were created prior to the implementation of
> the `/etc/resolv.conf` update feature in Docker 1.5.0: those
> containers will **not** receive updates when the host `resolv.conf`
> file changes. Only containers created with Docker 1.5.0 and above
> will utilize this auto-update feature.

## Communication between containers and the wider world

<a name="the-world"></a>

Whether a container can talk to the world is governed by two factors.

1.  Is the host machine willing to forward IP packets? This is governed
    by the `ip_forward` system parameter. Packets can only pass between
    containers if this parameter is `1`. Usually you will simply leave
    the Docker server at its default setting `--ip-forward=true` and
    Docker will set `ip_forward` to `1` for you when the server
    starts up. To check the setting or turn it on manually:

        $ sysctl net.ipv4.conf.all.forwarding
        net.ipv4.conf.all.forwarding = 0
        $ sysctl net.ipv4.conf.all.forwarding=1
        $ sysctl net.ipv4.conf.all.forwarding
        net.ipv4.conf.all.forwarding = 1

    Many users of Docker will want `ip_forward` to be on, to at
    least make communication *possible* between containers and
    the wider world.

    It may also be needed for inter-container communication if you are
    running a multiple-bridge setup.

2.  Do your `iptables` rules allow this particular connection? Docker will
    never make changes to your system `iptables` rules if you set
    `--iptables=false` when the daemon starts. Otherwise the Docker
    server will append forwarding rules to the `DOCKER` filter chain.

Docker will not delete or modify any pre-existing rules from the `DOCKER`
filter chain. This allows the user to create in advance any rules required
to further restrict access to the containers.

Docker's forward rules permit all external source IPs by default.
To allow
only a specific IP or network to access the containers, insert a negated
rule at the top of the `DOCKER` filter chain. For example, to restrict
external access such that *only* source IP 8.8.8.8 can access the
containers, the following rule could be added:

    $ iptables -I DOCKER -i ext_if ! -s 8.8.8.8 -j DROP

## Communication between containers

<a name="between-containers"></a>

Whether two containers can communicate is governed, at the operating
system level, by two factors.

1.  Does the network topology even connect the containers' network
    interfaces? By default Docker will attach all containers to a
    single `docker0` bridge, providing a path for packets to travel
    between them. See the later sections of this document for other
    possible topologies.

2.  Do your `iptables` rules allow this particular connection? Docker will
    never make changes to your system `iptables` rules if you set
    `--iptables=false` when the daemon starts. Otherwise the Docker server
    will add a default rule to the `FORWARD` chain with a blanket `ACCEPT`
    policy if you retain the default `--icc=true`, or else will set the
    policy to `DROP` if `--icc=false`.

It is a strategic question whether to leave `--icc=true` or change it to
`--icc=false` so that
`iptables` will protect other containers — and the main host — from
having arbitrary ports probed or accessed by a container that gets
compromised.

If you choose the most secure setting of `--icc=false`, then how can
containers communicate in those cases where you *want* them to provide
each other services?

The answer is the `--link=CONTAINER_NAME_or_ID:ALIAS` option, which was
mentioned in the previous section because of its effect upon name
services.
If the Docker daemon is running with both `--icc=false` and
`--iptables=true` then, when it sees `docker run` invoked with the
`--link=` option, the Docker server will insert a pair of `iptables`
`ACCEPT` rules so that the new container can connect to the ports
exposed by the other container — the ports that it mentioned in the
`EXPOSE` lines of its `Dockerfile`. Docker has more documentation on
this subject — see the [linking Docker containers](/userguide/dockerlinks)
page for further details.

> **Note**:
> The value `CONTAINER_NAME` in `--link=` must either be an
> auto-assigned Docker name like `stupefied_pare` or else the name you
> assigned with `--name=` when you ran `docker run`. It cannot be a
> hostname, which Docker will not recognize in the context of the
> `--link=` option.

You can run the `iptables` command on your Docker host to see whether
the `FORWARD` chain has a default policy of `ACCEPT` or `DROP`:

    # When --icc=false, you should see a DROP rule:

    $ sudo iptables -L -n
    ...
    Chain FORWARD (policy ACCEPT)
    target     prot opt source               destination
    DOCKER     all  --  0.0.0.0/0            0.0.0.0/0
    DROP       all  --  0.0.0.0/0            0.0.0.0/0
    ...

    # When a --link= has been created under --icc=false,
    # you should see port-specific ACCEPT rules overriding
    # the subsequent DROP policy for all other packets:

    $ sudo iptables -L -n
    ...
    Chain FORWARD (policy ACCEPT)
    target     prot opt source               destination
    DOCKER     all  --  0.0.0.0/0            0.0.0.0/0
    DROP       all  --  0.0.0.0/0            0.0.0.0/0

    Chain DOCKER (1 references)
    target     prot opt source               destination
    ACCEPT     tcp  --  172.17.0.2           172.17.0.3           tcp spt:80
    ACCEPT     tcp  --  172.17.0.3           172.17.0.2           tcp dpt:80

> **Note**:
> Docker is careful that its host-wide `iptables` rules fully expose
> containers to each other's raw IP addresses, so connections from one
> container to another should always appear to be originating from the
> first container's own IP address.

## Binding container ports to the host

<a name="binding-ports"></a>

By default Docker containers can make connections to the outside world,
but the outside world cannot connect to containers. Each outgoing
connection will appear to originate from one of the host machine's own
IP addresses, thanks to an `iptables` masquerading rule on the host
machine that the Docker server creates when it starts:

    # You can see that the Docker server creates a
    # masquerade rule that lets containers connect
    # to IP addresses in the outside world:

    $ sudo iptables -t nat -L -n
    ...
    Chain POSTROUTING (policy ACCEPT)
    target     prot opt source               destination
    MASQUERADE all  --  172.17.0.0/16        0.0.0.0/0
    ...

But if you want containers to accept incoming connections, you will need
to provide special options when invoking `docker run`. These options
are covered in more detail in the [Docker User Guide](/userguide/dockerlinks)
page. There are two approaches.

First, you can supply `-P` or `--publish-all=true|false` to `docker run`,
which is a blanket operation that identifies every port with an `EXPOSE`
line in the image's `Dockerfile`, or exposed via the `--expose <port>`
commandline flag, and maps it to a host port somewhere within an
*ephemeral port range*.
The `docker port` command
then needs to be used to inspect the created mapping. The *ephemeral port
range* is configured by the `/proc/sys/net/ipv4/ip_local_port_range` kernel
parameter, typically ranging from 32768 to 61000.

Second, a mapping can be specified explicitly using the `-p SPEC` or
`--publish=SPEC` option. It allows you to particularize which port on the
Docker server — which can be any port at all, not just one within the
*ephemeral port range* — you want mapped to which port in the container.

Either way, you should be able to peek at what Docker has accomplished
in your network stack by examining your NAT tables:

    # What your NAT rules might look like when Docker
    # is finished setting up a -P forward:

    $ iptables -t nat -L -n
    ...
    Chain DOCKER (2 references)
    target     prot opt source               destination
    DNAT       tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:49153 to:172.17.0.2:80

    # What your NAT rules might look like when Docker
    # is finished setting up a -p 80:80 forward:

    Chain DOCKER (2 references)
    target     prot opt source               destination
    DNAT       tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:80 to:172.17.0.2:80

You can see that Docker has exposed these container ports on `0.0.0.0`,
the wildcard IP address that will match any possible incoming port on
the host machine. If you want to be more restrictive and only allow
container services to be contacted through a specific external interface
on the host machine, you have two choices. When you invoke `docker run`
you can use either `-p IP:host_port:container_port` or `-p IP::port` to
specify the external interface for one particular binding.

Or if you always want Docker port forwards to bind to one specific IP
address, you can edit your system-wide Docker server settings and add the
option `--ip=IP_ADDRESS`. Remember to restart your Docker server after
editing this setting.
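Here is a minimal sketch contrasting the two publishing approaches; the image and container names are placeholders, and a running Docker daemon is assumed:

```shell
# Blanket publishing: every EXPOSEd port is mapped to an
# ephemeral host port; `docker port` reveals which one.
docker run -d -P --name web1 nginx
docker port web1

# Explicit publishing: bind container port 80 to host port 8080
# on the loopback interface only.
docker run -d -p 127.0.0.1:8080:80 --name web2 nginx
```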
457 458 > **Note**: 459 > With hairpin NAT enabled (`--userland-proxy=false`), containers port exposure 460 > is achieved purely through iptables rules, and no attempt to bind the exposed 461 > port is ever made. This means that nothing prevents shadowing a previously 462 > listening service outside of Docker through exposing the same port for a 463 > container. In such conflicting situation, Docker created iptables rules will 464 > take precedence and route to the container. 465 466 The `--userland-proxy` parameter, true by default, provides a userland 467 implementation for inter-container and outside-to-container communication. When 468 disabled, Docker uses both an additional `MASQUERADE` iptable rule and the 469 `net.ipv4.route_localnet` kernel parameter which allow the host machine to 470 connect to a local container exposed port through the commonly used loopback 471 address: this alternative is preferred for performance reason. 472 473 Again, this topic is covered without all of these low-level networking 474 details in the [Docker User Guide](/userguide/dockerlinks/) document if you 475 would like to use that as your port redirection reference instead. 476 477 ## IPv6 478 479 <a name="ipv6"></a> 480 481 As we are [running out of IPv4 addresses](http://en.wikipedia.org/wiki/IPv4_address_exhaustion) 482 the IETF has standardized an IPv4 successor, [Internet Protocol Version 6](http://en.wikipedia.org/wiki/IPv6) 483 , in [RFC 2460](https://www.ietf.org/rfc/rfc2460.txt). Both protocols, IPv4 and 484 IPv6, reside on layer 3 of the [OSI model](http://en.wikipedia.org/wiki/OSI_model). 485 486 487 ### IPv6 with Docker 488 By default, the Docker server configures the container network for IPv4 only. 489 You can enable IPv4/IPv6 dualstack support by running the Docker daemon with the 490 `--ipv6` flag. Docker will set up the bridge `docker0` with the IPv6 491 [link-local address](http://en.wikipedia.org/wiki/Link-local_address) `fe80::1`. 
492 493 By default, containers that are created will only get a link-local IPv6 address. 494 To assign globally routable IPv6 addresses to your containers you have to 495 specify an IPv6 subnet to pick the addresses from. Set the IPv6 subnet via the 496 `--fixed-cidr-v6` parameter when starting Docker daemon: 497 498 docker -d --ipv6 --fixed-cidr-v6="2001:db8:1::/64" 499 500 The subnet for Docker containers should at least have a size of `/80`. This way 501 an IPv6 address can end with the container's MAC address and you prevent NDP 502 neighbor cache invalidation issues in the Docker layer. 503 504 With the `--fixed-cidr-v6` parameter set Docker will add a new route to the 505 routing table. Further IPv6 routing will be enabled (you may prevent this by 506 starting Docker daemon with `--ip-forward=false`): 507 508 $ ip -6 route add 2001:db8:1::/64 dev docker0 509 $ sysctl net.ipv6.conf.default.forwarding=1 510 $ sysctl net.ipv6.conf.all.forwarding=1 511 512 All traffic to the subnet `2001:db8:1::/64` will now be routed 513 via the `docker0` interface. 514 515 Be aware that IPv6 forwarding may interfere with your existing IPv6 516 configuration: If you are using Router Advertisements to get IPv6 settings for 517 your host's interfaces you should set `accept_ra` to `2`. Otherwise IPv6 518 enabled forwarding will result in rejecting Router Advertisements. E.g., if you 519 want to configure `eth0` via Router Advertisements you should set: 520 521 $ sysctl net.ipv6.conf.eth0.accept_ra=2 522 523 ![](/article-img/ipv6_basic_host_config.svg) 524 525 Every new container will get an IPv6 address from the defined subnet. 
Further,
a default route will be added on `eth0` in the container via the address
specified by the daemon option `--default-gateway-v6` if present, otherwise
via `fe80::1`:

    docker run -it ubuntu bash -c "ip -6 addr show dev eth0; ip -6 route show"

    15: eth0: <BROADCAST,UP,LOWER_UP> mtu 1500
        inet6 2001:db8:1:0:0:242:ac11:3/64 scope global
            valid_lft forever preferred_lft forever
        inet6 fe80::42:acff:fe11:3/64 scope link
            valid_lft forever preferred_lft forever

    2001:db8:1::/64 dev eth0  proto kernel  metric 256
    fe80::/64 dev eth0  proto kernel  metric 256
    default via fe80::1 dev eth0  metric 1024

In this example the Docker container is assigned a link-local address with
the network suffix `/64` (here: `fe80::42:acff:fe11:3/64`) and a globally
routable IPv6 address (here: `2001:db8:1:0:0:242:ac11:3/64`). The container
will create connections to addresses outside of the `2001:db8:1::/64`
network via the link-local gateway at `fe80::1` on `eth0`.

Often servers or virtual machines get a `/64` IPv6 subnet assigned (e.g.
`2001:db8:23:42::/64`). In this case you can split it up further and provide
Docker a `/80` subnet while using a separate `/80` subnet for other
applications on the host:

![](/article-img/ipv6_slash64_subnet_config.svg)

In this setup the subnet `2001:db8:23:42::/80` with a range from
`2001:db8:23:42:0:0:0:0` to `2001:db8:23:42:0:ffff:ffff:ffff` is attached
to `eth0`, with the host listening at `2001:db8:23:42::1`. The subnet
`2001:db8:23:42:1::/80` with an address range from `2001:db8:23:42:1:0:0:0`
to `2001:db8:23:42:1:ffff:ffff:ffff` is attached to `docker0` and will be
used by containers.

#### Using NDP proxying

If your Docker host is only part of an IPv6 subnet but does not have an
IPv6 subnet assigned of its own, you can use NDP proxying to connect your
containers via IPv6 to the internet.
For example, your host has the IPv6 address `2001:db8::c001`, is part of the
subnet `2001:db8::/64`, and your IaaS provider allows you to configure the
IPv6 addresses `2001:db8::c000` to `2001:db8::c00f`:

    $ ip -6 addr show
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536
        inet6 ::1/128 scope host
            valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
        inet6 2001:db8::c001/64 scope global
            valid_lft forever preferred_lft forever
        inet6 fe80::601:3fff:fea1:9c01/64 scope link
            valid_lft forever preferred_lft forever

Let's split up the configurable address range into two subnets,
`2001:db8::c000/125` and `2001:db8::c008/125`. The first one can be used by
the host itself, the latter by Docker:

    docker -d --ipv6 --fixed-cidr-v6 2001:db8::c008/125

Notice that the Docker subnet is within the subnet managed by your router
that is connected to `eth0`. This means all devices (containers) with
addresses from the Docker subnet are expected to be found within the router
subnet. Therefore the router thinks it can talk to these containers
directly.

![](/article-img/ipv6_ndp_proxying.svg)

As soon as the router wants to send an IPv6 packet to the first container,
it will transmit a neighbor solicitation request, asking who has
`2001:db8::c009`. But it will get no answer, because no one on this subnet
has this address: the container with this address is hidden behind the
Docker host. The Docker host therefore has to listen for neighbor
solicitation requests for the container address and respond that it is the
device responsible for the address. This is done by a kernel feature called
NDP proxying.
You can
enable it by executing:

    $ sysctl net.ipv6.conf.eth0.proxy_ndp=1

Now you can add the container's IPv6 address to the NDP proxy table:

    $ ip -6 neigh add proxy 2001:db8::c009 dev eth0

This command tells the kernel to answer incoming neighbor solicitation
requests regarding the IPv6 address `2001:db8::c009` on the device `eth0`.
As a consequence, all traffic to this IPv6 address will go to the Docker
host, which will forward it according to its routing table via the
`docker0` device to the container network:

    $ ip -6 route show
    2001:db8::c008/125 dev docker0  metric 1
    2001:db8::/64 dev eth0  proto kernel  metric 256

You have to execute the `ip -6 neigh add proxy ...` command for every IPv6
address in your Docker subnet. Unfortunately there is no way to add a whole
subnet with a single command.

### Docker IPv6 cluster

#### Switched network environment

Using routable IPv6 addresses allows you to realize communication between
containers on different hosts. Let's have a look at a simple Docker IPv6
cluster example:

![](/article-img/ipv6_switched_network_example.svg)

The Docker hosts are in the `2001:db8:0::/64` subnet. Host1 is configured
to provide addresses from the `2001:db8:1::/64` subnet to its containers. It
has three routes configured:

- Route all traffic to `2001:db8:0::/64` via `eth0`
- Route all traffic to `2001:db8:1::/64` via `docker0`
- Route all traffic to `2001:db8:2::/64` via Host2 with IP `2001:db8::2`

Host1 also acts as a router on OSI layer 3. When one of the network clients
tries to contact a target that is specified in Host1's routing table, Host1
will forward the traffic accordingly. It acts as a router for all the
networks it knows: `2001:db8::/64`, `2001:db8:1::/64` and `2001:db8:2::/64`.

On Host2 we have nearly the same configuration.
Host2's containers will get
IPv6 addresses from `2001:db8:2::/64`. Host2 has three routes configured:

- Route all traffic to `2001:db8:0::/64` via `eth0`
- Route all traffic to `2001:db8:2::/64` via `docker0`
- Route all traffic to `2001:db8:1::/64` via Host1 with IP `2001:db8::1`

The difference from Host1 is that the network `2001:db8:2::/64` is directly
attached to Host2 via its `docker0` interface, whereas Host2 reaches
`2001:db8:1::/64` via Host1's IPv6 address `2001:db8::1`.

This way every container is able to contact every other container. The
containers `Container1-*` share the same subnet and contact each other
directly. The traffic between `Container1-*` and `Container2-*` will be
routed via Host1 and Host2 because those containers do not share the same
subnet.

In a switched environment every host has to know all routes to every
subnet. You always have to update the hosts' routing tables once you add
or remove a host from the cluster.

Every configuration in the diagram that is shown below the dashed line is
handled by Docker: the `docker0` bridge IP address configuration, the route
to the Docker subnet on the host, the container IP addresses and the routes
on the containers. The configuration above the line is up to the user and
can be adapted to the individual environment.

#### Routed network environment

In a routed network environment you replace the layer 2 switch with a
layer 3 router. Now the hosts just have to know their default gateway (the
router) and the route to their own containers (managed by Docker). The
router holds all routing information about the Docker subnets. When you add
or remove a host in this environment, you just have to update the routing
table in the router — not on every host.
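The manually maintained route in the switched setup can be expressed with a plain `ip -6 route` command; a sketch for Host1, reusing the example addresses and assuming root on Host1:

```shell
# Host1: Docker manages the docker0 route itself; the route to
# Host2's container subnet must be added by hand.
ip -6 route add 2001:db8:2::/64 via 2001:db8::2 dev eth0
```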
![](/article-img/ipv6_routed_network_example.svg)

In this scenario containers on the same host can communicate directly with
each other. The traffic between containers on different hosts will be routed
via their hosts and the router. For example, a packet from `Container1-1` to
`Container2-1` will be routed through `Host1`, `Router`, and `Host2` until it
arrives at `Container2-1`.

To keep the IPv6 addresses short in this example, a `/48` network is assigned
to every host. Each host uses one `/64` subnet of this for its own services
and one for Docker. When adding a third host, you would add a route for the
subnet `2001:db8:3::/48` to the router and configure Docker on Host3 with
`--fixed-cidr-v6=2001:db8:3:1::/64`.

Remember that the subnet for Docker containers should have a size of at least
`/80`. This way an IPv6 address can end with the container's MAC address and
you prevent NDP neighbor cache invalidation issues in the Docker layer. So if
you have a `/64` for your whole environment, use `/78` subnets for the hosts
and `/80` for the containers. This way you can use 16,384 hosts with 4 `/80`
subnets each.

Every configuration in the diagram below the dashed line is handled by Docker:
the `docker0` bridge IP address configuration, the route to the Docker subnet
on the host, the container IP addresses and the routes on the containers. The
configuration above the line is up to the user and can be adapted to the
individual environment.

## Customizing docker0

<a name="docker0"></a>

By default, the Docker server creates and configures the host system's
`docker0` interface as an *Ethernet bridge* inside the Linux kernel that can
pass packets back and forth between other physical or virtual network
interfaces so that they behave as a single Ethernet network.
Docker configures `docker0` with an IP address, netmask, and IP allocation
range so that the host machine can both receive and send packets to containers
connected to the bridge, and gives it an MTU — the *maximum transmission
unit*, or largest packet length that the interface will allow — of either
1,500 bytes or else a more specific value copied from the Docker host's
interface that supports its default route. These options are configurable at
server startup:

* `--bip=CIDR` — supply a specific IP address and netmask for the
  `docker0` bridge, using standard CIDR notation like `192.168.1.5/24`.

* `--fixed-cidr=CIDR` — restrict the IP range from the `docker0` subnet,
  using standard CIDR notation like `172.17.1.0/28`. This range must be an
  IPv4 range for fixed IPs (e.g. `10.20.0.0/16`) and must be a subset of the
  bridge IP range (`docker0` or set using `--bridge`). For example, with
  `--fixed-cidr=192.168.1.0/25`, IPs for your containers will be chosen from
  the first half of the `192.168.1.0/24` subnet.

* `--mtu=BYTES` — override the maximum packet length on `docker0`.

Once you have one or more containers up and running, you can confirm that
Docker has properly connected them to the `docker0` bridge by running the
`brctl` command on the host machine and looking at the `interfaces` column of
the output. Here is a host with two different containers connected:

    # Display bridge info

    $ sudo brctl show
    bridge name     bridge id           STP enabled     interfaces
    docker0         8000.3a1d7362b4ee   no              veth65f9
                                                        vethdda6

If the `brctl` command is not installed on your Docker host, then on Ubuntu
you should be able to run `sudo apt-get install bridge-utils` to install it.

Finally, the `docker0` Ethernet bridge settings are used every time you create
a new container.
Docker selects a free IP address from the range available on the bridge each
time you `docker run` a new container, and configures the container's `eth0`
interface with that IP address and the bridge's netmask. The Docker host's own
IP address on the bridge is used as the default gateway by which each
container reaches the rest of the Internet.

    # The network, as seen from a container

    $ docker run -i -t --rm base /bin/bash

    $$ ip addr show eth0
    24: eth0: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
        link/ether 32:6f:e0:35:57:91 brd ff:ff:ff:ff:ff:ff
        inet 172.17.0.3/16 scope global eth0
           valid_lft forever preferred_lft forever
        inet6 fe80::306f:e0ff:fe35:5791/64 scope link
           valid_lft forever preferred_lft forever

    $$ ip route
    default via 172.17.42.1 dev eth0
    172.17.0.0/16 dev eth0  proto kernel  scope link  src 172.17.0.3

    $$ exit

Remember that the Docker host will not be willing to forward container packets
out on to the Internet unless its `ip_forward` system setting is `1` — see the
section above on [Communication between containers](#between-containers) for
details.

## Building your own bridge

<a name="bridge-building"></a>

If you want to take Docker out of the business of creating its own Ethernet
bridge entirely, you can set up your own bridge before starting Docker and use
`-b BRIDGE` or `--bridge=BRIDGE` to tell Docker to use your bridge instead.
If you already have Docker up and running with its old `docker0` still
configured, you will probably want to begin by stopping the service and
removing the interface:

    # Stopping Docker and removing docker0

    $ sudo service docker stop
    $ sudo ip link set dev docker0 down
    $ sudo brctl delbr docker0
    $ sudo iptables -t nat -F POSTROUTING

Then, before starting the Docker service, create your own bridge and give it
whatever configuration you want. Here we will create a simple enough bridge
that we really could just have used the options in the previous section to
customize `docker0`, but it will be enough to illustrate the technique.

    # Create our own bridge

    $ sudo brctl addbr bridge0
    $ sudo ip addr add 192.168.5.1/24 dev bridge0
    $ sudo ip link set dev bridge0 up

    # Confirming that our bridge is up and running

    $ ip addr show bridge0
    4: bridge0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state UP group default
        link/ether 66:38:d0:0d:76:18 brd ff:ff:ff:ff:ff:ff
        inet 192.168.5.1/24 scope global bridge0
           valid_lft forever preferred_lft forever

    # Tell Docker about it and restart (on Ubuntu)

    $ echo 'DOCKER_OPTS="-b=bridge0"' >> /etc/default/docker
    $ sudo service docker start

    # Confirming new outgoing NAT masquerade is set up

    $ sudo iptables -t nat -L -n
    ...
    Chain POSTROUTING (policy ACCEPT)
    target     prot opt source               destination
    MASQUERADE  all  --  192.168.5.0/24      0.0.0.0/0

The result should be that the Docker server starts successfully and is now
prepared to bind containers to the new bridge. After pausing to verify the
bridge's configuration, try creating a container — you will see that its IP
address is in your new IP address range, which Docker will have auto-detected.
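As a quick sanity check, Python's standard `ipaddress` module can confirm that an address falls inside the new bridge's range. The container address below is hypothetical — substitute whatever `ip addr` reports inside your container:

```python
import ipaddress

# The bridge in the example above was given 192.168.5.1/24, so Docker
# should hand out container addresses from 192.168.5.0/24.
bridge_net = ipaddress.ip_interface("192.168.5.1/24").network
container_ip = ipaddress.ip_address("192.168.5.2")  # hypothetical container address

print(container_ip in bridge_net)    # membership check
print(bridge_net.num_addresses - 2)  # usable host addresses in a /24
```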
Just as we learned in the previous section, you can use the `brctl show`
command to see Docker add and remove interfaces from the bridge as you start
and stop containers, and can run `ip addr` and `ip route` inside a container
to see that it has been given an address in the bridge's IP address range and
has been told to use the Docker host's IP address on the bridge as its default
gateway to the rest of the Internet.

## How Docker networks a container

<a name="container-networking"></a>

While Docker is under active development and continues to tweak and improve
its network configuration logic, the shell commands in this section are rough
equivalents to the steps that Docker takes when configuring networking for
each new container.

Let's review a few basics.

To communicate using the Internet Protocol (IP), a machine needs access to at
least one network interface at which packets can be sent and received, and a
routing table that defines the range of IP addresses reachable through that
interface. Network interfaces do not have to be physical devices. In fact, the
`lo` loopback interface available on every Linux machine (and inside each
Docker container) is entirely virtual — the Linux kernel simply copies
loopback packets directly from the sender's memory into the receiver's memory.

Docker uses special virtual interfaces to let containers communicate with the
host machine — pairs of virtual interfaces called “peers” that are linked
inside of the host machine's kernel so that packets can travel between them.
They are simple to create, as we will see in a moment.

The steps with which Docker configures a container are:

1. Create a pair of peer virtual interfaces.

2. Give one of them a unique name like `veth65f9`, keep it inside of the
   main Docker host, and bind it to `docker0` or whatever bridge Docker is
   supposed to be using.

3. Toss the other interface over the wall into the new container (which will
   already have been provided with an `lo` interface) and rename it to the
   much prettier name `eth0` since, inside of the container's separate and
   unique network interface namespace, there are no physical interfaces with
   which this name could collide.

4. Set the interface's MAC address according to the `--mac-address` parameter
   or generate a random one.

5. Give the container's `eth0` a new IP address from within the bridge's
   range of network addresses. The default route is set to the IP address
   passed to the Docker daemon using the `--default-gateway` option if
   specified, otherwise to the IP address that the Docker host owns on the
   bridge. The MAC address is generated from the IP address unless otherwise
   specified. This prevents ARP cache invalidation problems when a new
   container comes up with an IP address used in the past by another
   container with a different MAC address.

With these steps complete, the container now possesses an `eth0` (virtual)
network card and will find itself able to communicate with other containers
and the rest of the Internet.

You can opt out of the above process for a particular container by giving the
`--net=` option to `docker run`, which takes four possible values.

* `--net=bridge` — The default action, which connects the container to the
  Docker bridge as described above.

* `--net=host` — Tells Docker to skip placing the container inside of a
  separate network stack. In essence, this choice tells Docker to
  **not containerize the container's networking**!
  While container processes will still be confined to their own filesystem,
  process list, and resource limits, a quick `ip addr` command will show you
  that, network-wise, they live “outside” in the main Docker host and have
  full access to its network interfaces. Note that this does **not** let the
  container reconfigure the host network stack — that would require
  `--privileged=true` — but it does let container processes open low-numbered
  ports like any other root process. It also allows the container to access
  local network services like D-Bus. This can lead to processes in the
  container being able to do unexpected things like
  [restart your computer](https://github.com/docker/docker/issues/6401).
  You should use this option with caution.

* `--net=container:NAME_or_ID` — Tells Docker to put this container's
  processes inside of the network stack that has already been created inside
  of another container. The new container's processes will be confined to
  their own filesystem, process list, and resource limits, but will share the
  same IP address and port numbers as the first container, and processes on
  the two containers will be able to connect to each other over the loopback
  interface.

* `--net=none` — Tells Docker to put the container inside of its own network
  stack but not to take any steps to configure its network, leaving you free
  to build any of the custom configurations explored in the last few sections
  of this document.
To get an idea of the steps that are necessary if you use `--net=none` as
described in that last bullet point, here are the commands that you would run
to reach roughly the same configuration as if you had let Docker do all of the
configuration:

    # At one shell, start a container and
    # leave its shell idle and running

    $ docker run -i -t --rm --net=none base /bin/bash
    root@63f36fc01b5f:/#

    # At another shell, learn the container process ID
    # and create its namespace entry in /var/run/netns/
    # for the "ip netns" command we will be using below

    $ docker inspect -f '{{.State.Pid}}' 63f36fc01b5f
    2778
    $ pid=2778
    $ sudo mkdir -p /var/run/netns
    $ sudo ln -s /proc/$pid/ns/net /var/run/netns/$pid

    # Check the bridge's IP address and netmask

    $ ip addr show docker0
    21: docker0: ...
    inet 172.17.42.1/16 scope global docker0
    ...

    # Create a pair of "peer" interfaces A and B,
    # bind the A end to the bridge, and bring it up

    $ sudo ip link add A type veth peer name B
    $ sudo brctl addif docker0 A
    $ sudo ip link set A up

    # Place B inside the container's network namespace,
    # rename it to eth0, and activate it with a free IP

    $ sudo ip link set B netns $pid
    $ sudo ip netns exec $pid ip link set dev B name eth0
    $ sudo ip netns exec $pid ip link set eth0 address 12:34:56:78:9a:bc
    $ sudo ip netns exec $pid ip link set eth0 up
    $ sudo ip netns exec $pid ip addr add 172.17.42.99/16 dev eth0
    $ sudo ip netns exec $pid ip route add default via 172.17.42.1

At this point your container should be able to perform networking operations
as usual.
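One step the shell transcript glosses over is choosing the “free IP” — above we simply picked `172.17.42.99` by hand. A rough sketch of what an allocator has to do (this is an illustration, not Docker's actual code; the addresses match the example above, and `172.17.0.3` stands in for an address already handed out):

```python
import ipaddress

# Walk the bridge's subnet and return the first address that is neither the
# bridge's own address nor already assigned to a container.
bridge = ipaddress.ip_interface("172.17.42.1/16")
in_use = {bridge.ip, ipaddress.ip_address("172.17.0.3")}  # hypothetical allocations

def pick_free_address(network, in_use):
    for candidate in network.hosts():  # hosts() skips network/broadcast addresses
        if candidate not in in_use:
            return candidate
    raise RuntimeError("address range exhausted")

print(pick_free_address(bridge.network, in_use))
```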
When you finally exit the shell and Docker cleans up the container, the
network namespace is destroyed along with our virtual `eth0` — whose
destruction in turn destroys interface `A` out in the Docker host and
automatically un-registers it from the `docker0` bridge. So everything gets
cleaned up without our having to run any extra commands! Well, almost
everything:

    # Clean up dangling symlinks in /var/run/netns

    $ sudo find -L /var/run/netns -type l -delete

Also note that while the script above used the modern `ip` command instead of
old deprecated wrappers like `ifconfig` and `route`, these older commands
would also have worked inside of our container. The `ip addr` command can be
typed as `ip a` if you are in a hurry.

Finally, note the importance of the `ip netns exec` command, which let us
reach inside and configure a network namespace as root. The same commands
would not have worked if run inside of the container, because part of safe
containerization is that Docker strips container processes of the right to
configure their own networks. Using `ip netns exec` is what let us finish up
the configuration without having to take the dangerous step of running the
container itself with `--privileged=true`.

## Tools and examples

Before diving into the following sections on custom network topologies, you
might be interested in glancing at a few external tools or examples of the
same kinds of configuration.
Here are two:

* Jérôme Petazzoni has created a `pipework` shell script to help you connect
  together containers in arbitrarily complex scenarios:
  <https://github.com/jpetazzo/pipework>

* Brandon Rhodes has created a whole network topology of Docker containers
  for the next edition of *Foundations of Python Network Programming* that
  includes routing, NAT'd firewalls, and servers that offer HTTP, SMTP, POP,
  IMAP, Telnet, SSH, and FTP:
  <https://github.com/brandon-rhodes/fopnp/tree/m/playground>

Both tools use networking commands very much like the ones you saw in the
previous section, and that you will see in the following sections.

## Building a point-to-point connection

<a name="point-to-point"></a>

By default, Docker attaches all containers to the virtual subnet implemented
by `docker0`. You can create containers that are each connected to some
different virtual subnet by creating your own bridge as shown in
[Building your own bridge](#bridge-building), starting each container with
`docker run --net=none`, and then attaching the containers to your bridge with
the shell commands shown in
[How Docker networks a container](#container-networking).

But sometimes you want two particular containers to be able to communicate
directly without the added complexity of both being bound to a host-wide
Ethernet bridge.

The solution is simple: when you create your pair of peer interfaces, simply
throw *both* of them into containers, and configure them as classic
point-to-point links. The two containers will then be able to communicate
directly (provided you manage to tell each container the other's IP address,
of course).
You might adjust the instructions of the previous section to go something like
this:

    # Start up two containers in two terminal windows

    $ docker run -i -t --rm --net=none base /bin/bash
    root@1f1f4c1f931a:/#

    $ docker run -i -t --rm --net=none base /bin/bash
    root@12e343489d2f:/#

    # Learn the container process IDs
    # and create their namespace entries

    $ docker inspect -f '{{.State.Pid}}' 1f1f4c1f931a
    2989
    $ docker inspect -f '{{.State.Pid}}' 12e343489d2f
    3004
    $ sudo mkdir -p /var/run/netns
    $ sudo ln -s /proc/2989/ns/net /var/run/netns/2989
    $ sudo ln -s /proc/3004/ns/net /var/run/netns/3004

    # Create the "peer" interfaces and hand them out

    $ sudo ip link add A type veth peer name B

    $ sudo ip link set A netns 2989
    $ sudo ip netns exec 2989 ip addr add 10.1.1.1/32 dev A
    $ sudo ip netns exec 2989 ip link set A up
    $ sudo ip netns exec 2989 ip route add 10.1.1.2/32 dev A

    $ sudo ip link set B netns 3004
    $ sudo ip netns exec 3004 ip addr add 10.1.1.2/32 dev B
    $ sudo ip netns exec 3004 ip link set B up
    $ sudo ip netns exec 3004 ip route add 10.1.1.1/32 dev B

The two containers should now be able to ping each other and make connections
successfully. Point-to-point links like this do not depend on a subnet or a
netmask, but on the bare assertion made by `ip route` that some other single
IP address is connected to a particular network interface.

Note that point-to-point links can be safely combined with other kinds of
network connectivity — there is no need to start the containers with
`--net=none` if you want point-to-point links to be an addition to the
container's normal networking instead of a replacement.
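To see why the `/32` addresses above still talk to each other, note that a `/32` “subnet” contains only its own address, so neither peer considers the other to be on the same network; it is the explicit host routes that connect them. A small check with Python's standard `ipaddress` module:

```python
import ipaddress

# Each end of the link carries a /32 address: a subnet of exactly one address.
a = ipaddress.ip_interface("10.1.1.1/32")
b = ipaddress.ip_interface("10.1.1.2/32")

# Neither peer is reachable through the other's netmask alone...
print(b.ip in a.network)   # a.network contains only 10.1.1.1 itself

# ...which is why each side needs a host route. The command
# `ip route add 10.1.1.2/32 dev A` above asserts exactly this:
# "10.1.1.2 is directly attached to this interface."
host_route = ipaddress.ip_network("10.1.1.2/32")
print(b.ip in host_route)
```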
A final permutation of this pattern is to create the point-to-point link
between the Docker host and one container, which would allow the host to
communicate with that one container on some single IP address and thus
communicate “out-of-band” of the bridge that connects the other, more usual
containers. But unless you have very specific networking needs that drive you
to such a solution, it is probably far preferable to use `--icc=false` to lock
down inter-container communication, as we explored earlier.

## Editing networking config files

Starting with Docker v1.2.0, you can edit `/etc/hosts`, `/etc/hostname` and
`/etc/resolv.conf` in a running container. This is useful if you need to
install BIND or other services that might override one of those files.

Note, however, that changes to these files will not be saved by
`docker commit`, nor will they be saved during `docker run`. That means they
won't be saved in the image, nor will they persist when a container is
restarted; they will only "stick" in a running container.