# Macvlan and Ipvlan Network Drivers

### Getting Started

The Macvlan and Ipvlan drivers are currently experimental, so that Docker users can exercise their use cases and the implementation can be vetted before a hardened, production-ready driver ships in a future release. Libnetwork now gives users total control over both IPv4 and IPv6 addressing. The VLAN drivers build on top of that by giving operators complete control of layer 2 VLAN tagging and even Ipvlan L3 routing for users interested in underlay network integration. For overlay deployments that abstract away physical constraints, see the [multi-host overlay](https://docs.docker.com/engine/userguide/networking/get-started-overlay/) driver.

Macvlan and Ipvlan are a new twist on the tried and true network virtualization technique. The Linux implementations are extremely lightweight because, rather than using the traditional Linux bridge for isolation, they are simply associated with a Linux Ethernet interface or sub-interface to enforce separation between networks and connectivity to the physical network.

Macvlan and Ipvlan offer a number of unique features and plenty of room for further innovation with the various modes. Two high-level advantages of these approaches are the performance benefit of bypassing the Linux bridge and the simplicity of having fewer moving parts. Removing the bridge that traditionally resides between the Docker host NIC and the container interface leaves a very simple setup consisting of container interfaces attached directly to the Docker host interface. The result is easy access for external-facing services, as there are no port mappings in these scenarios.

### Pre-Requisites

- The examples on this page are all single host and set up using Docker experimental builds, which can be installed with the following instructions: [Install Docker experimental](https://github.com/docker/docker/tree/master/experimental)

- All of the examples can be performed on a single host running Docker. Any example using a sub-interface like `eth0.10` can be replaced with `eth0` or any other valid parent interface on the Docker host. Sub-interfaces with a `.` are created on the fly. The `-o parent` option can also be left out of `docker network create` altogether, in which case the driver creates a `dummy` interface that enables local host connectivity to perform the examples.

- Kernel requirements:

    - To check your current kernel version, use `uname -r`
    - Macvlan Linux kernel v3.9–3.19 and 4.0+
    - Ipvlan Linux kernel v4.2+ (support for earlier kernels exists but is buggy)
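The kernel requirements above can be checked from a shell before creating any networks. The following is a minimal sketch and not part of Docker: `kernel_at_least` is a hypothetical helper name, it compares only the major and minor numbers reported by `uname -r`, and it treats the Macvlan requirement as "3.9 or newer".

```shell
#!/bin/sh
# kernel_at_least RELEASE MAJOR MINOR
# Succeeds when RELEASE (as printed by `uname -r`) is at least MAJOR.MINOR.
# The body runs in a subshell so its variables do not leak to the caller.
kernel_at_least() (
    release=${1%%-*}            # "4.4.0-21-generic" -> "4.4.0"
    major=${release%%.*}        # text before the first dot
    rest=${release#*.}          # text after the first dot
    minor=${rest%%.*}           # text before the next dot
    minor=${minor%%[!0-9]*}     # drop any non-numeric suffix such as "+"
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "${minor:-0}" -ge "$3" ]; }
)

running=$(uname -r)
if kernel_at_least "$running" 3 9; then
    echo "macvlan: kernel $running meets the 3.9 minimum"
else
    echo "macvlan: kernel $running is older than 3.9"
fi
if kernel_at_least "$running" 4 2; then
    echo "ipvlan: kernel $running meets the 4.2 minimum"
else
    echo "ipvlan: kernel $running is older than 4.2"
fi
```

The helper takes the release string as an argument rather than calling `uname` itself, so the same comparison can be run against a version string collected from another host.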

### MacVlan Bridge Mode Example Usage

Macvlan Bridge mode gives each container a unique MAC address, which the Docker host uses to track MAC-to-port mappings. This is the largest difference from Ipvlan L2 mode, in which every container `eth0` interface uses the same MAC address as the parent interface.

- Macvlan and Ipvlan driver networks are attached to a parent Docker host interface. Examples are a physical interface such as `eth0`, a sub-interface for 802.1q VLAN tagging like `eth0.10` (`.10` representing VLAN `10`), or even bonded host adaptors which bundle two Ethernet interfaces into a single logical interface.

- The specified gateway is external to the host and provided by the network infrastructure.

- Each Macvlan Bridge mode Docker network is isolated from the others, and only one network can be attached to a parent interface at a time. There is a theoretical limit of 4,094 sub-interfaces per host adaptor that a Docker network could be attached to.

- It is not recommended to mix ipvlan and macvlan networks on the same `-o parent=` interface. Older kernel versions will throw uninformative netlink errors such as `device is busy`.

- Any container inside the same subnet can talk to any other container in the same network without a gateway in both `macvlan bridge` mode and `ipvlan L2` mode.

- The same `docker network` commands apply to the vlan drivers. Some are irrelevant, such as `-icc` or `--set-macaddress` for the Ipvlan driver.

- In Macvlan and Ipvlan L2 mode, containers on separate networks cannot reach one another without an external process routing between the two networks/subnets. This also applies to multiple subnets within the same `docker network`. See Ipvlan L3 mode for inter-subnet communications without a router.

In the following example, `eth0` on the Docker host has an IP on the `172.16.86.0/24` network and a default gateway of `172.16.86.1`, which is an external router. An IP address is not required on the Docker host interface `eth0` in `bridge` mode; it merely needs to be on the proper upstream network to get forwarded by a network switch or network router.

![Simple Macvlan Bridge Mode Example](images/macvlan_bridge_simple.png)

**Note** For Macvlan bridge mode and Ipvlan L2 mode, the subnet values need to match those of the Docker host's NIC. For example, use the same subnet and gateway as the Docker host Ethernet interface that is specified by the `-o parent=` option.

- The parent interface used in this example is `eth0` and it is on the subnet `172.16.86.0/24`. The containers in the `docker network` will also need to be on the same subnet as the parent `-o parent=`. The gateway is an external router on the network, not IP masquerading or any other local proxy.

- The driver is specified with the `-d driver_name` option, in this case `-d macvlan`.

- The parent interface `-o parent=eth0` is configured as follows:

```
ip addr show eth0
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    inet 172.16.86.250/24 brd 172.16.86.255 scope global eth0
```

Create the macvlan network and run a couple of containers attached to it:

```
# Macvlan  (-o macvlan_mode= Defaults to Bridge mode if not specified)
docker network create -d macvlan \
    --subnet=172.16.86.0/24 \
    --gateway=172.16.86.1 \
    -o parent=eth0 pub_net

# Run a container on the new network specifying the --ip address.
docker run --net=pub_net --ip=172.16.86.10 -itd alpine /bin/sh

# Start a second container and ping the first
docker run --net=pub_net -it --rm alpine /bin/sh
ping -c 4 172.16.86.10
```

Take a look at the container's IP address and routing table:

```
ip a show eth0
    eth0@if3: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether 46:b2:6b:26:2f:69 brd ff:ff:ff:ff:ff:ff
    inet 172.16.86.2/24 scope global eth0

ip route
    default via 172.16.86.1 dev eth0
    172.16.86.0/24 dev eth0  src 172.16.86.2

# NOTE: the containers can NOT ping the underlying host interfaces as
# they are intentionally filtered by Linux for additional isolation.
# In this case the containers cannot ping the -o parent=172.16.86.250
```

You can explicitly specify the `bridge` mode option with `-o macvlan_mode=bridge`. It is the default, so the network will be in `bridge` mode either way.

While the `eth0` interface does not need an IP address in Macvlan Bridge mode or Ipvlan L2 mode, it is not uncommon for it to have one. Addresses can be excluded from allocation by the default built-in IPAM with the `--aux-address="name=x.x.x.x"` flag, which prevents the specified address from being handed out to containers. The following is the same network example as above, with the `-o parent=eth0` address blocked from being handed out to a container.

```
docker network create -d macvlan \
    --subnet=172.16.86.0/24 \
    --gateway=172.16.86.1 \
    --aux-address="exclude_host=172.16.86.250" \
    -o parent=eth0 pub_net
```

Another option for subpool IP address selection in a network provided by the default Docker IPAM driver is `--ip-range=`. This tells the driver to allocate container addresses from this pool rather than from the broader `--subnet=` range, as seen in the following example, which allocates addresses beginning at `192.168.32.128` and incrementing upwards from there.

```
docker network create -d macvlan \
    --subnet=192.168.32.0/24 \
    --ip-range=192.168.32.128/25 \
    --gateway=192.168.32.254 \
    -o parent=eth0 macnet32

# Start a container and verify the address is 192.168.32.128
docker run --net=macnet32 -it --rm alpine /bin/sh
```

The network can then be deleted with:

```
docker network rm <network_name or id>
```

- **Note:** In both Macvlan and Ipvlan, containers are not able to ping or communicate with the default namespace IP address. For example, if you create a container and try to ping the Docker host's `eth0`, it will **not** work. That traffic is explicitly filtered by the kernel modules themselves to offer additional provider isolation and security.

For more on Docker networking commands, see [Working with Docker network commands](https://docs.docker.com/engine/userguide/networking/work-with-networks/).

### Ipvlan L2 Mode Example Usage

The ipvlan `L2` mode example is virtually identical to the macvlan `bridge` mode example. The driver is specified with the `-d driver_name` option, in this case `-d ipvlan`.

![Simple Ipvlan L2 Mode Example](images/ipvlan_l2_simple.png)

The parent interface in the next example, `-o parent=eth0`, is configured as follows:

```
ip addr show eth0
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    inet 192.168.1.250/24 brd 192.168.1.255 scope global eth0
```

Use the network from the host's interface as the `--subnet` in the `docker network create`. The container will be attached to the same network as the host interface set via the `-o parent=` option.

Create the ipvlan network and run a container attached to it:

```
# Ipvlan  (-o ipvlan_mode= Defaults to L2 mode if not specified)
docker network create -d ipvlan \
    --subnet=192.168.1.0/24 \
    --gateway=192.168.1.1 \
    -o ipvlan_mode=l2 \
    -o parent=eth0 db_net

# Start a container on the db_net network
docker run --net=db_net -it --rm alpine /bin/sh

# NOTE: the containers can NOT ping the underlying host interfaces as
# they are intentionally filtered by Linux for additional isolation.
```

The default mode for Ipvlan is `l2`. The default mode for Macvlan is `bridge`. If `-o ipvlan_mode=` or `-o macvlan_mode=` is left unspecified, the default mode is used. Similarly, if `--gateway` is left empty, the first usable address on the network is set as the gateway. For example, if the subnet provided in the network create is `--subnet=192.168.1.0/24`, then the gateway the container receives is `192.168.1.1`.
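The default-gateway rule described above is plain address arithmetic: mask the subnet down to its network address and add one. The sketch below reproduces that calculation in POSIX shell for IPv4 only. `first_usable` is a hypothetical helper written for illustration; it is not how Docker's IPAM is implemented.

```shell
#!/bin/sh
# first_usable CIDR
# Prints the first usable IPv4 address in CIDR (network address + 1).
# Runs in a subshell so the IFS change and variables stay local.
first_usable() (
    addr=${1%/*}                 # "192.168.1.0/24" -> "192.168.1.0"
    bits=${1#*/}                 # "192.168.1.0/24" -> "24"
    IFS=.
    set -- $addr                 # split the dotted quad into $1..$4
    ip=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
    gw=$(( (ip & mask) + 1 ))
    echo "$(( (gw >> 24) & 255 )).$(( (gw >> 16) & 255 )).$(( (gw >> 8) & 255 )).$(( gw & 255 ))"
)

first_usable 192.168.1.0/24     # subnet used for db_net
first_usable 172.16.86.0/24     # subnet used for pub_net
```

For the two subnets shown, the script prints `192.168.1.1` and `172.16.86.1`, matching the gateways used in the examples on this page.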

To help understand how this mode interacts with other hosts, the following figure shows the same layer 2 segment between two Docker hosts. It applies to both Macvlan Bridge mode and Ipvlan L2 mode.

![Multiple Ipvlan and Macvlan Hosts](images/macvlan-bridge-ipvlan-l2.png)

The following creates exactly the same network as the `db_net` network created earlier, relying on the driver defaults for `--gateway=192.168.1.1` and `-o ipvlan_mode=l2`.

```
# Ipvlan  (-o ipvlan_mode= Defaults to L2 mode if not specified)
docker network create -d ipvlan \
    --subnet=192.168.1.0/24 \
    -o parent=eth0 db_net_ipv

# Start a container with an explicit name in daemon mode
docker run --net=db_net_ipv --name=ipv1 -itd alpine /bin/sh

# Start a second container and ping using the container name
# to see the docker included name resolution functionality
docker run --net=db_net_ipv --name=ipv2 -it --rm alpine /bin/sh
ping -c 4 ipv1

# NOTE: the containers can NOT ping the underlying host interfaces as
# they are intentionally filtered by Linux for additional isolation.
```

The drivers also support the `--internal` flag, which completely isolates containers on a network from any communication external to that network. Since network isolation is tightly coupled to the network's parent interface, leaving the `-o parent=` option off of a network create has exactly the same result as the `--internal` option. If the parent interface is not specified or the `--internal` flag is used, a netlink type `dummy` parent interface is created for the user and used as the parent interface, effectively isolating the network completely.

The following two `docker network create` examples result in equivalently isolated networks that you can attach containers to:

```
# Empty '-o parent=' creates an isolated network
docker network create -d ipvlan \
    --subnet=192.168.10.0/24 isolated1

# Explicit '--internal' flag is the same:
docker network create -d ipvlan \
    --subnet=192.168.11.0/24 --internal isolated2

# Even the '--subnet=' can be left empty and the default
# IPAM subnet of 172.18.0.0/16 will be assigned
docker network create -d ipvlan isolated3

docker run --net=isolated1 --name=cid1 -it --rm alpine /bin/sh
docker run --net=isolated2 --name=cid2 -it --rm alpine /bin/sh
docker run --net=isolated3 --name=cid3 -it --rm alpine /bin/sh

# To attach to any of them, use `docker exec` and start a shell
docker exec -it cid1 /bin/sh
docker exec -it cid2 /bin/sh
docker exec -it cid3 /bin/sh
```

### Macvlan 802.1q Trunk Bridge Mode Example Usage

VLANs (Virtual Local Area Networks) have long been a primary means of virtualizing data center networks and are still in virtually all existing networks today. VLANs work by tagging a Layer-2 isolation domain with a 12-bit identifier ranging from 1-4094. The tag is inserted into a packet header and enables a logical grouping of one or more subnets of both IPv4 and IPv6. It is very common for network operators to separate traffic using VLANs based on a subnet's function or security profile, such as `web`, `db`, or any other isolation need.

It is very common for a compute host to be required to run multiple virtual networks concurrently. Linux networking has long supported VLAN tagging, also known by its standard, 802.1q, for maintaining datapath isolation between networks. The Ethernet link connected to a Docker host can be configured to support the 802.1q VLAN IDs by creating Linux sub-interfaces, each one dedicated to a unique VLAN ID.

![Simple Ipvlan L2 Mode Example](images/multi_tenant_8021q_vlans.png)

Trunking 802.1q to a Linux host is notoriously painful for many in operations. It requires configuration file changes in order to be persistent through a reboot. If a bridge is involved, a physical NIC needs to be moved into the bridge and the bridge then gets the IP address. This has led to many a stranded server, since the risk of cutting off access during that convoluted process is high.

As with all of the Docker network drivers, the overarching goal is to alleviate the operational pains of managing network resources. To that end, when a network receives a sub-interface as the parent that does not exist, the drivers create the VLAN tagged interfaces while creating the network.

In the case of a host reboot, instead of needing to modify often complex network configuration files, the driver recreates all network links when the Docker daemon restarts. The driver tracks whether it originally created the VLAN tagged sub-interface during network create. It will **only** recreate the sub-interface after a restart, or delete the link on `docker network rm`, if it created the link in the first place with `docker network create`.

If the user doesn't want Docker to modify the `-o parent` sub-interface, the user simply needs to pass a link that already exists as the parent interface. Parent interfaces such as `eth0` are not deleted, only sub-interfaces that are not master links.

For the driver to add/delete the vlan sub-interfaces the format needs to be `interface_name.vlan_tag`.

For example: `eth0.50` denotes a parent interface of `eth0` with a slave of `eth0.50` tagged with vlan id `50`. The equivalent `ip link` command would be `ip link add link eth0 name eth0.50 type vlan id 50`.
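To make the `interface_name.vlan_tag` convention concrete, the sketch below splits a name such as `eth0.50` into its parent and VLAN ID, checks the ID against the 1-4094 range mentioned earlier, and prints the equivalent `ip link` command. `vlan_link_cmd` is a hypothetical helper for illustration only; the driver's actual parsing logic is not shown in this document, and nothing here is executed against the host.

```shell
#!/bin/sh
# vlan_link_cmd NAME
# Splits NAME of the form interface_name.vlan_tag and prints the matching
# `ip link` command. Nothing is executed; the command is only echoed.
vlan_link_cmd() (
    name=$1
    parent=${name%.*}            # "eth0.50" -> "eth0"
    vid=${name##*.}              # "eth0.50" -> "50"
    case $name in
        ?*.?*) ;;                # needs text on both sides of a dot
        *) echo "not in interface_name.vlan_tag form: $name" >&2; exit 1 ;;
    esac
    case $vid in
        ''|*[!0-9]*) echo "VLAN tag is not numeric: $vid" >&2; exit 1 ;;
    esac
    if [ "$vid" -lt 1 ] || [ "$vid" -gt 4094 ]; then
        echo "VLAN ID outside 1-4094: $vid" >&2; exit 1
    fi
    echo "ip link add link $parent name $name type vlan id $vid"
)

vlan_link_cmd eth0.50
```

Called with `eth0.50`, the helper prints `ip link add link eth0 name eth0.50 type vlan id 50`, the same command quoted above. A name without a dot, or with a tag outside the valid range, is rejected with a non-zero exit status.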

Replace `macvlan` with `ipvlan` in the `-d` driver argument to create Ipvlan 802.1q trunks instead.

**Vlan ID 50**

In the first network, tagged and isolated by the Docker host, `eth0.50` is the parent interface tagged with vlan id `50`, specified with `-o parent=eth0.50`. Other naming formats can be used, but the links need to be added and deleted manually using `ip link` or Linux configuration files. As long as the `-o parent` exists, anything can be used if it is compliant with Linux netlink.

```
# now add networks and hosts as you would normally by attaching to the master (sub)interface that is tagged
docker network create -d macvlan \
    --subnet=192.168.50.0/24 \
    --gateway=192.168.50.1 \
    -o parent=eth0.50 macvlan50

# In two separate terminals, start a Docker container and the containers can now ping one another.
docker run --net=macvlan50 -it --name macvlan_test5 --rm alpine /bin/sh
docker run --net=macvlan50 -it --name macvlan_test6 --rm alpine /bin/sh
```

**Vlan ID 60**

In the second network, tagged and isolated by the Docker host, `eth0.60` is the parent interface tagged with vlan id `60`, specified with `-o parent=eth0.60`. The `macvlan_mode=` defaults to `macvlan_mode=bridge`. It can also be set explicitly, with the same result, as shown in the next example.

```
# now add networks and hosts as you would normally by attaching to the master (sub)interface that is tagged.
docker network create -d macvlan \
    --subnet=192.168.60.0/24 \
    --gateway=192.168.60.1 \
    -o parent=eth0.60 \
    -o macvlan_mode=bridge macvlan60

# In two separate terminals, start a Docker container and the containers can now ping one another.
docker run --net=macvlan60 -it --name macvlan_test7 --rm alpine /bin/sh
docker run --net=macvlan60 -it --name macvlan_test8 --rm alpine /bin/sh
```

**Example:** Multi-Subnet Macvlan 802.1q Trunking

This is the same as the previous example, except there is an additional subnet bound to the network that the user can choose to provision containers on. In Macvlan Bridge mode, containers can only ping one another if they are on the same subnet/broadcast domain, unless there is an external router that routes the traffic (answers ARP etc.) between the two subnets.

```
# Create multiple L2 subnets
docker network create -d ipvlan \
    --subnet=192.168.210.0/24 \
    --subnet=192.168.212.0/24 \
    --gateway=192.168.210.254 \
    --gateway=192.168.212.254 \
    -o ipvlan_mode=l2 ipvlan210

# Test 192.168.210.0/24 connectivity between containers
docker run --net=ipvlan210 --ip=192.168.210.10 -itd alpine /bin/sh
docker run --net=ipvlan210 --ip=192.168.210.9 -it --rm alpine ping -c 2 192.168.210.10

# Test 192.168.212.0/24 connectivity between containers
docker run --net=ipvlan210 --ip=192.168.212.10 -itd alpine /bin/sh
docker run --net=ipvlan210 --ip=192.168.212.9 -it --rm alpine ping -c 2 192.168.212.10
```
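Whether two containers in the multi-subnet example can reach each other directly comes down to whether their addresses share a network prefix. The following sketch performs that comparison for IPv4 in POSIX shell. `ip_to_int` and `same_subnet` are hypothetical helper names, and the check models only the subnet rule stated above, not any filtering done by the kernel or the driver.

```shell
#!/bin/sh
# ip_to_int ADDRESS: prints a dotted-quad IPv4 address as one integer.
ip_to_int() (
    IFS=.
    set -- $1
    echo "$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))"
)

# same_subnet ADDR_A ADDR_B PREFIX_LEN
# Succeeds when both addresses share the same network prefix.
same_subnet() (
    mask=$(( (4294967295 << (32 - $3)) & 4294967295 ))
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    [ $(( a & mask )) -eq $(( b & mask )) ]
)

# Addresses taken from the ipvlan210 example above.
if same_subnet 192.168.210.9 192.168.210.10 24; then
    echo "192.168.210.9 and 192.168.210.10 share a subnet: direct L2 reachability"
fi
if ! same_subnet 192.168.210.10 192.168.212.10 24; then
    echo "192.168.210.10 and 192.168.212.10 differ: an external router is needed"
fi
```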

### Ipvlan 802.1q Trunk L2 Mode Example Usage

Architecturally, Ipvlan L2 mode trunking is the same as Macvlan with regard to gateways and L2 path isolation. Ipvlan has nuances that can be advantageous with respect to CAM table pressure in ToR switches, one-MAC-per-port restrictions, and MAC exhaustion on a host's parent NIC, to name a few. The 802.1q trunk scenario looks the same. Both modes adhere to tagging standards and integrate seamlessly with the physical network for underlay integration and hardware vendor plugin integrations.

Hosts on the same VLAN are typically on the same subnet and are almost always grouped together based on their security policy. In most scenarios, a multi-tier application is tiered into different subnets because the security profile of each process requires some form of isolation. For example, hosting your credit card processing on the same virtual network as the frontend webserver would be a regulatory compliance issue, and would circumvent the long-standing best practice of layered defense-in-depth architectures. VLANs, or the equivalent VNI (Virtual Network Identifier) when using the Overlay driver, are the first step in isolating tenant traffic.

![Docker VLANs in Depth](images/vlans-deeper-look.png)

The Linux sub-interface tagged with a vlan can either already exist or will be created when you call `docker network create`. `docker network rm` will delete the sub-interface. Parent interfaces such as `eth0` are not deleted, only sub-interfaces with a netlink parent index > 0.

For the driver to add/delete the vlan sub-interfaces the format needs to be `interface_name.vlan_tag`. Other sub-interface naming can be used as the specified parent, but the link will not be deleted automatically when `docker network rm` is invoked.

The option to use either existing parent vlan sub-interfaces or let Docker manage them enables the user to either completely manage the Linux interfaces and networking or let Docker create and delete the Vlan parent sub-interfaces (netlink `ip link`) with no effort from the user.

For example, `eth0.10` denotes a sub-interface of `eth0` tagged with vlan id `10`. The equivalent `ip link` command would be `ip link add link eth0 name eth0.10 type vlan id 10`.

The example creates the vlan tagged networks and then starts two containers to test connectivity between containers. Different Vlans cannot ping one another without a router routing between the two networks. The default namespace is not reachable, per ipvlan design, in order to isolate container namespaces from the underlying host.

**Vlan ID 20**

In the first network, tagged and isolated by the Docker host, `eth0.20` is the parent interface tagged with vlan id `20`, specified with `-o parent=eth0.20`. Other naming formats can be used, but the links need to be added and deleted manually using `ip link` or Linux configuration files. As long as the `-o parent` exists, anything can be used if it is compliant with Linux netlink.

```
# now add networks and hosts as you would normally by attaching to the master (sub)interface that is tagged
docker network create -d ipvlan \
    --subnet=192.168.20.0/24 \
    --gateway=192.168.20.1 \
    -o parent=eth0.20 ipvlan20

# in two separate terminals, start a Docker container and the containers can now ping one another.
docker run --net=ipvlan20 -it --name ivlan_test1 --rm alpine /bin/sh
docker run --net=ipvlan20 -it --name ivlan_test2 --rm alpine /bin/sh
```

**Vlan ID 30**

In the second network, tagged and isolated by the Docker host, `eth0.30` is the parent interface tagged with vlan id `30`, specified with `-o parent=eth0.30`. The `ipvlan_mode=` defaults to l2 mode, `ipvlan_mode=l2`. It can also be set explicitly, with the same result, as shown in the next example.

```
# now add networks and hosts as you would normally by attaching to the master (sub)interface that is tagged.
docker network create -d ipvlan \
    --subnet=192.168.30.0/24 \
    --gateway=192.168.30.1 \
    -o parent=eth0.30 \
    -o ipvlan_mode=l2 ipvlan30

# in two separate terminals, start a Docker container and the containers can now ping one another.
docker run --net=ipvlan30 -it --name ivlan_test3 --rm alpine /bin/sh
docker run --net=ipvlan30 -it --name ivlan_test4 --rm alpine /bin/sh
```

The gateway is set inside of the container as the default gateway. That gateway would typically be an external router on the network.

```
$ ip route
  default via 192.168.30.1 dev eth0
  192.168.30.0/24 dev eth0  src 192.168.30.2
```

Example: Multi-Subnet Ipvlan L2 Mode, starting two containers on the same subnet and pinging one another. In L2 mode, `192.168.114.0/24` requires an external router to reach `192.168.116.0/24`. L3 mode can route between subnets that share a common `-o parent=`. This same multi-subnet example is also valid for Macvlan `bridge` mode.

Secondary addresses on network routers are common: as an address space becomes exhausted, another secondary address is added to an L3 vlan interface, commonly referred to as a "switched virtual interface" (SVI).

```
docker network create -d ipvlan \
    --subnet=192.168.114.0/24 --subnet=192.168.116.0/24 \
    --gateway=192.168.114.254 --gateway=192.168.116.254 \
    -o parent=eth0.114 \
    -o ipvlan_mode=l2 ipvlan114

docker run --net=ipvlan114 --ip=192.168.114.10 -it --rm alpine /bin/sh
docker run --net=ipvlan114 --ip=192.168.114.11 -it --rm alpine /bin/sh
```

A key takeaway is that operators have the ability to map their physical network into their virtual network, integrating containers into their environment with no operational overhauls required. NetOps simply drops an 802.1q trunk into the Docker host. That virtual link would be the `-o parent=` passed in the network creation. For untagged (non-VLAN) links, it is as simple as `-o parent=eth0`; for 802.1q trunks with VLAN IDs, each network gets mapped to the corresponding VLAN/subnet from the network.

For example, NetOps provides the VLAN IDs and the associated subnets for the VLANs being passed on the Ethernet link to the Docker host server. Those values are simply plugged into the `docker network create` commands when provisioning the Docker networks. These are persistent configurations that are applied every time the Docker engine starts, which alleviates having to manage often complex configuration files. The network interfaces can also be managed manually by being pre-created; Docker networking will never modify them and simply uses them as parent interfaces. Example mappings from NetOps to Docker network commands are as follows:

- VLAN: 10, Subnet: 172.16.80.0/24, Gateway: 172.16.80.1

    - `--subnet=172.16.80.0/24 --gateway=172.16.80.1 -o parent=eth0.10`

- VLAN: 20, Subnet: 172.16.50.0/22, Gateway: 172.16.50.1

    - `--subnet=172.16.50.0/22 --gateway=172.16.50.1 -o parent=eth0.20`

- VLAN: 30, Subnet: 10.1.100.0/16, Gateway: 10.1.100.1

    - `--subnet=10.1.100.0/16 --gateway=10.1.100.1 -o parent=eth0.30`
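The mappings above are regular enough to generate. The sketch below prints one `docker network create` command per VLAN from a small table, without running anything. The helper name `vlan_network_cmd`, the choice of the `macvlan` driver, the `eth0` parent, and the network names `vlan10`, `vlan20`, and `vlan30` are all assumptions made for illustration.

```shell
#!/bin/sh
# vlan_network_cmd DRIVER PARENT_NIC VLAN_ID SUBNET GATEWAY NAME
# Prints (does not run) the `docker network create` command for one mapping.
vlan_network_cmd() {
    printf 'docker network create -d %s --subnet=%s --gateway=%s -o parent=%s.%s %s\n' \
        "$1" "$4" "$5" "$2" "$3" "$6"
}

# One line per VLAN: ID, subnet, gateway (values from the list above).
while read -r vlan subnet gateway; do
    vlan_network_cmd macvlan eth0 "$vlan" "$subnet" "$gateway" "vlan$vlan"
done <<'EOF'
10 172.16.80.0/24 172.16.80.1
20 172.16.50.0/22 172.16.50.1
30 10.1.100.0/16 10.1.100.1
EOF
```

Printing the commands rather than executing them leaves room to review the result, or to pipe it into `sh` once the values have been confirmed with the network team.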

### IPVlan L3 Mode Example

Ipvlan L3 mode requires routes to be distributed to each endpoint. The driver only builds the Ipvlan L3 mode port and attaches the container to the interface. Route distribution throughout a cluster is beyond the initial implementation of this single-host-scoped driver. In L3 mode, the Docker host acts much like a router, starting new networks in the container. Those networks are ones the upstream network will not know about without route distribution. For those curious how Ipvlan L3 will fit into container networking, see the following examples.

![Docker Ipvlan L3 Mode](images/ipvlan-l3.png)

Ipvlan L3 mode drops all broadcast and multicast traffic. This reason alone makes Ipvlan L3 mode a prime candidate for those looking for massive scale and predictable network integrations. It is predictable and in turn will lead to greater uptimes because there is no bridging involved. Bridging loops have been responsible for high-profile outages that can be hard to pinpoint, depending on the size of the failure domain. This is due to the cascading nature of BPDUs (Bridge Protocol Data Units), which are flooded throughout a broadcast domain (VLAN) to find and block topology loops. Eliminating bridging domains, or at the least keeping them isolated to a pair of ToRs (top of rack switches), will reduce hard-to-troubleshoot bridging instabilities. Macvlan Bridge and Ipvlan L2 modes are well suited for isolated VLANs only trunked into a pair of ToRs that can provide a loop-free non-blocking fabric. The next step further is to route at the edge via Ipvlan L3 mode, which reduces a failure domain to a local host only.

- L3 mode needs to be on a separate subnet from the default namespace, since it requires a netlink route in the default namespace pointing to the Ipvlan parent interface.

- The parent interface used in this example is `eth0` and it is on the subnet `192.168.1.0/24`. Notice the `docker network` is **not** on the same subnet as `eth0`.

- Unlike macvlan bridge mode and ipvlan l2 mode, different subnets/networks can ping one another as long as they share the same parent interface `-o parent=`.

```
ip a show eth0
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:50:56:39:45:2e brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.250/24 brd 192.168.1.255 scope global eth0
```

- A traditional gateway doesn't mean much to an L3 mode Ipvlan interface since there is no broadcast traffic allowed. Because of that, the container default gateway simply points to the container's `eth0` device. See below for CLI output of `ip route` or `ip -6 route` from inside an L3 container for details.

The mode `-o ipvlan_mode=l3` must be explicitly specified since the default ipvlan mode is `l2`.

The following example does not specify a parent interface. Rather than rejecting the network creation, the network drivers create a `dummy` type link for the user, which isolates the containers so that they can only communicate with one another.

```
# Create the Ipvlan L3 network
docker network create -d ipvlan \
    --subnet=192.168.214.0/24 \
    --subnet=10.1.214.0/24 \
    -o ipvlan_mode=l3 ipnet210

# Start a container on each subnet
docker run --net=ipnet210 --ip=192.168.214.10 -itd alpine /bin/sh
docker run --net=ipnet210 --ip=10.1.214.10 -itd alpine /bin/sh

# Test L3 connectivity from 192.168.214.0/24 to 10.1.214.0/24
docker run --net=ipnet210 --ip=192.168.214.9 -it --rm alpine ping -c 2 10.1.214.10

# Test L3 connectivity from 10.1.214.0/24 to 192.168.214.0/24
docker run --net=ipnet210 --ip=10.1.214.9 -it --rm alpine ping -c 2 192.168.214.10
```

Notice there is no `--gateway=` option in the network create. The field is ignored if one is specified in `l3` mode. Take a look at the routing table from inside the container:

```
# Inside an L3 mode container
$ ip route
  default dev eth0
  192.168.120.0/24 dev eth0  src 192.168.120.2
```

For a remote Docker host to ping the containers, or for the containers to ping a remote host, the remote host or the physical network in between needs a route to the container subnets that points to the IP address of the container's Docker host `eth` interface. More on this as we evolve the Ipvlan `L3` story.
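
As a rough illustration of the routing requirement described above, the sketch below prints the static routes a remote Linux host would need to reach the `ipnet210` container subnets through the Docker host. It assumes the Docker host parent address `192.168.1.250` shown earlier on this page and that the remote host can reach that address directly; the function name `print_l3_routes` is hypothetical. The commands are only printed, so they can be reviewed before being run as root on the remote host.

```shell
# print_l3_routes NEXT_HOP SUBNET...: print one `ip route add` command per
# container subnet, using the Docker host address as the next hop.
# Hypothetical helper for illustration; it does not change any routes.
print_l3_routes() {
    _next_hop=$1
    shift
    for _subnet in "$@"; do
        echo "ip route add $_subnet via $_next_hop"
    done
}

# Docker host eth0 address and the two ipnet210 subnets from the example above.
print_l3_routes 192.168.1.250 192.168.214.0/24 10.1.214.0/24
```

On a routed physical network, the equivalent routes would more likely be configured on the upstream router than on each remote host.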

### Dual Stack IPv4 IPv6 Macvlan Bridge Mode

**Example:** Macvlan Bridge mode, 802.1q trunk, VLAN ID: 218, Multi-Subnet, Dual Stack

```
# Create multiple bridge subnets with a gateway of x.x.x.1:
docker network create -d macvlan \
    --subnet=192.168.216.0/24 --subnet=192.168.218.0/24 \
    --gateway=192.168.216.1 --gateway=192.168.218.1 \
    --subnet=2001:db8:abc8::/64 --gateway=2001:db8:abc8::10 \
    -o parent=eth0.218 \
    -o macvlan_mode=bridge macvlan216

# Start a container on the first subnet 192.168.216.0/24
docker run --net=macvlan216 --name=macnet216_test --ip=192.168.216.10 -itd alpine /bin/sh

# Start a container on the second subnet 192.168.218.0/24
# (container names must be unique, so this one uses a different name)
docker run --net=macvlan216 --name=macnet218_test --ip=192.168.218.10 -itd alpine /bin/sh

# Ping the container started on the 192.168.216.0/24 subnet
docker run --net=macvlan216 --ip=192.168.216.11 -it --rm alpine /bin/sh
ping 192.168.216.10

# Ping the container started on the 192.168.218.0/24 subnet
docker run --net=macvlan216 --ip=192.168.218.11 -it --rm alpine /bin/sh
ping 192.168.218.10
```

View the details of one of the containers:

```
docker run --net=macvlan216 --ip=192.168.216.11 -it --rm alpine /bin/sh

root@526f3060d759:/# ip a show eth0
    eth0@if92: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
    link/ether 8e:9a:99:25:b6:16 brd ff:ff:ff:ff:ff:ff
    inet 192.168.216.11/24 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 2001:db8:abc4::8c9a:99ff:fe25:b616/64 scope link tentative
       valid_lft forever preferred_lft forever
    inet6 2001:db8:abc8::2/64 scope link nodad
       valid_lft forever preferred_lft forever

# Specified v4 gateway of 192.168.216.1
root@526f3060d759:/# ip route
  default via 192.168.216.1 dev eth0
  192.168.216.0/24 dev eth0  proto kernel  scope link  src 192.168.216.11

# Specified v6 gateway of 2001:db8:abc8::10
root@526f3060d759:/# ip -6 route
  2001:db8:abc4::/64 dev eth0  proto kernel  metric 256
  2001:db8:abc8::/64 dev eth0  proto kernel  metric 256
  default via 2001:db8:abc8::10 dev eth0  metric 1024
```

### Dual Stack IPv4 IPv6 Ipvlan L2 Mode

- Not only does Libnetwork give you complete control over IPv4 addressing, it also gives you total control over IPv6 addressing, as well as feature parity between the two address families.

- The next example starts with IPv6 only. Start two containers on the same VLAN `139` and have them ping one another. Since the IPv4 subnet is not specified, the default IPAM will provision a default IPv4 subnet. That subnet is isolated unless the upstream network is explicitly routing it on VLAN `139`.

```
# Create a v6 network
docker network create -d ipvlan \
    --subnet=2001:db8:abc2::/64 --gateway=2001:db8:abc2::22 \
    -o parent=eth0.139 v6ipvlan139

# Start a container on the network
docker run --net=v6ipvlan139 -it --rm alpine /bin/sh
```

View the container `eth0` interface and v6 routing table:

```
eth0@if55: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
    link/ether 00:50:56:2b:29:40 brd ff:ff:ff:ff:ff:ff
    inet 172.18.0.2/16 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 2001:db8:abc4::250:56ff:fe2b:2940/64 scope link
       valid_lft forever preferred_lft forever
    inet6 2001:db8:abc2::1/64 scope link nodad
       valid_lft forever preferred_lft forever

root@5c1dc74b1daa:/# ip -6 route
2001:db8:abc4::/64 dev eth0  proto kernel  metric 256
2001:db8:abc2::/64 dev eth0  proto kernel  metric 256
default via 2001:db8:abc2::22 dev eth0  metric 1024
```

Start a second container and ping the first container's v6 address.

```
$ docker run --net=v6ipvlan139 -it --rm alpine /bin/sh

root@b817e42fcc54:/# ip a show eth0
75: eth0@if55: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
    link/ether 00:50:56:2b:29:40 brd ff:ff:ff:ff:ff:ff
    inet 172.18.0.3/16 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 2001:db8:abc4::250:56ff:fe2b:2940/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever
    inet6 2001:db8:abc2::2/64 scope link nodad
       valid_lft forever preferred_lft forever

root@b817e42fcc54:/# ping6 2001:db8:abc2::1
PING 2001:db8:abc2::1 (2001:db8:abc2::1): 56 data bytes
64 bytes from 2001:db8:abc2::1%eth0: icmp_seq=0 ttl=64 time=0.044 ms
64 bytes from 2001:db8:abc2::1%eth0: icmp_seq=1 ttl=64 time=0.058 ms

2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.044/0.051/0.058/0.000 ms
```
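
Both containers above report the same link-layer address, `00:50:56:2b:29:40`, which is expected with Ipvlan because the sub-interfaces share the parent's MAC address. An IPv6 address autoconfigured from that MAC using the modified EUI-64 method therefore comes out identical in each container, which would be consistent with the `dadfailed` flag in the second container's output. The sketch below shows the derivation; the function name `mac_to_eui64` is hypothetical, and it assumes a colon-separated MAC with two hex digits per octet.

```shell
# mac_to_eui64 MAC: print the modified EUI-64 interface identifier for a MAC
# address, formatted as the last four groups of an IPv6 address.
# Hypothetical helper for illustration only.
mac_to_eui64() {
    _old_ifs=$IFS
    IFS=:
    set -- $1
    IFS=$_old_ifs
    # Flip the universal/local bit (0x02) in the first octet.
    _first=$(( 0x$1 ^ 0x02 ))
    # Insert ff:fe between the third and fourth octets and print as 16-bit groups.
    printf '%x:%x:%x:%x\n' \
        $(( (_first << 8) | 0x$2 )) \
        $(( (0x$3 << 8) | 0xff )) \
        $(( (0xfe << 8) | 0x$4 )) \
        $(( (0x$5 << 8) | 0x$6 ))
}

# MAC shared by both Ipvlan containers in the output above.
mac_to_eui64 00:50:56:2b:29:40   # prints: 250:56ff:fe2b:2940
```

The result matches the host portion of `2001:db8:abc4::250:56ff:fe2b:2940` in both outputs. The explicitly assigned `2001:db8:abc2::` addresses are unaffected because IPAM gives each container a distinct one.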

The next example will set up a dual stack IPv4/IPv6 network with an example VLAN ID of `140`.

Next, create a network with two IPv4 subnets and one IPv6 subnet, all of which have explicit gateways:

```
docker network create -d ipvlan \
    --subnet=192.168.140.0/24 --subnet=192.168.142.0/24 \
    --gateway=192.168.140.1 --gateway=192.168.142.1 \
    --subnet=2001:db8:abc9::/64 --gateway=2001:db8:abc9::22 \
    -o parent=eth0.140 \
    -o ipvlan_mode=l2 ipvlan140
```

Start a container and view `eth0` and both the v4 and v6 routing tables:

```
# Start a container on the dual stack network created above
docker run --net=ipvlan140 -it --rm alpine /bin/sh

root@3cce0d3575f3:/# ip a show eth0
78: eth0@if77: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
    link/ether 00:50:56:2b:29:40 brd ff:ff:ff:ff:ff:ff
    inet 192.168.140.2/24 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 2001:db8:abc4::250:56ff:fe2b:2940/64 scope link
       valid_lft forever preferred_lft forever
    inet6 2001:db8:abc9::1/64 scope link nodad
       valid_lft forever preferred_lft forever

root@3cce0d3575f3:/# ip route
default via 192.168.140.1 dev eth0
192.168.140.0/24 dev eth0  proto kernel  scope link  src 192.168.140.2

root@3cce0d3575f3:/# ip -6 route
2001:db8:abc4::/64 dev eth0  proto kernel  metric 256
2001:db8:abc9::/64 dev eth0  proto kernel  metric 256
default via 2001:db8:abc9::22 dev eth0  metric 1024
```

Start a second container with a specific `--ip` address and ping the first container using IPv4 packets:

```
docker run --net=ipvlan140 --ip=192.168.140.10 -it --rm alpine /bin/sh
```

**Note**: Different subnets on the same parent interface in both Ipvlan `L2` mode and Macvlan `bridge` mode cannot ping one another. That requires a router to proxy-arp the requests with a secondary subnet. However, Ipvlan `L3` will route the unicast traffic between disparate subnets as long as they share the same `-o parent` parent link.

### Dual Stack IPv4 IPv6 Ipvlan L3 Mode

**Example:** Ipvlan L3 Mode Dual Stack IPv4/IPv6, Multi-Subnet w/ 802.1q VLAN Tag: 118

As in all of the examples, a tagged VLAN interface does not have to be used. The sub-interfaces can be swapped with `eth0`, `eth1`, `bond0` or any other valid interface on the host other than the `lo` loopback.

The primary difference you will see is that L3 mode does not create a default route with a next-hop but rather sets a default route pointing to `dev eth` only, since ARP, broadcast and multicast traffic are all filtered by Linux as per the design. Since the parent interface is essentially acting as a router, the parent interface IP and subnet need to be different from the container networks. That is the opposite of bridge and L2 modes, which need to be on the same subnet (broadcast domain) in order to forward broadcast and multicast packets.

```
# Create an IPv6+IPv4 Dual Stack Ipvlan L3 network
# Gateways for both v4 and v6 are set to a dev e.g. 'default dev eth0'
docker network create -d ipvlan \
    --subnet=192.168.110.0/24 \
    --subnet=192.168.112.0/24 \
    --subnet=2001:db8:abc6::/64 \
    -o parent=eth0 \
    -o ipvlan_mode=l3 ipnet110

# Start a few containers on the network (ipnet110)
# in separate terminals and check connectivity
docker run --net=ipnet110 -it --rm alpine /bin/sh
# Start a second container specifying the v6 address
docker run --net=ipnet110 --ip6=2001:db8:abc6::10 -it --rm alpine /bin/sh
# Start a third specifying the IPv4 address
docker run --net=ipnet110 --ip=192.168.112.50 -it --rm alpine /bin/sh
# Start a 4th specifying both the IPv4 and IPv6 addresses
# (a different IPv4 address than the third container, which is still running)
docker run --net=ipnet110 --ip6=2001:db8:abc6::50 --ip=192.168.112.51 -it --rm alpine /bin/sh
```

Interface and routing table outputs are as follows:

```
root@3a368b2a982e:/# ip a show eth0
63: eth0@if59: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
    link/ether 00:50:56:2b:29:40 brd ff:ff:ff:ff:ff:ff
    inet 192.168.112.2/24 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 2001:db8:abc4::250:56ff:fe2b:2940/64 scope link
       valid_lft forever preferred_lft forever
    inet6 2001:db8:abc6::10/64 scope link nodad
       valid_lft forever preferred_lft forever

# Note the default route is simply the eth device because ARPs are filtered.
root@3a368b2a982e:/# ip route
  default dev eth0  scope link
  192.168.112.0/24 dev eth0  proto kernel  scope link  src 192.168.112.2

root@3a368b2a982e:/# ip -6 route
2001:db8:abc4::/64 dev eth0  proto kernel  metric 256
2001:db8:abc6::/64 dev eth0  proto kernel  metric 256
default dev eth0  metric 1024
```
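
To confirm which kind of default route a container received, the `ip route` output can be inspected for a `via` next hop. This is a small sketch assuming `awk` is available; the function name `default_route_kind` and its output labels are made up for illustration. It reads routing table text on standard input and reports whether the default route is device-only (as in L3 mode above) or has a next hop (as in the L2 and bridge mode examples).

```shell
# default_route_kind: read `ip route` style output on stdin and classify
# the first default route. Hypothetical helper for illustration only.
default_route_kind() {
    awk '
        $1 == "default" {
            kind = "device-only"
            for (i = 2; i <= NF; i++) {
                if ($i == "via") kind = "next-hop"
            }
            print kind
            found = 1
            exit
        }
        END { if (!found) print "none" }
    '
}

# L3 mode route from the output above
echo "default dev eth0  scope link" | default_route_kind
# L2 mode route from the earlier ipvlan140 example
echo "default via 192.168.140.1 dev eth0" | default_route_kind
```

Inside a container, the same function could be fed directly with `ip route | default_route_kind` or `ip -6 route | default_route_kind`.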

*Note:* There may be a bug when specifying `--ip6=` addresses. If you delete a container that had a specified v6 address and then start a new container with the same v6 address, the daemon throws the following error, as if the address is not being properly released to the v6 pool. The container fails to unmount and is left dead.

```
docker: Error response from daemon: Address already in use.
```

### Manually Creating 802.1q Links

**Vlan ID 40**

If a user does not want the driver to create the VLAN sub-interface, it simply needs to exist prior to the `docker network create`. If your sub-interface naming is not `interface.vlan_id`, it is still honored in the `-o parent=` option, as long as the interface exists and is up.

Manually created links can be named anything you want. All that matters is that they exist when the network is created. Manually created links are not deleted, regardless of their name, when the network is deleted with `docker network rm`.

```
# create a new sub-interface tied to dot1q vlan 40
ip link add link eth0 name eth0.40 type vlan id 40

# enable the new sub-interface
ip link set eth0.40 up

# now add networks and hosts as you would normally by attaching to the master (sub)interface that is tagged
docker network create -d ipvlan \
    --subnet=192.168.40.0/24 \
    --gateway=192.168.40.1 \
    -o parent=eth0.40 ipvlan40

# in two separate terminals, start a Docker container and the containers can now ping one another.
docker run --net=ipvlan40 -it --name ivlan_test5 --rm alpine /bin/sh
docker run --net=ipvlan40 -it --name ivlan_test6 --rm alpine /bin/sh
```
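
The `interface.vlan_id` naming convention mentioned above can be illustrated with a short shell sketch. It is not the driver's actual parsing logic, which is not shown on this page; the function name `split_parent` is hypothetical, and the 1 to 4094 range is the usable 802.1q VLAN ID range. Names that do not follow the convention are reported as links that must already exist.

```shell
# split_parent NAME: split a `-o parent=` value of the form interface.vlan_id
# into its link name and VLAN ID. Hypothetical helper for illustration only.
split_parent() {
    _parent=$1
    _link=${_parent%.*}      # text before the last dot
    _vlan=${_parent##*.}     # text after the last dot
    case $_parent in
        *.*) ;;
        *) echo "$_parent: no VLAN suffix, link must already exist"; return 0 ;;
    esac
    case $_vlan in
        ''|*[!0-9]*)
            echo "$_parent: suffix is not numeric, link must already exist"
            return 0 ;;
    esac
    if [ -n "$_link" ] && [ "$_vlan" -ge 1 ] && [ "$_vlan" -le 4094 ]; then
        echo "link=$_link vlan=$_vlan"
    else
        echo "$_parent: not a valid interface.vlan_id name"
        return 1
    fi
}

split_parent eth0.40   # prints: link=eth0 vlan=40
split_parent foo       # prints: foo: no VLAN suffix, link must already exist
```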

**Example:** Vlan sub-interface manually created with any name:

```
# create a new sub-interface tied to dot1q vlan 40
ip link add link eth0 name foo type vlan id 40

# enable the new sub-interface
ip link set foo up

# now add networks and hosts as you would normally by attaching to the master (sub)interface that is tagged
docker network create -d ipvlan \
    --subnet=192.168.40.0/24 --gateway=192.168.40.1 \
    -o parent=foo ipvlan40

# in two separate terminals, start a Docker container and the containers can now ping one another.
docker run --net=ipvlan40 -it --name ivlan_test5 --rm alpine /bin/sh
docker run --net=ipvlan40 -it --name ivlan_test6 --rm alpine /bin/sh
```

Manually created links can be cleaned up with:

```
ip link del foo
```

As with all of the Libnetwork drivers, these can be mixed and matched, even to the point of running third-party ecosystem drivers in parallel, for maximum flexibility for the Docker user.