I've been having a lot of fun recently with Docker containers, from packaging and running my own Python scripts, to building the Pocket Internet proof of concept at the recent RIPE Hackathon and, finally, designing a solution for integrating a multi-datacentre, multi-environment Docker Swarm with a Cisco ACI fabric and the rest of the network for one of my customers. Below you will find my notes accumulated from going through official documentation, blog posts and experimentation in the lab.
Docker networks
When installed, docker creates 3 networks by default (and you can only change one of them):
bridge
- this is the `docker0` network, where containers are attached by default
- it masquerades to the outside world (PAT)
- it allows containers to talk to each other on the same host
- there's no built-in service discovery or name resolution
- can be disabled/reconfigured

none
- if you attach a container to this, there's no network access and it has only the loopback interface

host
- allows the container to be attached to the host's network stack directly
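A quick way to inspect these defaults on a fresh install:

```
# list the networks created at install time (none uses the null driver)
docker network ls

# show the docker0 subnet, options and any attached containers
docker network inspect bridge
```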
User-defined networks:
bridge
- functionality as above, but can also be set up as an isolated internal network (no NAT, no gateway)
- you can expose (proxy) ports to make certain containers visible from the outside
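A minimal sketch of both variants; the network and container names (`appnet`, `isolated-net`, `web`) and the images are arbitrary examples:

```
# a standard user-defined bridge (NAT out, embedded DNS name resolution)
docker network create --driver bridge appnet

# an isolated internal network: no NAT, no gateway to the outside world
docker network create --driver bridge --internal isolated-net

# containers on the same user-defined network resolve each other by name
docker run -d --name web --network appnet nginx
docker run --rm --network appnet alpine ping -c 1 web
```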
docker_gwbridge
- used in swarms (see below) or created on demand if there's no other bridge network that provides external connectivity to containers
overlay
- used in swarms (see below)

macvlan
- container interfaces are attached directly to a docker host (sub)interface
- supports dot1q tagging (automatically created subinterfaces or manually attached)
- each container gets its own unique MAC and IP address
- no port mapping needed to expose services as containers are directly on the network (both good and bad!)
- you may have to enable promiscuous mode on the parent interface
- you will have to disable security features in the vSwitch if using VMs (multiple MACs behind the same adapter)
- suffers from limitations in swarm mode:
- service discovery and service name load balancing are docker-host local only (you will need an external service to manage it, docker itself won't be able to)
- IP subnets need to be split into ranges allocated to each docker host in the swarm to prevent overlapping
- can be made to work for network connectivity in a swarm as detailed here
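A sketch of a macvlan network on a dot1q subinterface; the subnet (a documentation range), VLAN id and parent interface are placeholders, and `--ip-range` illustrates carving out a per-host allocation as mentioned above:

```
# parent=eth0.100 makes docker create/use a subinterface tagged VLAN 100
docker network create -d macvlan \
  --subnet 192.0.2.0/24 --gateway 192.0.2.1 \
  --ip-range 192.0.2.64/26 \
  -o parent=eth0.100 \
  macvlan100

# each container gets its own MAC and an IP straight from the VLAN subnet
docker run --rm --network macvlan100 alpine ip addr show eth0
```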
ipvlan
- L2 mode
  - functions similarly to macvlan, but allocates the same MAC (from the parent host interface) to all containers
  - sharing the same MAC address will create problems with DHCP and SLAAC
- L3 mode
  - routes the packets between the parent interface and the subinterfaces (different subnets!)
  - requires static routes (or a routing protocol in the container), but it does not decrement TTL and it does not support multicast
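The L3 mode equivalent, assuming a Docker build with the ipvlan driver enabled (it was experimental for a while); subnet and parent are again placeholders. Note the absence of a gateway: the docker host does the routing, and the upstream network needs static routes pointing back at it:

```
docker network create -d ipvlan \
  --subnet 192.0.2.0/24 \
  -o parent=eth0 -o ipvlan_mode=l3 \
  ipvlan-l3
```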
custom network plugin
- this can be a 3rd-party module that uses the Docker API
Other notes:
- Docker provides an embedded DNS server that resolves container names to IPs on the same network
- You can publish a port, which instructs the daemon to map a container port to a free (high-numbered) port on the host machine for external connections (see the sketch after this list)
- Docker uses the system's iptables to perform operations on the host (routing, port forwarding, NAT etc.)
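A short sketch of port publishing and where it lands in iptables (image and container names are just examples):

```
# -P maps every exposed container port to an ephemeral high port on the host
docker run -d -P --name web1 nginx
docker port web1                  # show which host ports were picked

# -p pins the mapping explicitly: host 8080 -> container 80
docker run -d -p 8080:80 --name web2 nginx

# the mappings materialise as DNAT rules in the nat table's DOCKER chain
sudo iptables -t nat -L DOCKER -n
```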
Docker Swarm
- A swarm groups together a number of docker hosts and facilitates starting containers across the pool while preserving local-like network connectivity within services
- 2 types of traffic: control/management plane and application data plane
- 3 networks:

overlay (driver)
- facilitates communication between docker hosts (daemons) within the swarm
- you can attach services to one or more overlays
- the Linux namespace created has static ARP entries for each running container and the interface acts as a proxy for ARP queries
- each overlay has a VXLAN id allocated to it (used for encapsulation) - see how that looks in the Docker overlays on Cisco ACI post
ingress (special overlay)
- facilitates load-balancing between a service's nodes
- requires published ports
- when traffic arrives on the published port, an IP from the list of available nodes is selected and traffic is sent to it via the ingress overlay (!)
docker_gwbridge (bridge driver)
- connects the overlays to an individual docker host's physical network
- can't be used for inter-container communication
- masquerades (NAT) to the outside world
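A minimal sketch tying the three together (names, image and replica count are arbitrary): a user overlay carries the service traffic, the published port enters via the ingress overlay on any node, and docker_gwbridge handles the NAT out of each host:

```
# a user-defined overlay; --attachable also lets plain containers join it
docker network create -d overlay --attachable my-overlay

# the published port is answered on every swarm node and load-balanced
# across the replicas via the ingress overlay
docker service create --name web --replicas 3 \
  --network my-overlay --publish 8080:80 nginx
```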
- Communication between docker hosts (daemons):
  - 7946 TCP/UDP for container network discovery
  - 4789 UDP for the container overlay network
  - swarm nodes exchange control plane data encrypted with AES-GCM
  - you can enable per-overlay encryption with a flag -> IPSEC tunnels between all nodes where tasks are scheduled for services that are attached to that overlay network (so partial mesh, on-demand tunnels)
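The flag in question, as a one-liner:

```
# IPSEC-encrypt application traffic between nodes running tasks on this overlay
docker network create -d overlay --opt encrypted secure-overlay
```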
- MTU considerations:
  - minimum extra VXLAN encapsulation (50 bytes)
  - optional IPSEC encapsulation
  - overlay interfaces automatically adjust by lowering the MTU (1450 for non-encrypted)
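You can verify the adjusted MTU from inside a container on the attachable overlay created in the earlier sketch (alpine is just a convenient image that ships `ip`):

```
docker run --rm --network my-overlay alpine ip link show eth0
# expect "mtu 1450" on a non-encrypted overlay (1500 minus the VXLAN overhead)
```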
References
- Macvlan network driver
- Macvlan vs Ipvlan
- IPVLAN Driver HOWTO (Linux Kernel)
- Docker Swarm Networking Official Documentation
- Docker Overlay Security Model
- Awesome Docker - A curated list of Docker resources and projects
- Deep-dive into Docker Overlay Networks: Part 1 + Part 2 + Part 3
- Deeper Dive in Docker Overlay Networks - excellent presentation/demo given at DockerCon EU 17 by Laurent Bernaille about container networking with VXLAN/BGP/EVPN
- Vincent Bernat's excellent article on VXLAN & Linux
More, more, more
Cumulus wrote a very nice article on 5 ways to design your container network, which lists options for connecting Docker Swarms to a fully routed DC fabric (with links to detailed, public(!), validated design documents). If you are able to push routing all the way to the docker host, these designs look very straightforward, and I might have to dig a bit deeper into the design docs to understand what security options are available when you can't just let the services roam freely (or push policy all the way into the container).
A project doing something similar is bagpipe-bgp, which has now become part of OpenStack. A blog post about experimenting with it lives here.
Kubernetes
Kubernetes is all the rage now as the more powerful and more scalable orchestrator option compared to Docker Swarm (the tradeoffs being a steep learning curve and more complexity to run), so it's worth a mention, but it is mostly out of scope for this article because:
- Kubernetes does not include a network plugin (apart from host-only networking), but imposes the following requirements:
- all containers can communicate with all other containers without NAT
- all nodes can communicate with all containers (and vice-versa) without NAT
- the IP that a container sees itself as is the same IP that others see it as
- Many 3rd party implementations of networking options are listed in the Kubernetes documentation
- Reading material: