Archive for docker

DockerCon 2016 Keynote

Andrea and I presented Docker 1.12 orchestration on stage at DockerCon this year — in front of 4,000 people! I don’t even think I’ve ever met 4,000 people. It was awesome!


Docker Swarm 1.0 with Multi-host Networking: Manual Setup

Jeff Nickoloff had a great Medium post recently about how to set up a Swarm 1.0 cluster using the multi-host networking features added in Docker 1.9. He uses Docker Machine’s built-in Swarm support to create a demonstration cluster in just a few minutes.

In this post, I’ll show how to recreate the same cluster manually — that is, without docker-machine provisioning. This is for advanced users who want to understand what Machine is doing under the hood.

First, let’s take a look at the layout of the cluster Jeff created in his post. There are four machines:

[Figure: Topology of our Swarm cluster.]

To approximate the provisioning that Machine does, we’ll use this Vagrantfile to bring up four Ubuntu boxes:

The Vagrantfile assigns each box a private IP on eth1; since the exact addresses depend on your copy of the file, I’ll write them as placeholders below and in every command that follows.

Name        IP             Description
kv2         <kv2-ip>       Consul (for both cluster discovery and networking)
c0-master   <master-ip>    Swarm master
c0-n1       <n1-ip>        Swarm node 1
c0-n2       <n2-ip>        Swarm node 2

In the directory where you saved the Vagrantfile, run vagrant up. This will take 5-10 minutes, but at the end of the process you should have four VMs running Docker 1.9 or higher. Note how our Vagrantfile starts each instance of Docker Engine (the docker daemon) with --cluster-store=consul://<kv2-ip>:8500 and --cluster-advertise=eth1:2375. Those flags are the same ones Jeff passes to docker-machine using --engine-opt.
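For reference, here’s roughly what that looks like on each box. This is a sketch of the engine configuration the Vagrantfile sets up, not its literal contents:

# /etc/default/docker on each box (sketch)
DOCKER_OPTS="-H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --cluster-store=consul://<kv2-ip>:8500 --cluster-advertise=eth1:2375"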

Because Docker’s multi-host networking requires a 3.16 or newer kernel, we need to do one manual step on each machine to upgrade its kernel. Run these commands from your host shell prompt:

$ vagrant ssh -c "sudo apt-get install -y linux-image-generic-lts-utopic && sudo reboot" kv2
$ vagrant ssh -c "sudo apt-get install -y linux-image-generic-lts-utopic && sudo reboot" c0-master
$ vagrant ssh -c "sudo apt-get install -y linux-image-generic-lts-utopic && sudo reboot" c0-n1
$ vagrant ssh -c "sudo apt-get install -y linux-image-generic-lts-utopic && sudo reboot" c0-n2

(Jeff doesn’t have to do this in his tutorial because Machine provisions using an ISO that contains a recent kernel.)
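Once the boxes come back up, you can confirm that each one actually got the new kernel, e.g.:

$ vagrant ssh -c "uname -r" kv2

The reported version should be 3.16 or newer.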

We’re now ready to set up a Consul key/value store just as Jeff did:

$ docker -H=tcp://<kv2-ip>:2375 run -d -p 8500:8500 -h consul progrium/consul -server -bootstrap
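Consul’s HTTP API answers on the published port, so you can sanity-check it before going further (a quick verification, not a step from Jeff’s walkthrough):

$ curl http://<kv2-ip>:8500/v1/status/leader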

Here’s how you manually start the swarm manager on the c0-master machine:

$ docker -H=tcp://<master-ip>:2375 run -d -p 3375:2375 swarm manage consul://<kv2-ip>:8500

Next we start two swarm agent containers on nodes c0-n1 and c0-n2:

$ docker -H=tcp://<n1-ip>:2375 run -d swarm join --advertise=<n1-ip>:2375 consul://<kv2-ip>:8500
$ docker -H=tcp://<n2-ip>:2375 run -d swarm join --advertise=<n2-ip>:2375 consul://<kv2-ip>:8500

Let’s test the cluster:

$ docker -H=tcp://<master-ip>:3375 info
$ docker -H=tcp://<master-ip>:2375 run swarm list consul://<kv2-ip>:8500
$ docker -H=tcp://<master-ip>:3375 run hello-world

Create the overlay network just as Jeff did:

$ docker -H=tcp://<master-ip>:3375 network create -d overlay myStack1
$ docker -H=tcp://<master-ip>:3375 network ls
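To confirm that myStack1 really is an overlay network rather than a per-host bridge, inspect it and look at the Driver field:

$ docker -H=tcp://<master-ip>:3375 network inspect myStack1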

Create the same two (nginx and alpine) containers that Jeff did:

$ docker -H=tcp://<master-ip>:3375 run -d --name web --net myStack1 nginx
$ docker -H=tcp://<master-ip>:3375 run -itd --name shell1 --net myStack1 alpine /bin/sh

And verify they can talk to each other just as Jeff did:

$ docker -H=tcp://<master-ip>:3375 attach shell1
$ ping web
$ apk update && apk add curl
$ curl http://web/

You should find that shell1 is able to ping the nginx container, and vice-versa, just as was the case in Jeff’s tutorial.


What is the Firmament scheduler?

Some in the Kubernetes community are considering adopting a new scheduler based on Malte Schwarzkopf’s Firmament cluster scheduler. I just finished reading Ch. 5 of Malte’s thesis. Here’s a high-level summary of what Firmament is about.

Today’s container orchestration systems like Kubernetes, Mesos, Diego and Docker Swarm rely heavily on straightforward heuristics for scheduling. This works well if you want to optimize along a single dimension, like efficient bin packing of workloads to servers. But these heuristics are not designed to simultaneously handle complex tradeoffs between competing priorities like data locality, scheduling delay, soft and hard affinity constraints, inter-task dependency constraints, etc. Taking so many factors into account at once is difficult.

The Firmament scheduler tries to optimize across many tradeoffs, while still making fast scheduling decisions. How? Like Microsoft’s Quincy scheduler, it considers things from a new angle: cost. Suppose we assign a cost to every scheduling tradeoff. The problem of efficient scheduling then becomes a global cost minimization problem, which is much more tractable than trying to design a heuristic that balances many different factors.

Firmament’s technical implementation models the scheduling problem as a flow graph. Workloads are the flow sources, and they flow into the cluster, whose topology of machines and availability zones is modeled by vertices in the graph. Ultimately, all workloads arrive at a global sink, having either flowed through a machine on which they were scheduled or having remained unscheduled. Which path each workload takes is decided by cost.
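To make the “decided by cost” part concrete, here it is in standard min-cost flow terms (my notation, not the thesis’s). Each task vertex supplies one unit of flow, the global sink demands all of it, and the scheduler finds the flow $f$ over the edge set $E$ that solves

$$\min_f \sum_{(u,v) \in E} c_{uv} \, f_{uv} \qquad \text{subject to} \qquad 0 \le f_{uv} \le \kappa_{uv}$$

plus flow conservation at every intermediate vertex, where $c_{uv}$ and $\kappa_{uv}$ are the cost and capacity of edge $(u,v)$. A task lands on whichever machine its unit of flow traverses on the way to the sink; if routing it through the “unscheduled” vertex is cheaper, it waits.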

Here’s a simplified diagram I created, based on Firmament’s diagram (itself a simpler version of Quincy’s Fig. 4):

[Figure: Simplified example of Firmament’s flow graph structure. By assigning costs to each edge, global cost minimization can be performed. For example, each of the three workloads may be scheduled on the cluster or remain unscheduled, depending on the relative costs of their immediate execution vs. delay.]

But how are these costs determined? That’s the coolest part of Firmament: it supports pluggable cost models through a cost model API. Firmament provides several performance-based cost models as well as an interesting one that seeks to minimize data center electricity consumption. Of course, users can supply their own cost models through the API.

For more information on Firmament, Ch. 5 of Malte’s thesis (mentioned above) is the place to start.


Docker runc

I had some time recently to start playing around with Docker’s new runc / OpenContainers work. This is basically the old libcontainer, but it now lives under an industry consortium governed by the Linux Foundation. So, Docker and CoreOS are now friends, or at least frenemies, which is very exciting.

The README over on runc doesn’t fully explain how to get runc to work, i.e., how to run a simple example container. They provide a nice example container.json file, but it comes without a rootfs, which is confusing if you’re just getting started. I posted a GitHub issue comment about how to make their container.json work.

Here are the full steps to get the runc sample working:

1.  Build the runc binary if you haven’t already:

cd $GOPATH/src/github.com/opencontainers
git clone https://github.com/opencontainers/runc.git
cd runc
make

2.  Grab their container.json from this section of the runc readme:  opencontainers/runc#ocf-container-json-format

3.  Build a rootfs. The easiest way to do this is to docker export the filesystem of an existing container:

docker run -it ubuntu bash

Now exit immediately (Ctrl+D).

docker ps -a # to find the container ID of the ubuntu container you just exited
docker export [container ID] > docker-ubuntu.tar

Then untar docker-ubuntu.tar into a directory called rootfs, which should be in the same parent directory as your container.json. You now have a rootfs that will work with the container.json linked above. Type sudo runc and you’ll be at an sh prompt, inside your container.
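Concretely, assuming docker-ubuntu.tar sits next to your container.json, those last steps look like this:

mkdir rootfs
tar -C rootfs -xf docker-ubuntu.tar
sudo runc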


Techniques for Exploring Docker Containers

The preferred method of poking around inside a running Docker container is nsenter. Docker has a nice tutorial.

But what if your container doesn’t have any executable shell like /bin/sh? You can’t enter it with nsenter or docker exec. But here are a few tricks you can use to learn about it.

docker inspect -f '{{.Config.Env}}' <container> – will show you the environment variables in the container

docker export <container> | tar -tvf - – to list the filesystem inside the container (thanks to cnf on #docker-dev for teaching me this one)

docker export <container> | tar -xvf - – you can do this from a temp directory to extract the entire container filesystem and examine it in more detail
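For example, here’s how I’d check whether a container ships a shell at all (mycontainer is a placeholder for your container’s name or ID):

docker export mycontainer | tar -tf - | grep -E 'bin/(sh|bash|ash)$'

If that prints nothing, there’s no shell for nsenter or docker exec to run, and the export tricks above are your best option.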

I’ll add more tricks here in the future.


Tutorial: Setting up a Docker Swarm on Your Laptop Using VirtualBox

This tutorial shows you one method you can use to test out Docker Swarm on a single physical machine, like your laptop. We’ll create 3 VMs: two Swarm worker nodes and one Swarm manager.

Setting Up VMs with NAT Network

First, create three VirtualBox VMs, each with 1 GB RAM and 1 CPU. Set up each one with Bridged Networking, meaning that your Linksys/Airport/whatever router will assign each of them an IP on the same subnet.

In this example, I’ll use three Ubuntu machines with the hostnames below, plus static DHCP in the router to force them to always get the same IPs. I’ll write those addresses as <manager-ip>, <node1-ip>, and <node2-ip>; substitute whatever your router assigns:

DockerManager = <manager-ip> (runs the Swarm manager, doesn’t run any containers)
DockerNode1 = <node1-ip> (first worker node, runs containers)
DockerNode2 = <node2-ip> (second worker node, runs containers)

Alternate method: if you don’t want your VMs to be exposed directly on your LAN, you can use “internal networking” in VirtualBox. This will put all three VMs on the same virtual LAN within your laptop. Turn it on by doing this on the host:

$ VBoxManage natnetwork add -t nat-int-network -n "192.168.15.0/24" -e
$ VBoxManage natnetwork start -t nat-int-network

Then change the networking in each VM to use “NAT Network” and select the new “nat-int-network” entry.

Now reboot the VMs. Each should come up with a unique IP of the form 192.168.15.x.

To verify connectivity, ping one machine from the other, or test like this:

nc -l 12345 (on machine A)
echo "bananas" | nc <machine-A-ip> 12345 (on machine B)

Basic Installs On Each Machine

You’ll need to install Docker on each machine. I won’t cover that here. Afterward, do this:

docker pull swarm

to retrieve the Swarm image, which is the same image for both the Swarm nodes and the manager.

The image, by the way, just contains a single Go binary called swarm. If you re-build that binary yourself, you can run it directly instead of packaging it into a container. I won’t cover that more advanced scenario here, though.
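A quick way to confirm the pull worked, and to see which version of Swarm you got (the image’s entrypoint is the swarm binary, so arguments are passed straight to it):

docker run --rm swarm --version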

Running Swarm

On any machine, do this one-time operation:

docker run --rm swarm create
# gives back some token like 372cd183a188848c3d5ef0e6f4d7a963

On DockerNode1, stop the system Docker daemon and start one bound to a TCP port as well as the usual Unix socket:

sudo stop docker
sudo docker -d -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock
(leave the daemon running and open a new terminal tab)
docker -H tcp://<node1-ip>:2375 run -d --name node1 swarm join --addr=<node1-ip>:2375 token://372cd183a188848c3d5ef0e6f4d7a963
docker -H tcp://<node1-ip>:2375 ps
# now you see the agent running on DockerNode1

Note: if you do `export DOCKER_HOST=tcp://<node1-ip>:2375` you can omit the “-H tcp://…” part.
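For example:

export DOCKER_HOST=tcp://<node1-ip>:2375
docker ps   # equivalent to: docker -H tcp://<node1-ip>:2375 ps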

On DockerNode2, follow the same procedure as on DockerNode1 above, except the join command will look like this:

docker -H tcp://<node2-ip>:2375 run -d --name node2 swarm join --addr=<node2-ip>:2375 token://372cd183a188848c3d5ef0e6f4d7a963

Now you can list the nodes from any machine. For instance, from DockerManager you could do:

docker run --rm swarm list token://372cd183a188848c3d5ef0e6f4d7a963

If you have built your own swarm binary, you can also use it to list the nodes without a container:

./swarm list token://372cd183a188848c3d5ef0e6f4d7a963

Now start the swarm manager. On DockerManager:

docker run -d -p 3375:2375 swarm manage token://372cd183a188848c3d5ef0e6f4d7a963

You can now run commands like this on ANY machine:

docker -H tcp://<manager-ip>:3375 info
docker -H tcp://<manager-ip>:3375 run -it --rm ubuntu bash
docker -H tcp://<manager-ip>:3375 run -it --rm ubuntu bash
docker -H tcp://<manager-ip>:3375 run -it --rm ubuntu bash
docker -H tcp://<manager-ip>:3375 run -it --rm ubuntu bash
[...repeat as many times as you like...]

Now go back to DockerNode1 and try:

docker -H tcp://<node1-ip>:2375 ps

And this on DockerNode2:

docker -H tcp://<node2-ip>:2375 ps

You can see different bash processes being allocated to the two machines.
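You can also ask the manager for the cluster-wide view; its ps output lists containers from both nodes (note the manager’s published port, 3375):

docker -H tcp://<manager-ip>:3375 ps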
