Docker Swarm 1.0 with Multi-host Networking: Manual Setup

Jeff Nickoloff had a great Medium post recently about how to set up a Swarm 1.0 cluster using the multi-host networking features added in Docker 1.9. He uses Docker Machine’s built-in Swarm support to create a demonstration cluster in just a few minutes.

In this post, I’ll show how to recreate the same cluster manually — that is, without docker-machine provisioning. This is for advanced users who want to understand what Machine is doing under the hood.

First, let’s take a look at the layout of the cluster Jeff created in his post. There are four machines:

[Figure: Topology of our Swarm cluster.]

To approximate the provisioning that Machine is doing under the hood, we’ll use this Vagrantfile to provision four Ubuntu boxes:

Name        IP              Description
kv2         192.168.33.10   Consul (for both cluster discovery and networking)
c0-master   192.168.33.11   Swarm master
c0-n1       192.168.33.12   Swarm node 1
c0-n2       192.168.33.13   Swarm node 2

In the directory where you saved the Vagrantfile, run vagrant up. This will take 5-10 minutes, but at the end of the process you should have four VMs, each running Docker 1.9 or higher. Note how our Vagrantfile starts each instance of Docker Engine (the docker daemon) with --cluster-store=consul://192.168.33.10:8500 and --cluster-advertise=eth1:2375. Those flags are the same ones Jeff passes to docker-machine using --engine-opt.

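Because Docker’s multi-host (overlay) networking requires Linux kernel 3.16 or newer, we need to do one manual step on each machine to upgrade its kernel. Each command below installs the newer kernel and then reboots the VM. Run these commands from your host shell prompt: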

$ vagrant ssh -c "sudo apt-get install -y linux-image-generic-lts-utopic && sudo reboot" kv2
$ vagrant ssh -c "sudo apt-get install -y linux-image-generic-lts-utopic && sudo reboot" c0-master
$ vagrant ssh -c "sudo apt-get install -y linux-image-generic-lts-utopic && sudo reboot" c0-n1
$ vagrant ssh -c "sudo apt-get install -y linux-image-generic-lts-utopic && sudo reboot" c0-n2

(Jeff doesn’t have to do this in his tutorial because Machine provisions using an ISO that contains a recent kernel.)
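
Once the VMs have rebooted, it can be worth confirming that each one really came back on a new enough kernel. The sketch below is one way to do that; kernel_at_least is a hypothetical helper name (not part of Docker or Vagrant), and it assumes GNU sort with the -V (version sort) option is available on your host.

```shell
# Succeed if the kernel release in $1 is at least version $2 (default 3.16).
# kernel_at_least is a hypothetical helper; it assumes GNU sort -V is available.
kernel_at_least() {
    local have="${1%%-*}"   # drop the distro suffix: 3.19.0-25-generic -> 3.19.0
    local want="${2:-3.16}"
    # The required version must sort first (or tie) in a version-aware sort.
    [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n1)" = "$want" ]
}

# Usage from the host (needs the running VMs, so it is left commented out):
# for m in kv2 c0-master c0-n1 c0-n2; do
#     v=$(vagrant ssh -c "uname -r" "$m" 2>/dev/null | tr -d '\r')
#     if kernel_at_least "$v"; then echo "$m: $v ok"; else echo "$m: $v too old"; fi
# done
```

If any machine reports a kernel that is too old, rerun the upgrade command for that machine and wait for the reboot to finish.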

We’re now ready to set up a Consul key/value store just as Jeff did:

$ docker -H=tcp://192.168.33.10:2375 run -d -p 8500:8500 -h consul progrium/consul -server -bootstrap

Here’s how you manually start the swarm manager on the c0-master machine:

$ docker -H=tcp://192.168.33.11:2375 run -d -p 3375:2375 swarm manage consul://192.168.33.10:8500/

Next we start two swarm agent containers on nodes c0-n1 and c0-n2:

$ docker -H=tcp://192.168.33.12:2375 run -d swarm join --advertise=192.168.33.12:2375 consul://192.168.33.10:8500/
$ docker -H=tcp://192.168.33.13:2375 run -d swarm join --advertise=192.168.33.13:2375 consul://192.168.33.10:8500/
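
If you have more than two nodes, typing one join command per node gets tedious. As a rough sketch, the commands can be generated from a list of node IPs; join_cmd and the CONSUL variable are names made up for this example, and the output is only printed so you can review it before piping it to sh.

```shell
# Discovery URL used throughout this post.
CONSUL="consul://192.168.33.10:8500/"

# Print the swarm join command for one node (hypothetical helper).
# $1 = node IP; the engine is assumed to listen on port 2375, as set up above.
join_cmd() {
    printf 'docker -H=tcp://%s:2375 run -d swarm join --advertise=%s:2375 %s\n' \
        "$1" "$1" "$CONSUL"
}

# Print one command per node; append "| sh" to the loop to actually run them.
for ip in 192.168.33.12 192.168.33.13; do
    join_cmd "$ip"
done
```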

Let’s test the cluster:

$ docker -H=tcp://192.168.33.11:3375 info
$ docker -H=tcp://192.168.33.11:3375 run swarm list consul://192.168.33.10:8500/
$ docker -H=tcp://192.168.33.11:3375 run hello-world
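
Two small conveniences for this step, offered as a sketch rather than anything Machine does for you. The docker client reads the DOCKER_HOST environment variable, so exporting it saves repeating -H on every command. The node_count helper is a hypothetical name, and it assumes the Swarm manager's docker info output contains a line of the form "Nodes: <n>"; check your own output before relying on it.

```shell
# Point the docker client at the Swarm manager for this shell session,
# so the -H flag can be dropped from later commands.
export DOCKER_HOST=tcp://192.168.33.11:3375

# Read `docker info` output on stdin and print the number of nodes.
# Assumption: the manager prints a line of the form "Nodes: <n>".
node_count() {
    awk -F': *' '$1 == "Nodes" { print $2; exit }'
}

# Usage (needs the running cluster, so it is left commented out):
# docker info | node_count
```

With both agents joined, the count should match the number of swarm join containers you started.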

Create the overlay network just as Jeff did:

$ docker -H=tcp://192.168.33.11:3375 network create -d overlay myStack1
$ docker -H=tcp://192.168.33.11:3375 network ls

Create the same two (nginx and alpine) containers that Jeff did:

$ docker -H=tcp://192.168.33.11:3375 run -d --name web --net myStack1 nginx
$ docker -H=tcp://192.168.33.11:3375 run -itd --name shell1 --net myStack1 alpine /bin/sh

And verify they can talk to each other just as Jeff did:

$ docker -H=tcp://192.168.33.11:3375 attach shell1
$ ping web
$ apk update && apk add curl
$ curl http://web/

You should find that shell1 is able to ping the nginx container, and vice-versa, just as was the case in Jeff’s tutorial.
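
The same check can also be run without attaching to the container, using docker exec. This is a minimal sketch: check_ping is a hypothetical helper name, and the DOCKER override exists only so the command can be printed instead of executed.

```shell
# Run a ping from inside one container without attaching to it.
# $1 = container to run from, $2 = hostname to reach (names from this post).
# Set DOCKER=echo to print the arguments instead of running docker.
check_ping() {
    ${DOCKER:-docker} -H=tcp://192.168.33.11:3375 exec "$1" ping -c 3 "$2"
}

# Usage (needs the running cluster, so it is left commented out):
# check_ping shell1 web
```

Because ping exits non-zero when no replies arrive, the helper's exit status can be used directly in a script.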

18 Comments »

  1. Alok Agarwal said

    Thanks for these details. However, the nodes are not joining the cluster. I can see the following in the Swarm manager logs; I would really appreciate any help you can provide.

    “getsockopt: no route to host. Are you trying to connect to a TLS-enabled daemon without TLS?”

    • mike said

      ssh to the master and try pinging each node. “no route to host” would imply there is something wrong with the networking, e.g., nodes are not on the same subnet or something like that. What infrastructure are you on?

  2. Kristof Jozsa said

    Thanks for this article, I love that concise style. Did you consider writing a docker book yourself? :)

    • mike said

      Thank you for the kind words, Kristof. I actually work on the Swarm team at Docker. Do you use Swarm?

  3. Alok Agarwal said

    SSH is working fine between the nodes. I am setting this up on 4 different VMs. It looks more like a certificate issue: “Are you trying to connect to a TLS-enabled daemon without TLS?”

    • mike said

      Yes, I know, but I think we append the words “Are you trying to connect to a TLS…” to almost every connection failure error message because it’s such a common cause of problems. In this case, your root error message is “getsockopt: no route to host.” If you can ssh to the nodes from the master, then my next suggestion is to check that port 2375 is open and that the docker daemon is actually listening on it.

      To check that 2375 is open, run `nc <node-ip> 2375` on the master, then press return. You should get back an HTTP 400 error.

      You can also do `sudo netstat -lnp` on the node itself and make sure you see a line like this:
      tcp6 0 0 :::2375 :::* LISTEN xxx/docker

      By the way, are you using a DOCKER_OPTS line similar to the one in the Vagrantfile in my tutorial? If so, TLS should not be enabled. I am referring to these commands, which are run on every box deployed by that Vagrantfile:

      sudo sh -c 'echo DOCKER_OPTS=\"--cluster-store=consul://192.168.33.10:8500 --cluster-advertise=eth1:2375 -H=tcp://0.0.0.0:2375 -H=unix:///var/run/docker.sock\" >> /etc/default/docker'
      sudo service docker restart

  4. Alok Agarwal said

    Thanks, Mike, for pointing me in the right direction. I was able to resolve the problem.

    The cluster is up and the overlay network is created.

    I am unable to spin up a new container on the Swarm cluster with the overlay network, and I understand I need kernel 3.16; I am currently at 3.1. I will upgrade tomorrow.

    Once again, thanks a lot!

    One more question: do you know of any place or link where I can look at live migration of containers within Swarm using Docker tools?

  5. Matt said

    This is great. I was trying to back into this information by running a swarm using `docker-machine create --swarm` and then inspecting the swarm containers that resulted. My question is slightly different: do you know where I can look to understand how one would do this with `--tlsverify` and related options? It’s not clear to me what the requirements are for correctly setting up keys for swarm members.

    • mike said

      Hi Matt: Yes, I was planning to make TLS be the next blog post in this series, but I haven’t had time to write it yet. I know that the Docs team at Docker is also planning to improve the TLS documentation for the January/February time frame.

      For now the best suggestion I have is to follow Jeff Nickoloff’s blog post [1], but as you go through it actually echo out the values of $(docker-machine config kv) and $(docker-machine env --swarm c0-master). Working through Jeff’s blog post only takes about 5 minutes and by looking at what command line args docker-machine is injecting behind the scenes, you will be able to see all of the TLS arguments.

      Write back if you have trouble (or success).

      [1] https://medium.com/on-docker/docker-overlay-networks-that-was-easy-8f24baebb698#.xihawcrcs

  6. Gon said

    Hi Mike,

    This post is amazing. It helps me to understand how docker swarm works. But I have a problem.
    When I create a container named shell1 and send a ping from shell1 to web, I don’t receive any response.

    • Odravison said

      Hello Gon,

      I’m a bit of a newbie with Docker and Swarm, but I think I can help you.

      You have to send the ping through the Docker Swarm port (3375 in the topology layout), and Swarm will forward the request to one of the Docker agents.

      Thanks, goodbye.

  7. Derek Mulhern said

    Hi, do you have a tutorial for this Docker Swarm setup on MS Azure? I’m trying to follow the instructions and implement them on Azure, but I’m running into many issues. Thank you in advance!

    Del

  8. PMac said

    Hi there, thanks for the great writeup. I did exactly the same, but I am not able to get a result from the last step, pinging a container on a different node. Pinging a container on the same node works fine. I have all the ports open between both nodes. I am not sure if I am missing anything.

  9. Odravison said

    Hello,

    AWESOME … really, really AWESOME TUTORIAL.

    That was what I needed, but I have one question: I’m not using Vagrant; I have 4 machines running, and the Vagrant part was very automatic to me, specifically this command: `sudo sh -c 'echo DOCKER_OPTS=\\"--cluster-store=consul://192.168.33.10:8500 --cluster-advertise=eth1:2375 -H=tcp://0.0.0.0:2375 -H=unix:///var/run/docker.sock\\" >> /etc/default/docker'`

    I have an idea of what this does, but your explanation would be better for me.

    Sorry about my English and my knowledge … I started with Linux and Docker about 2 months ago…

    Att.

  10. JCRUBIO said

    Wow!!! This rocks! I tried it and it works like a charm!!!
    I modified this in my test environment. I didn’t use Vagrant. I used three KVM virtual machines.
    The first one is my Consul and Swarm master. The other two are Swarm nodes.
    I had to change all the IPs.
    So I had to change /etc/default/docker.

    But it is working!!!!
    God bless you!!

    Thanks from Seville, Spain!!!!!
