Docker Cluster HA Setup
1. Set up a VPN between your 3, 5, 7, or more servers. This can be done with Tinc VPN, for example, but there are many others to choose from.
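For reference, a minimal Tinc 1.0 sketch for the first node could look like the following. The net name "swarm", the node names, the example hostname, and the 10.0.1.1/24 addressing are assumptions; adjust them to your own layout, generate keys with tincd -n swarm -K, and exchange the host files between all nodes.
# /etc/tinc/swarm/tinc.conf
Name = node1
ConnectTo = node2
ConnectTo = node3
Interface = tun0
# /etc/tinc/swarm/hosts/node1 (public key gets appended by tincd -K)
Address = node1.example.org
Subnet = 10.0.1.1/32
# /etc/tinc/swarm/tinc-up
#!/bin/sh
ip link set $INTERFACE up
ip addr add 10.0.1.1/24 dev $INTERFACE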
2. Install GlusterFS
apt-get install gpg -y
curl https://download.gluster.org/pub/gluster/glusterfs/11/rsa.pub | gpg --dearmor > /usr/share/keyrings/glusterfs-archive-keyring.gpg
DEBID=$(grep 'VERSION_ID=' /etc/os-release | cut -d '=' -f 2 | tr -d '"')
DEBVER=$(grep 'VERSION=' /etc/os-release | grep -Eo '[a-z]+')
DEBARCH=$(dpkg --print-architecture)
echo "deb [signed-by=/usr/share/keyrings/glusterfs-archive-keyring.gpg] https://download.gluster.org/pub/gluster/glusterfs/LATEST/Debian/${DEBID}/${DEBARCH}/apt ${DEBVER} main" | sudo tee /etc/apt/sources.list.d/gluster.list
apt-get update && apt-get install glusterfs-server -y
3. Edit /etc/glusterfs/glusterd.vol and add
option transport.socket.bind-address 10.0.X.1
This will prevent GlusterFS from getting exposed to the dangerous interwebs.
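The option goes inside the management volume block, so the file should end up looking roughly like this (the exact list of shipped options varies by version; only the bind-address line is added):
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    # ... keep the other shipped options as they are ...
    option transport.socket.bind-address 10.0.X.1
end-volume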
rpcbind listens on the network and we don't need it, so let's get rid of it.
apt-get remove rpcbind -y
4. Enable GlusterFS
systemctl start glusterd
systemctl enable glusterd
5. Peer with your GlusterFS nodes
gluster peer probe 10.0.2.1
gluster peer probe 10.0.3.1
6. Check the peering status
gluster peer status
7. Create your first volume for Docker
mkdir -p /mnt/bricks/docker
gluster volume create docker replica 3 10.0.1.1:/mnt/bricks/docker 10.0.2.1:/mnt/bricks/docker 10.0.3.1:/mnt/bricks/docker force
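A newly created volume also has to be started before it can be mounted:
gluster volume start docker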
8. Mount your first volume
mkdir -p /mnt/data/docker
mount.glusterfs 10.0.X.1:/docker /mnt/data/docker
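To double-check that the volume is actually mounted:
df -h /mnt/data/docker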
9. Make the mount boot ready
Copy this to /etc/systemd/system/mounts.service:
[Unit]
Description=mounts service
Wants=network-online.target glusterd.service
After=network-online.target glusterd.service

[Service]
User=root
Group=root
ExecStartPre=sleep 5
ExecStart=mount.glusterfs 10.0.X.1:/docker /mnt/data/docker
RemainAfterExit=true
Type=oneshot

[Install]
WantedBy=multi-user.target
10. Enable the mount service
systemctl enable mounts
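If systemd doesn't pick up the new unit right away, reload it and start the mount once by hand:
systemctl daemon-reload
systemctl start mounts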
11. You may have to edit the GlusterFS systemd file to prevent a race condition with your VPN.
GlusterFS will fail to start if your VPN isn't running already.
You can do this with
systemctl edit glusterd --full
Add this line:
ExecStartPre=/bin/sh -c 'until ping -c1 10.0.X.1; do sleep 1; done;'
Profit! On the next reboot, GlusterFS should start up fine.
12. Install Docker
# Add Docker's official GPG key:
apt-get update
apt-get install ca-certificates curl -y
install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc
chmod a+r /etc/apt/keyrings/docker.asc
# Add the repository to Apt sources:
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null
# Install Docker
apt-get update && apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin -y
13. Init the Swarm on the first Node
docker swarm init --advertise-addr 10.0.1.1 --listen-addr=10.0.1.1
--advertise-addr makes sure the swarm is only advertised inside our VPN network.
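If you need the join command again later (the init output scrolls away fast), you can print it on the first node with:
docker swarm join-token worker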
14. Join other Nodes
docker swarm join --token whateverthattokenis 10.0.1.1:2377 --listen-addr=10.0.2.1
docker swarm join --token whateverthattokenis 10.0.1.1:2377 --listen-addr=10.0.3.1
--listen-addr forces Swarm to bind to your local VPN address.
15. Promote the other nodes to achieve 100% true HA
docker node promote node2
docker node promote node3
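You can verify that every node now shows up as a manager (check the MANAGER STATUS column) with:
docker node ls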
16. Deploy your first service
In my case, it was a ZNC bouncer.
I had to run the Docker container normally first to generate the config files.
docker run -it -v /mnt/data/docker/znc/:/znc-data znc --makeconf
Let's deploy the service.
docker service create --mount type=bind,src=/mnt/data/docker/znc/,dst=/znc-data --publish published=1025,target=1025 --name bouncer znc
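To watch the task come up (or get rescheduled later), run this on any manager:
docker service ps bouncer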
17. If you run this on any node
docker node ps $(docker node ls -q)
you should be able to check your container status.
18. When you reboot the node running your container, the service should be restored in about 60 seconds.