Optimizing Multi Data Center Deployments Using Docker Registry
USING DOCKER TO DEPLOY SERVICES IN PRODUCTION
At Infobip we run over 400 services across 6 data centers. Development teams deploy services every few minutes, currently totaling around 90 deploys per day.
To provide a consistent and standardized deployment interface, we introduced deployment of services as Docker containers: teams package their services as Docker images and deploy them in the same way, regardless of the technology or language used to build the service.
Since Docker images can be quite large (hundreds of megabytes), we knew very early on that distributing them efficiently to all data centers would be a challenge.
DISTRIBUTING DOCKER SERVICE IMAGES EFFICIENTLY TO ALL DATA CENTERS
When deploying a service packaged as a Docker image to a data center machine, we want to use docker pull to download the image to that machine and then run the container.
We store the Docker images of our applications in a private central Docker registry. For this we use Artifactory, which can act as a Docker registry; it was an obvious choice since we already use it for storing all our other artifacts.
We could docker pull directly from the central Artifactory Docker repo on every deploy, but this would add significant cross data center network overhead, because the same Docker image would be transported separately to each machine.
To illustrate: one Infobip Java application Docker image consists of two layers: a base image layer of around 160MB (Alpine Linux with Oracle JDK installed) and an application image layer of around 60MB (a Spring Java app).
To deploy the same app on 10 different machines in the same data center, we would have to transfer a total of 10*60MB = 600MB across data centers (from the central repo to machines in the remote data center), and this is the best case scenario, where we assume the base Java Docker image is already on the target machines.
This additional network overhead can be minimized by running a local Docker registry in each data center that acts as a proxy cache for the central private Docker repo.
With a local proxy cache registry in the data center, the same scenario of 10 machines transports the 60MB application layer across data centers only once; the other 9 machines get the cached app layer from the local registry. (This assumes that the 10 machines are deployed sequentially, which is currently the case.)
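The arithmetic behind this comparison can be sketched in a few lines of Python. The function name and parameters are hypothetical; the figures (10 machines, a 60MB application layer) come from the example above, and the sketch assumes the base layer is already present on every machine and that deploys run sequentially.

```python
def cross_dc_transfer_mb(machines: int, app_layer_mb: int, local_cache: bool) -> int:
    """Megabytes of application-layer data that cross data centers for one deploy."""
    if machines <= 0:
        return 0
    if local_cache:
        # Only the first pull misses the cache and goes to the central repo;
        # the remaining machines are served by the local registry.
        return app_layer_mb
    # Without a cache every machine pulls the layer from the central repo.
    return machines * app_layer_mb


direct = cross_dc_transfer_mb(10, 60, local_cache=False)
cached = cross_dc_transfer_mb(10, 60, local_cache=True)
print(f"direct: {direct}MB, cached: {cached}MB, saved: {direct - cached}MB")
# prints "direct: 600MB, cached: 60MB, saved: 540MB"
```

The saving grows linearly with the number of machines per data center, since the cached figure stays at one layer transfer per deploy.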
Alternatively, we could have hacked together a custom Docker image export-transport-cache-import flow, but we decided to try the open source Docker registry, which has a built-in capability to act as a proxy cache.
TAKING THE DOCKER REGISTRY OUT FOR A SPIN
Running a private Docker registry is simple and straightforward, as described here: https://docs.docker.com/registry.
Setting up a Docker registry to act as a proxy cache is not complicated either, and a guide exists for it, so we were able to quickly start a Docker registry mirror in pull-through cache mode, only to find out that it doesn’t work as we’d hoped 🙁
As stated in this document, it is currently not possible to mirror another private registry.
Since we really liked the idea of saving bandwidth and reducing deployment time, we were determined to somehow work around this missing feature of Docker registry.
MAKING IT WORK
Artifactory exposes the Docker API for every Docker repository it hosts. For example, to obtain the available tags of the busybox image stored in the Artifactory docker-local repository, we can send this request:
$ curl -u mzagar http://artifactory:8081/artifactory/api/docker/docker-local/v2/busybox/tags/list
Enter host password for user 'mzagar':
{
"name" : "busybox",
"tags" : [ "latest" ]
}
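For scripted checks, the same request can be prepared and its response parsed in Python. This is a minimal sketch that makes no network call: the function names are hypothetical, the URL layout is copied from the curl example above, and the sample body is the response shown there.

```python
import json


def tags_list_url(base_url: str, repo_key: str, image: str) -> str:
    """Build the tags/list URL for an image in an Artifactory Docker repository."""
    return f"{base_url.rstrip('/')}/artifactory/api/docker/{repo_key}/v2/{image}/tags/list"


def parse_tags(response_body: str) -> list:
    """Extract the tag names from a tags/list JSON response."""
    return json.loads(response_body).get("tags") or []


url = tags_list_url("http://artifactory:8081", "docker-local", "busybox")
print(url)
# prints "http://artifactory:8081/artifactory/api/docker/docker-local/v2/busybox/tags/list"

sample_response = '{ "name" : "busybox", "tags" : [ "latest" ] }'
print(parse_tags(sample_response))
# prints "['latest']"
```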
Since the Docker registry can currently mirror only the central public Docker Hub, we had the idea to intercept every HTTP request that the Docker registry mirror makes to the remote Docker registry URL and rewrite it to fit our Artifactory Docker API URL.
Basically, we would make the Docker registry mirror think it is talking to the central Docker Hub, while it would actually be talking to our Artifactory Docker repository.
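The rewrite itself is a simple path mapping, which the following Python sketch illustrates. It models only the mapping and is not the proxy itself; the function name is hypothetical, and the prefix is taken from the Artifactory URL in the curl example above.

```python
ARTIFACTORY_PREFIX = "/artifactory/api/docker/docker-local"


def rewrite_path(path: str) -> str:
    """Map a registry API path onto the matching Artifactory Docker API path."""
    if path == "/v2" or path.startswith("/v2/"):
        return ARTIFACTORY_PREFIX + path
    # Anything outside the registry API is left untouched.
    return path


print(rewrite_path("/v2/busybox/tags/list"))
# prints "/artifactory/api/docker/docker-local/v2/busybox/tags/list"
```

Because the registry API path is preserved as a suffix, the mirror needs no knowledge of the Artifactory repository layout.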
A LITTLE BIT OF HAPROXY MAGIC
To rewrite the HTTP requests that the Docker registry sends to the central Docker registry, we used HAProxy, running as a Docker container of course.
Here’s the HAProxy configuration which specifies how to do the rewrite:
haproxy.cfg
global
    log 127.0.0.1 local0
    log 127.0.0.1 local1 notice

defaults
    log global
    mode http
    option httplog
    option dontlognull
Every /v2/* request sent by the Docker mirror is rewritten to /artifactory/api/docker/docker-local/v2/* and sent to our Artifactory server.
We use environment variables to specify the Artifactory IP and port, and also to add a fixed Authorization header that authenticates the mirror as a valid Artifactory user.
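How such a fixed header value can be produced is sketched below, assuming the header uses the HTTP Basic scheme, as the BASIC_AUTH_PASSWORD variable name in the compose file suggests. The function names and the credentials are placeholders, not values from our setup.

```python
import base64


def basic_auth_token(user: str, password: str) -> str:
    """Base64-encode "user:password", the credential format of the HTTP Basic scheme."""
    return base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")


def authorization_header(token: str) -> dict:
    """Header that a proxy would attach to every forwarded request."""
    return {"Authorization": f"Basic {token}"}


token = basic_auth_token("user", "pass")  # placeholder credentials
print(authorization_header(token))
# prints "{'Authorization': 'Basic dXNlcjpwYXNz'}"
```

Note that Base64 is an encoding, not encryption, so the token should be handled with the same care as the password itself.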
This configuration also exposes HAProxy statistics for the front end and back end servers, which provide a nice way to check whether Artifactory is live and accessible and how much traffic the mirror generates.
HOOKING IT UP WITH DOCKER REGISTRY
Next we need to hook up the Docker registry to talk to HAProxy, which will then route the HTTP requests where and how we want them. We did this using docker-compose. Here’s what the complete docker-compose.yml file looks like:
docker-compose.yml
haproxy:
  image: haproxy:latest
  container_name: hap
  restart: always
  ports:
    - 80:80
  volumes:
    - ${WORK}/haproxy/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg
  environment:
    - ARTIFACTORY_IP=${ARTIFACTORY_IP}
    - ARTIFACTORY_PORT=${ARTIFACTORY_PORT}
    - BASIC_AUTH_PASSWORD=${BASIC_AUTH_PASSWORD}
  log_driver: "json-file"
  log_opt:
    max-size: "10m"
    max-file: "10"

mirror:
  image: registry:2.4.0
  container_name: registry
  restart: always
  ports:
    - 5000:5000
  volumes:
    - ${WORK}/docker-registry:/var/lib/registry
    - ${REGISTRY_CERTIFICATE_FOLDER}:/certs
  environment:
    - REGISTRY_HTTP_TLS_CERTIFICATE=${REGISTRY_HTTP_TLS_CERTIFICATE}
    - REGISTRY_HTTP_TLS_KEY=${REGISTRY_HTTP_TLS_KEY}
  command: serve /var/lib/registry/config.yml
  links:
    - haproxy:haproxy
  log_driver: "json-file"
  log_opt:
    max-size: "10m"
    max-file: "10"
We use a lot of environment variables to be flexible when starting this application duo. Compose expects haproxy.cfg to live in the haproxy folder and the Docker registry config.yml to live in the docker-registry folder, both relative to the specified WORK folder.
We’ll use a simple shell script to fire up docker-compose with all the necessary environment variables:
run-mirror.sh
#!/bin/sh
export WORK=`pwd`
export ARTIFACTORY_IP='10.10.10.10'
export ARTIFACTORY_PORT='8081'
export REGISTRY_CERTIFICATE_FOLDER='/etc/ssl/example'
export REGISTRY_HTTP_TLS_CERTIFICATE='/certs/cert.pem'
export REGISTRY_HTTP_TLS_KEY='/certs/cert.pem'
export BASIC_AUTH_PASSWORD='basicauthbase64encodedstring=='
docker-compose up --force-recreate
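Since the compose file depends on all of these variables, a small pre-flight check can catch a missing one before the containers start. The following Python sketch is a hypothetical helper, not part of our setup; the variable names are the ones exported by the script above.

```python
import os

# Variables that docker-compose.yml substitutes at startup.
REQUIRED_VARIABLES = [
    "WORK",
    "ARTIFACTORY_IP",
    "ARTIFACTORY_PORT",
    "BASIC_AUTH_PASSWORD",
    "REGISTRY_CERTIFICATE_FOLDER",
    "REGISTRY_HTTP_TLS_CERTIFICATE",
    "REGISTRY_HTTP_TLS_KEY",
]


def missing_variables(env) -> list:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]


missing = missing_variables(os.environ)
if missing:
    print("missing variables: " + ", ".join(missing))
else:
    print("all variables set")
```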
After starting docker-compose our mirror is available at mirror.example.com:5000 – first we check the contents of our mirror:
$ ./run-mirror.sh
$ curl https://mirror.example.com:5000/v2/_catalog
{"repositories":[]}
As expected, the mirror cache is empty. Now we pull the busybox image from our central Docker repo and measure how long it takes to download the image:
$ time docker pull mirror.example.com:5000/busybox
Using default tag: latest
latest: Pulling from busybox
9d7588d3c063: Pull complete
a3ed95caeb02: Pull complete
Digest: sha256:000409ca75cd0b754155d790402405fdc35f051af1917ae35a9f4d96ec06ae50
Status: Downloaded newer image for mirror.example.com:5000/busybox:latest
real 0m 3.66s
user 0m 0.02s
sys 0m 0.00s
It took about 3.7 seconds to download the image for the first time.
We can see that the busybox image is now stored in the Docker mirror cache:
$ curl https://mirror.example.com:5000/v2/_catalog
{"repositories":["busybox"]}
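The same check can be automated by parsing the catalog response. This is a minimal Python sketch with hypothetical function names; the two sample bodies are the responses shown above, before and after the first pull.

```python
import json


def cached_repositories(catalog_body: str) -> list:
    """Return the repository names listed in a /v2/_catalog response."""
    return json.loads(catalog_body).get("repositories") or []


def is_cached(catalog_body: str, image: str) -> bool:
    """True if the mirror already lists the given image."""
    return image in cached_repositories(catalog_body)


before_pull = '{"repositories":[]}'
after_pull = '{"repositories":["busybox"]}'
print(is_cached(before_pull, "busybox"), is_cached(after_pull, "busybox"))
# prints "False True"
```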
Let’s remove the local busybox image and pull it again. We expect the second pull to be faster than the first, since the image is cached and the mirror doesn’t have to fetch it from the central repository:
$ docker rmi mirror.example.com:5000/busybox
Untagged: mirror.example.com:5000/busybox:latest
Deleted: sha256:a84c36ecc374f680d00a625d1f0ba52426a536775ee7277f21728369dc42499b
Deleted: sha256:1a879e2f481d67c4537144f80f5f6d776542c7d3a0bd7721fdf6aa1ec024af24
Deleted: sha256:a193ed10c686545c776af2bb8cfe20d3e5badf5c936fbf0e8f389d769018a3f9
$ time docker pull mirror.example.com:5000/busybox
Using default tag: latest
latest: Pulling from busybox
9d7588d3c063: Pull complete
a3ed95caeb02: Pull complete
Digest: sha256:000409ca75cd0b754155d790402405fdc35f051af1917ae35a9f4d96ec06ae50
Status: Downloaded newer image for mirror.example.com:5000/busybox:latest
real 0m 0.28s
user 0m 0.01s
sys 0m 0.00s
Hooray! Since the registry had the image locally cached, the second pull took only 0.28 seconds!
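To put a number on the difference, the two "real" lines from the time output can be compared directly. The Python sketch below uses a hypothetical parsing helper and assumes the "0m 3.66s" format shown above; it is based on a single measurement of a very small image, so the ratio is indicative only.

```python
import re


def parse_real_seconds(line: str) -> float:
    """Convert a time line such as "real 0m 3.66s" to seconds."""
    match = re.search(r"(\d+)m\s*([\d.]+)s", line)
    if match is None:
        raise ValueError(f"unrecognized time line: {line!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


first_pull = parse_real_seconds("real 0m 3.66s")
second_pull = parse_real_seconds("real 0m 0.28s")
print(f"speedup: {first_pull / second_pull:.1f}x")
# prints "speedup: 13.1x"
```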
CONCLUSION
We have successfully set up local Docker registry mirror cache instances in each remote data center to mirror our private central Artifactory Docker registry, and reduced the amount of data we need to transfer between data centers when deploying our applications.
There is still lots of room to improve this process, but we accomplished our primary goal of optimizing the distribution of a Docker image across remote data centers in a relatively simple and transparent way.
We continue to improve our Docker deployment process daily to give a smooth deployment experience to our development teams.
(By Mario Zagar, Senior Software Architect / Division Lead)