Infrastructure as Code: Automated Deployments with Ansible
Automate, automate, automate.
Now that our server is up and running, we want to install our app on it, using our Docker image and container.
We could do this manually, but a key insight of modern software engineering is that small, frequent deployments are a must.
| This insight about the importance of frequent deployments we owe to Nicole Forsgren and the State of DevOps reports. They are some of the only really firm science we have in the field of software engineering. |
Frequent deployments rely on automation,[1] so we’ll use Ansible.
Automation is also key to making sure our tests give us true confidence over our deployments. If we go to the trouble of building a staging server,[2] we want to make sure that it’s as similar as possible to the production environment. By automating the way we deploy, and using the same automation for staging and prod, we give ourselves much more confidence.
The buzzword for automating your deployments these days is "infrastructure as code" (IaC).
| Why not ping me a note once your site is live on the web, and send me the URL? It always gives me a warm and fuzzy feeling…Email me at [email protected]. |
A First Cut of an Ansible Playbook for Deployment
Let’s start using Ansible a little more seriously. We’re not going to jump all the way to the end though! Baby steps, as always. Let’s see if we can get it to run a simple "hello world" Docker container on our server.
Let’s delete the old content, which had the "ping", and replace it with something like this:
---
- hosts: all

  tasks:
    - name: Install docker  (1)
      ansible.builtin.apt:  (2)
        name: docker.io  (3)
        state: latest
        update_cache: true
      become: true

    - name: Run test container
      community.docker.docker_container:
        name: testcontainer
        state: started
        image: busybox
        command: echo hello world
      become: true
| 1 | An Ansible playbook is a series of "tasks"; we now have more than one.
In that sense, it’s still quite sequential and procedural,
but the individual tasks themselves are quite declarative.
Each one usually has a human-readable name attribute. |
| 2 | Each task uses an Ansible "module" to do its work.
This one uses the builtin.apt module, which provides a wrapper
around the apt Debian and Ubuntu package management tool. |
| 3 | Each module then provides a bunch of parameters that control how it works.
Here, we specify the name of the package we want to install ("docker.io"[3])
and tell it to update its cache first, which is required on a fresh server. |
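By the way, that apt task is doing more or less what you'd do by hand
on the server; here's a sketch of the manual equivalent:

elspeth@server$ sudo apt update
elspeth@server$ sudo apt install docker.io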
Most Ansible modules have pretty good documentation—check out the builtin.apt one for example;
I often skip to the
"Examples" section.
Let’s rerun our deployment command, ansible-playbook,
with the same flags we used in the last chapter:
$ ansible-playbook --user=elspeth -i staging.ottg.co.uk, infra/deploy-playbook.yaml -vv
ansible-playbook [core 2.16.3]
config file = None
[...]
No config file found; using defaults
BECOME password:
Skipping callback 'default', as we already have a stdout callback.
Skipping callback 'minimal', as we already have a stdout callback.
Skipping callback 'oneline', as we already have a stdout callback.
PLAYBOOK: deploy-playbook.yaml ************************************************
1 plays in infra/deploy-playbook.yaml
PLAY [all] ********************************************************************
TASK [Gathering Facts] ********************************************************
task path: ...goat-book/superlists/infra/deploy-playbook.yaml:2
ok: [staging.ottg.co.uk]
TASK [Install docker] *********************************************************
task path: ...goat-book/superlists/infra/deploy-playbook.yaml:6
changed: [staging.ottg.co.uk] => {"cache_update_time": [...]
"cache_updated": true, "changed": true, "stderr": "", "stderr_lines": [],
"stdout": "Reading package lists...\nBuilding dependency tree...\nReading [...]
information...\nThe following additional packages will be installed:\n
wmdocker\nThe following NEW packages will be installed:\n docker wmdocker\n0
TASK [Run test container] *****************************************************
task path: ...goat-book/superlists/infra/deploy-playbook.yaml:13
changed: [staging.ottg.co.uk] => {"changed": true, "container":
{"AppArmorProfile": "docker-default", "Args": ["hello", "world"], "Config":
[...]
PLAY RECAP ********************************************************************
staging.ottg.co.uk : ok=3 changed=2 unreachable=0 failed=0
skipped=0 rescued=0 ignored=0
I don’t know about you, but whenever I make a terminal spew out a stream of output, I like to make little brrp brrp brrp noises—a bit like the computer, Mother, in Alien. Ansible scripts are particularly satisfying in this regard.
| You may need to use the --ask-become-pass argument to ansible-playbook
if you get an error saying "Missing sudo password".[4] |
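If so, the command is the same as before, just with the extra flag:

$ ansible-playbook --user=elspeth --ask-become-pass \
    -i staging.ottg.co.uk, infra/deploy-playbook.yaml -vv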
SSHing Into the Server and Viewing Container Logs
Ansible looks like it’s doing its job, but let’s practice our SSH skills, and do some good old-fashioned system admin. Let’s log in to our server and see if we can see any actual evidence that our container has run.
After we ssh in, we can use docker ps, just like we do on our own machine.
We pass the -a flag to view all containers, including old/stopped ones.
Then we can use docker logs to view the output from one of them:
$ ssh [email protected] Welcome to Ubuntu 22.04.4 LTS (GNU/Linux 5.15.0-67-generic x86_64) [...] elspeth@server$ sudo docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 3a2e600fbe77 busybox "echo hello world" 2 days ago Exited (0) 10 minutes ago testcontainer elspeth@server:$ sudo docker logs testcontainer hello world
Look out for that elspeth@server
in the command-line listings in this chapter.
It indicates commands that must be run on the server,
as opposed to commands you run on your own PC.
|
SSHing in to check things worked is a key server debugging skill! It’s something we want to practice on our staging server, because ideally we’ll want to avoid doing it on production machines.
Allowing Rootless Docker Access
Having to use sudo or become=True to run Docker commands is a bit of a pain.
If we add our user to the docker group, we can run Docker commands without sudo:
- name: Install docker
  [...]

- name: Add our user to the docker group, so we don't need sudo/become
  ansible.builtin.user:  (1)
    name: '{{ ansible_user }}'  (2)
    groups: docker
    append: true  # don't remove any existing groups.
  become: true

- name: Reset ssh connection to allow the user/group change to take effect
  ansible.builtin.meta: reset_connection  (3)

- name: Run test container  (4)
  [...]
| 1 | We use the builtin.user module to add our user to the docker group. |
| 2 | The {{ ... }} syntax enables us to interpolate some variables into
our config file, much like in a Django template.
ansible_user will be the user we’re using to connect to the server—i.e., "elspeth", in my case. |
| 3 | As per the task name, we need this for the user/group change to take effect.
Strictly speaking, it's only needed the first time we run the script;
if you've got some time, you can read up on how to
make tasks conditional
and configure it to only run if the builtin.user task has actually made a change
(there's a sketch just after this list). |
| 4 | We can remove the become: true from this task and it should still work. |
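Here's roughly what that might look like. This is a hedged sketch only,
not part of our playbook: the registered variable name docker_group_change
is made up, and some Ansible versions have not honoured when: on meta tasks
like reset_connection, so check the behaviour on your version:

- name: Add our user to the docker group, so we don't need sudo/become
  ansible.builtin.user:
    name: '{{ ansible_user }}'
    groups: docker
    append: true
  become: true
  register: docker_group_change  # capture whether this task reported a change

- name: Reset ssh connection to allow the user/group change to take effect
  ansible.builtin.meta: reset_connection
  when: docker_group_change is changed  # skip the reset on later, no-op runs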
Let’s run that:
$ ansible-playbook --user=elspeth -i staging.ottg.co.uk, infra/deploy-playbook.yaml -vv
PLAYBOOK: deploy-playbook.yaml ************************************************
1 plays in infra/deploy-playbook.yaml
PLAY [all] ********************************************************************
TASK [Gathering Facts] ********************************************************
[...]
ok: [staging.ottg.co.uk]
TASK [Install docker] *********************************************************
[...]
ok: [staging.ottg.co.uk] => {"cache_update_time": 1738767216, "cache_updated":
true, "changed": false}
TASK [Add our user to the docker group, so we don't need sudo/become] *********
[...]
changed: [staging.ottg.co.uk] => {"append": false, "changed": true, [...]
"", "group": 1000, "groups": "docker", [...]
TASK [Reset ssh connection to allow the user/group change to take effect] *****
[...]
META: reset connection
TASK [Run test container] *****************************************************
[...]
changed: [staging.ottg.co.uk] => {"changed": true, "container": [...]
PLAY RECAP ********************************************************************
staging.ottg.co.uk : ok=4 changed=2 unreachable=0 failed=0
skipped=0 rescued=0 ignored=0
And check that it worked:
elspeth@server$ docker ps -a  # no sudo yay!
CONTAINER ID   IMAGE     COMMAND              CREATED          STATUS                     PORTS     NAMES
bd3114e43f55   busybox   "echo hello world"   12 minutes ago   Exited (0) 6 seconds ago             testcontainer

elspeth@server$ docker logs testcontainer
hello world
hello world
Sure enough, we no longer need sudo,
and we can see that a new version of the container just ran.
You know, that’s worthy of a commit!
$ git add infra/deploy-playbook.yaml
$ git commit -m "Made a start on an ansible playbook for deployment"
Let’s move on to trying to get our actual Docker container running on the server. As we go through, you’ll see that we’re going to work through very similar issues to the ones we’ve already figured our way through in the last couple of chapters:
- Configuration
- Networking
- The database
Getting Our Image Onto the Server
Typically, you "push" and "pull" container images to and from a "container registry"—Docker offers a public one called Docker Hub, and organisations often run private ones, hosted by cloud providers like AWS.
So your process of getting an image onto a server is usually:
- Push the image from your machine to the registry.
- Pull the image from the registry onto the server. Usually this step is implicit,
  in that you just specify the image name in the format registry-url/image-name:tag,
  and then docker run takes care of pulling down the image for you
  (there's a sketch of this flow just below).
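In plain Docker CLI terms, that flow would look something like this.
A sketch only; the registry URL and account name here are made up:

$ docker tag superlists registry.example.com/yourname/superlists:latest
$ docker push registry.example.com/yourname/superlists:latest

elspeth@server$ docker run registry.example.com/yourname/superlists:latest  # pulls automatically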
But I don’t want to ask you to create a Docker Hub account, nor implicitly endorse any particular provider, so we’re going to "simulate" this process by doing it manually.
It turns out you can "export" a container image to an archive format, manually copy that to the server, and then reimport it. In Ansible config, it looks like this:
- name: Install docker
  [...]

- name: Add our user to the docker group, so we don't need sudo/become
  [...]

- name: Reset ssh connection to allow the user/group change to take effect
  [...]

- name: Export container image locally  (1)
  community.docker.docker_image:
    name: superlists
    archive_path: /tmp/superlists-img.tar
    source: local
  delegate_to: 127.0.0.1

- name: Upload image to server  (2)
  ansible.builtin.copy:
    src: /tmp/superlists-img.tar
    dest: /tmp/superlists-img.tar

- name: Import container image on server  (3)
  community.docker.docker_image:
    name: superlists
    load_path: /tmp/superlists-img.tar
    source: load
    force_source: true  (4)
    state: present

- name: Run container
  community.docker.docker_container:
    name: superlists
    image: superlists  (5)
    state: started
    recreate: true  (6)
| 1 | We export the Docker image to a .tar file by using the docker_image module
with the archive_path set to a tempfile, and setting the delegate_to attribute
to say we’re running that command on our local machine rather than the server. |
| 2 | We then use the copy module to upload the .tar file to the server. |
| 3 | And we use docker_image again, but this time with load_path and source: load
to import the image back on the server. |
| 4 | The force_source flag tells the server to attempt the import,
even if an image of that name already exists. |
| 5 | We change our "run container" task to use the superlists image,
and we’ll use that as the container name too. |
| 6 | Similarly to source: load, the recreate argument tells Ansible
to re-create the container even if there’s already one running
whose name and image match "superlists". |
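Those three tasks are the Ansible version of something you could do by hand
with docker save, scp, and docker load. Roughly, as a sketch, with our usual
staging hostname assumed:

$ docker save superlists -o /tmp/superlists-img.tar
$ scp /tmp/superlists-img.tar [email protected]:/tmp/

elspeth@server$ docker load -i /tmp/superlists-img.tar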
If you see an error saying "Error connecting: Error while fetching server API version",
it may be because the Python Docker software development kit (SDK) can’t find your Docker daemon.
Try restarting Docker Desktop if you’re on Windows or a Mac.
If you’re not using the standard Docker engine—with Colima or Podman, for example—you may need to set the DOCKER_HOST environment variable
(e.g., DOCKER_HOST=unix:///$HOME/.colima/default/docker.sock)
or use a symlink to point to the right place.
See the Colima FAQ or Podman docs.
|
Let’s run the new version of our playbook, and see if we can upload a Docker image to our server and get it running:
$ ansible-playbook --user=elspeth -i staging.ottg.co.uk, infra/deploy-playbook.yaml -vv
[...]
PLAYBOOK: deploy-playbook.yaml **********************************************
1 plays in infra/deploy-playbook.yaml
PLAY [all] ********************************************************************
TASK [Gathering Facts] ********************************************************
task path: ...goat-book/superlists/infra/deploy-playbook.yaml:2
ok: [staging.ottg.co.uk]
TASK [Install docker] *********************************************************
task path: ...goat-book/superlists/infra/deploy-playbook.yaml:5
ok: [staging.ottg.co.uk] => {"cache_update_time": 1708982855, "cache_updated":
false, "changed": false}
TASK [Add our user to the docker group, so we don't need sudo/become] *********
task path: ...goat-book/infra/deploy-playbook.yaml:11
ok: [staging.ottg.co.uk] => {"append": false, "changed": false, [...]
TASK [Reset ssh connection to allow the user/group change to take effect] *****
task path: ...goat-book/infra/deploy-playbook.yaml:17
META: reset connection
TASK [Export container image locally] *****************************************
task path: ...goat-book/superlists/infra/deploy-playbook.yaml:20
changed: [staging.ottg.co.uk -> 127.0.0.1] => {"actions": ["Archived image
superlists:latest to /tmp/superlists-img.tar, overwriting archive with image
11ff3b83873f0fea93f8ed01bb4bf8b3a02afa15637ce45d71eca1fe98beab34 named
superlists:latest"], "changed": true, "image": {"Architecture": "amd64",
[...]
TASK [Upload image to server] *************************************************
task path: ...goat-book/superlists/infra/deploy-playbook.yaml:27
changed: [staging.ottg.co.uk] => {"changed": true, "checksum":
"313602fc0c056c9255eec52e38283522745b612c", "dest": "/tmp/superlists-img.tar",
[...]
TASK [Import container image on server] ***************************************
task path: ...goat-book/superlists/infra/deploy-playbook.yaml:32
changed: [staging.ottg.co.uk] => {"actions": ["Loaded image superlists:latest
from /tmp/superlists-img.tar"], "changed": true, "image": {"Architecture":
"amd64", "Author": "", "Comment": "buildkit.dockerfile.v0", "Config":
[...]
TASK [Run container] **********************************************************
task path: ...goat-book/superlists/infra/deploy-playbook.yaml:40
changed: [staging.ottg.co.uk] => {"changed": true, "container":
{"AppArmorProfile": "docker-default", "Args": ["--bind", ":8888",
"superlists.wsgi:application"], "Config": {"AttachStderr": true, "AttachStdin":
false, "AttachStdout": true, "Cmd": ["gunicorn", "--bind", ":8888",
"superlists.wsgi:application"], "Domainname": "", "Entrypoint": null, "Env":
[...]
staging.ottg.co.uk : ok=7 changed=4 unreachable=0 failed=0
skipped=0 rescued=0 ignored=0
That looks good!
For completeness, let’s also add a step to explicitly build the image locally
(this means we aren’t dependent on having run docker build locally):
- name: Reset ssh connection to allow the user/group change to take effect
[...]
- name: Build container image locally
community.docker.docker_image:
name: superlists
source: build
state: present
build:
path: ..
platform: linux/amd64 (1)
force_source: true
delegate_to: 127.0.0.1
- name: Export container image locally
[...]
| 1 | I needed this platform attribute to work around an issue
with compatibility between Apple’s new ARM-based chips and our server’s
x86/AMD64 architecture.
You could also use this platform: to cross-build Docker images
for a Raspberry Pi from a regular PC, or vice versa.
It does no harm in any case. |
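If you're curious, the plain-CLI equivalent of that build task would be
something like this (a sketch; run from the repo root, where our Dockerfile lives):

$ docker build --platform=linux/amd64 -t superlists .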
Taking a Look Around Manually
Time to take another proverbial look under the hood, to check whether it really worked. Hopefully we’ll see a container that looks like ours:
$ ssh [email protected] Welcome to Ubuntu 22.04.4 LTS (GNU/Linux 5.15.0-67-generic x86_64) [...] elspeth@server$ docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 3a2e600fbe77 busybox "echo hello world" 2 days ago Exited (0) 10 minutes ago testcontainer 129e36a42190 superlists "/bin/sh -c \'gunicor…" About a minute ago Exited (3) About a minute ago superlists
OK! We can see our "superlists" container is there now, both named "superlists" and based on an image called "superlists".
The Status: Exited is a bit more worrying though.
Still, that’s a good bit of progress, so let’s do a commit (back on your own machine):
$ git commit -am"Build our image, use export/import to get it on the server, try and run it"
Docker logs
Now, back on the server, let’s take a look at the logs of our new container to see if we can figure out what’s happened:
elspeth@server$ docker logs superlists
[2024-02-26 22:19:15 +0000] [1] [INFO] Starting gunicorn 21.2.0
[2024-02-26 22:19:15 +0000] [1] [INFO] Listening at: http://0.0.0.0:8888
[2024-02-26 22:19:15 +0000] [1] [INFO] Using worker: sync
[...]
  File "/src/superlists/settings.py", line 22, in <module>
    SECRET_KEY = os.environ["DJANGO_SECRET_KEY"]
                 ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
  File "<frozen os>", line 685, in __getitem__
KeyError: 'DJANGO_SECRET_KEY'
[2024-02-26 22:19:15 +0000] [7] [INFO] Worker exiting (pid: 7)
[2024-02-26 22:19:15 +0000] [1] [ERROR] Worker (pid:7) exited with code 3
[2024-02-26 22:19:15 +0000] [1] [ERROR] Shutting down: Master
[2024-02-26 22:19:15 +0000] [1] [ERROR] Reason: Worker failed to boot.
Oh, whoops; it can't find the DJANGO_SECRET_KEY environment variable.
We need to set our environment variables on the server too.
Setting Environment Variables and Secrets
When we run our container manually locally with docker run,
we can pass in environment variables with the -e flag.
As we’ll see, it’s fairly straightforward to replicate that with Ansible,
using the env parameter for the docker.docker_container module
that we’re already using.
But there is at least one "secret" value that we don’t want to hardcode
into our Ansible YAML file: the Django SECRET_KEY setting.
There are many different ways of dealing with secrets: cloud providers each have their own tools, and there are dedicated ones like HashiCorp Vault, all with varying levels of complexity and security.
We don't have time to go into detail on those in this book. Instead, we'll generate a one-off secret key from a random string, and we'll store it in a file on disk on the server. That's a reasonable level of security for our purposes.
So, here’s the plan:
- We generate a random, one-off secret key the first time we deploy to a new server, and we store it in a file on disk.
- We read the secret key value back from that file to put it into the container's environment variables.
- We set the rest of the env vars we need as well.
Here’s what it looks like:
- name: Import container image on server
  [...]

- name: Ensure .secret-key file exists
  # the intention is that this only happens once per server
  ansible.builtin.copy:  (1)
    dest: ~/.secret-key
    content: "{{ lookup('password', '/dev/null length=32 chars=ascii_letters') }}"  (2)
    mode: 0600
    force: false  # do not recreate file if it already exists.

- name: Read secret key back from file
  ansible.builtin.slurp:  (3)
    src: ~/.secret-key
  register: secret_key

- name: Run container
  community.docker.docker_container:
    name: superlists
    image: superlists
    state: started
    recreate: true
    env:  (4)
      DJANGO_DEBUG_FALSE: "1"
      DJANGO_SECRET_KEY: "{{ secret_key.content | b64decode }}"  (5)
      DJANGO_ALLOWED_HOST: "{{ inventory_hostname }}"  (6)
      DJANGO_DB_PATH: "/home/nonroot/db.sqlite3"
| 1 | The builtin.copy module can be used to copy local files up to the server,
and also, as we’re demonstrating here, to populate a file
with an arbitrary string content. |
| 2 | This lookup('password') thing is how we'll get a random string of characters.
I copy-pasted it from Stack Overflow. Come on; there's no shame in that.
(If you'd rather generate a key by hand, there's a one-liner after this list.)
The rest of the builtin.copy directive is designed to save the value to disk,
but only if the file doesn't already exist.
The 0600 permissions ensure that only the "elspeth" user can read it. |
| 3 | The slurp module reads the contents of a file on the server,
and we can register its contents into a variable.
Slightly annoyingly, it returns the content base64-encoded
(that's so you can also use it to read binary files).
Anyway, the idea is, even though we don't rewrite the file on every deploy,
we do reread the value on every deploy. |
| 4 | Here’s the env parameter for our container. |
| 5 | Here's how we get at the original value for the secret key,
using the b64decode filter to turn it back into a regular string. |
| 6 | inventory_hostname represents the hostname of the current server
we’re deploying to, so staging.ottg.co.uk in our case. |
Let’s run this latest version of our playbook now:
$ ansible-playbook --user=elspeth -i staging.ottg.co.uk, infra/deploy-playbook.yaml -v
[...]
PLAYBOOK: deploy-playbook.yaml **********************************************
1 plays in infra/deploy-playbook.yaml
PLAY [all] ********************************************************************
TASK [Gathering Facts] ********************************************************
ok: [staging.ottg.co.uk]
TASK [Install docker] *********************************************************
ok: [staging.ottg.co.uk] => {"cache_update_time": 1709136057, "cache_updated":
false, "changed": false}
TASK [Build container image locally] ******************************************
changed: [staging.ottg.co.uk -> 127.0.0.1] => {"actions": ["Built image [...]
TASK [Export container image locally] *****************************************
changed: [staging.ottg.co.uk -> 127.0.0.1] => {"actions": ["Archived image [...]
TASK [Upload image to server] *************************************************
changed: [staging.ottg.co.uk] => {"changed": true, [...]
TASK [Import container image on server] ***************************************
changed: [staging.ottg.co.uk] => {"actions": ["Loaded image [...]
TASK [Ensure .secret-key file exists] *****************************************
changed: [staging.ottg.co.uk] => {"changed": true, [...]
TASK [Run container] **********************************************************
changed: [staging.ottg.co.uk] => {"changed": true, "container": [...]
PLAY RECAP ********************************************************************
staging.ottg.co.uk : ok=8 changed=6 unreachable=0 failed=0
skipped=0 rescued=0 ignored=0
Manually Checking Environment Variables for Running Containers
We’ll do one more manual check with SSH, to see if those env vars were set correctly. There’s a couple of ways we can do this.
Let’s start with a docker ps to check whether our container is running:
elspeth@server$ docker ps
CONTAINER ID   IMAGE        COMMAND                  CREATED         STATUS         PORTS     NAMES
96d867b42a31   superlists   "gunicorn --bind :88…"   6 seconds ago   Up 5 seconds             superlists
Looking good! That STATUS of "Up 5 seconds" is better than the "Exited" we had before;
it means the container is up and running.
Let’s take a look at the docker logs too:
elspeth@server$ docker logs superlists
[2025-05-02 17:55:18 +0000] [1] [INFO] Starting gunicorn 23.0.0
[2025-05-02 17:55:18 +0000] [1] [INFO] Listening at: http://0.0.0.0:8888
[2025-05-02 17:55:18 +0000] [1] [INFO] Using worker: sync
[2025-05-02 17:55:18 +0000] [7] [INFO] Booting worker with pid: 7
Also looking good; no sign of an error. Now let’s check on those environment variables.
There are two ways we can do this: docker exec env and docker inspect.
docker exec env
One way is to run the standard shell env command,
which prints out all environment variables.
We run it "inside" the container with docker exec:
elspeth@server$ docker exec superlists env
PATH=/venv/bin:/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=96d867b42a31
DJANGO_DEBUG_FALSE=1
DJANGO_SECRET_KEY=cXACJZTvoPfWFSBSTdixJTlXCWYTnJlC
DJANGO_ALLOWED_HOST=staging.ottg.co.uk
DJANGO_DB_PATH=/home/nonroot/db.sqlite3
GPG_KEY=7169605F62C751356D054A26A821E680E5FA6305
PYTHON_VERSION=3.14.3
PYTHON_SHA256=40f868bcbdeb8149a3149580bb9bfd407b3321cd48f0be631af955ac92c0e041
HOME=/home/nonroot
docker inspect
Another option—useful for debugging other things too,
like image IDs and mounts—is to use docker inspect:
elspeth@server$ docker inspect superlists
[
    {
        [...]
        "Config": {
            [...]
            "Env": [
                "DJANGO_DEBUG_FALSE=1",
                "DJANGO_SECRET_KEY=cXACJZTvoPfWFSBSTdixJTlXCWYTnJlC",
                "DJANGO_ALLOWED_HOST=staging.ottg.co.uk",
                "DJANGO_DB_PATH=/home/nonroot/db.sqlite3",
                "PATH=/venv/bin:/usr/local/bin:/usr/local/sbin:/usr/[...]
                "GPG_KEY=7169605F62C751356D054A26A821E680E5FA6305",
                "PYTHON_VERSION=3.14.3",
                "PYTHON_SHA256=40f868bcbdeb8149a3149580bb9bfd407b332[...]
            ],
            "Cmd": [
                "gunicorn",
                "--bind",
                ":8888",
                "superlists.wsgi:application"
            ],
            "Image": "superlists",
            "Volumes": null,
            "WorkingDir": "/src",
            "Entrypoint": null,
            "OnBuild": null,
            "Labels": {}
        },
        "NetworkSettings": {
            [...]
        }
    }
]
There’s a lot of output!
It’s more or less everything that Docker knows about the container.
But if you scroll around, you can usually get some useful info for debugging
and diagnostics—like, in this case,
the Env parameter, which tells us what environment variables were set for the container.
docker inspect is also useful
for checking exactly which image ID a container is using,
and which filesystem mounts are configured.
|
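One more trick worth knowing: docker inspect takes a --format flag with a
Go template, so you can pull out just the bit you want instead of scrolling.
For example, to print only the env vars:

elspeth@server$ docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' superlists
DJANGO_DEBUG_FALSE=1
DJANGO_SECRET_KEY=cXACJZTvoPfWFSBSTdixJTlXCWYTnJlC
[...]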
Looking good!
Running FTs to Check on Our Deploy
Enough manual checking via SSH; let's see what our tests think.
The TEST_SERVER adaptation we made in [chapter_09_docker]
can also point the FTs at our staging server:
$ TEST_SERVER=staging.ottg.co.uk python src/manage.py test functional_tests
[...]
selenium.common.exceptions.WebDriverException: Message: Reached error page:
about:neterror?e=connectionFailure&u=http%3A//staging.ottg.co.uk/[...]
[...]
Ran 3 tests in 5.014s

FAILED (errors=3)
None of them passed. Hmm.
That neterror makes me think it’s another networking problem.
| If your domain provider puts up a temporary holding page, you may get a 404 rather than a connection error at this point, and the traceback might have "NoSuchElementException" instead. |
Manual Debugging with curl Against the Staging Server
Let’s try our standard debugging technique of using curl
both locally and then from inside the container on the server.
First, on our own machine:
$ curl -iv staging.ottg.co.uk
[...]
curl: (7) Failed to connect to staging.ottg.co.uk port 80 after 25 ms: Couldn't
connect to server
Similarly, depending on your domain/hosting provider,
you may see "Host not found" here instead.
Or, if your version of curl is different, you might see
"Connection refused".
|
Now let’s SSH in to our server and take a look at the Docker logs:
elspeth@server$ docker logs superlists
[2024-02-28 22:14:43 +0000] [7] [INFO] Starting gunicorn 21.2.0
[2024-02-28 22:14:43 +0000] [7] [INFO] Listening at: http://0.0.0.0:8888
[2024-02-28 22:14:43 +0000] [7] [INFO] Using worker: sync
[2024-02-28 22:14:43 +0000] [8] [INFO] Booting worker with pid: 8
No errors there. Let’s try our curl:
elspeth@server$ curl -iv localhost
*   Trying 127.0.0.1:80...
* connect to 127.0.0.1 port 80 failed: Connection refused
*   Trying ::1:80...
* connect to ::1 port 80 failed: Connection refused
* Failed to connect to localhost port 80 after 0 ms: Connection refused
* Closing connection 0
curl: (7) Failed to connect to localhost port 80 after 0 ms: Connection refused
Hmm, curl fails on the server too.
But all this talk of port 80, both locally and on the server, might be giving us a clue.
Let’s check docker ps:
elspeth@server$ docker ps
CONTAINER ID   IMAGE        COMMAND                  CREATED         STATUS         PORTS     NAMES
1dd87cbfa874   superlists   "/bin/sh -c 'gunicor…"   9 minutes ago   Up 9 minutes             superlists
This might be ringing a bell now—we forgot the ports.
We want to map port 8888 inside the container to port 80 (the default web/HTTP port)
on the server:
- name: Run container
  community.docker.docker_container:
    name: superlists
    image: superlists
    state: started
    recreate: true
    env:
      DJANGO_DEBUG_FALSE: "1"
      DJANGO_SECRET_KEY: "{{ secret_key.content | b64decode }}"
      DJANGO_ALLOWED_HOST: "{{ inventory_hostname }}"
      DJANGO_DB_PATH: "/home/nonroot/db.sqlite3"
    ports: 80:8888
You can map a different port on the outside
to the one that’s "inside" the Docker container.
In this case, we can map the public-facing standard HTTP port 80 on the host
to the arbitrarily chosen port 8888 on the inside.
|
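In docker run terms, that ports: entry is the equivalent of the -p (publish)
flag, something like:

$ docker run -p 80:8888 superlists  # publish container port 8888 as host port 80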
Let’s push that up with ansible-playbook:
$ ansible-playbook --user=elspeth -i staging.ottg.co.uk, \
    infra/deploy-playbook.yaml -v
[...]
And now give the FTs another go:
$ TEST_SERVER=staging.ottg.co.uk python src/manage.py test functional_tests
[...]
selenium.common.exceptions.NoSuchElementException: Message: Unable to locate
element: [id="id_list_table"]; [...]
[...]
Ran 3 tests in 21.047s

FAILED (errors=3)
So, 3/3 failed again, but the FTs did get a little further along. If you watched the browser as the tests ran, or if you go and visit the site manually, you'll see that the home page loads fine, but as soon as we try to create a new list item, it crashes with a 500 error.
Mounting the Database on the Server and Running Migrations
Let’s do another bit of manual debugging,
and take a look at the logs from our container with docker logs.
You’ll see an OperationalError:
elspeth@server$ docker logs superlists
[...]
django.db.utils.OperationalError: no such table: lists_list
It looks like our database isn’t initialised. Aha! Another of those deployment "danger areas".
Just like we did on our own machine,
we need to mount the db.sqlite3 file from the filesystem outside the container.
We'll also want to run migrations to create the database;
in fact, we'll run them on every deploy,
so that any updates to the database schema
get applied to the database on the server.
Here’s the plan:
- On the host machine, we'll store the database in elspeth's home folder; it's as good a place as any.
- We'll set its UID to 1234, just like we did in [chapter_10_production_readiness], to match the UID of the nonroot user inside the container.
- Inside the container, we'll use the path /home/nonroot/db.sqlite3—again, just like in the last chapter.
- We'll run the migrations with a docker exec, or the Ansible equivalent thereof.
Here’s what that looks like:
- name: Ensure db.sqlite3 file exists outside container
  ansible.builtin.file:
    path: "{{ ansible_env.HOME }}/db.sqlite3"  (1)
    state: touch  (2)
    owner: 1234  # so nonroot user can access it in container
  become: true  # needed for ownership change

- name: Run container
  community.docker.docker_container:
    name: superlists
    image: superlists
    state: started
    recreate: true
    env:
      DJANGO_DEBUG_FALSE: "1"
      DJANGO_SECRET_KEY: "{{ secret_key.content | b64decode }}"
      DJANGO_ALLOWED_HOST: "{{ inventory_hostname }}"
      DJANGO_DB_PATH: "/home/nonroot/db.sqlite3"
    mounts:  (3)
      - type: bind
        source: "{{ ansible_env.HOME }}/db.sqlite3"  (1)
        target: /home/nonroot/db.sqlite3
    ports: 80:8888

- name: Run migration inside container
  community.docker.docker_container_exec:  (4)
    container: superlists
    command: ./manage.py migrate
| 1 | ansible_env gives us access to the environment variables on the server,
including HOME, which is the path to the home folder (/home/elspeth/ in my case). |
| 2 | We use file with state=touch to make sure a placeholder file exists
before we try and mount it in. |
| 3 | Here is the mounts config, which works a lot like the --mount flag to
docker run. |
| 4 | And we use the community.docker.docker_container_exec module
to give us the functionality of docker exec,
so we can run the migration command inside the container
(the manual equivalent is just below). |
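For reference, the manual equivalent of that migration task, run on the
server, would be the same docker exec we used earlier for env:

elspeth@server$ docker exec superlists ./manage.py migrate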
Let’s give that playbook a run and…
$ ansible-playbook --user=elspeth -i staging.ottg.co.uk, infra/deploy-playbook.yaml -v
[...]
TASK [Run migration inside container] *****************************************
changed: [staging.ottg.co.uk] => {"changed": true, "rc": 0, "stderr": "",
"stderr_lines": [], "stdout": "Operations to perform:\n Apply all migrations:
auth, contenttypes, lists, sessions\nRunning migrations:\n Applying
contenttypes.0001_initial... OK\n Applying
contenttypes.0002_remove_content_type_name... OK\n Applying
auth.0001_initial... OK\n Applying
auth.0002_alter_permission_name_max_length... OK\n Applying
[...]
PLAY RECAP ********************************************************************
staging.ottg.co.uk : ok=9 changed=2 unreachable=0 failed=0
skipped=0 rescued=0 ignored=0
It Workssss
Try the tests…
$ TEST_SERVER=staging.ottg.co.uk python src/manage.py test functional_tests
Found 3 test(s).
[...]
...
 ---------------------------------------------------------------------
Ran 3 tests in 13.537s

OK
Hooray!
All the tests pass! That gives us confidence that our automated deploy script can reproduce a fully working app, on a server, hosted on the public internet.
That’s worthy of a commit:
$ git diff  # should show our changes in deploy-playbook.yaml
$ git commit -am"Save secret key, set env vars, mount db, run migrations. It works :)"
Deploying to Prod
Now that we are confident in our deploy script, let’s try using it for our live site!
The main change is to the -i flag, where we pass in the production
domain name, instead of the staging one:
$ ansible-playbook --user=elspeth -i www.ottg.co.uk, infra/deploy-playbook.yaml -vv
[...]
Brrp brrp brpp. Looking good? Go take a click around your live site!
Git Tag the Release
One final bit of admin. To preserve a historical marker, we’ll use Git tags to mark the state of the codebase that reflects what’s currently live on the server:
$ git tag LIVE
$ export TAG=$(date +DEPLOYED-%F/%H%M)  # this generates a timestamp
$ echo $TAG  # should show "DEPLOYED-" and then the timestamp
$ git tag $TAG
$ git push origin LIVE $TAG  # pushes the tags up to GitHub
Now it’s easy, at any time, to check what the difference is between our current codebase and what’s live on the servers. This will come in handy in a few chapters, when we look at database migrations. Have a look at the tag in the history:
$ git log --graph --oneline --decorate
* 1d4d814 (HEAD -> main) Save secret key, set env vars, mount db, run
migrations. It works :)
* 95e0fe0 Build our image, use export/import to get it on the server, try and
run it
* 5a36957 Made a start on an ansible playbook for deployment
[...]
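And when you later want to see exactly what's changed since the last deploy,
you can compare against the tag; for instance:

$ git diff LIVE  # what's changed since the last deploy?
$ git log LIVE..main --oneline  # commits since the last deploy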
| Once again, this use of Git tags isn’t meant to be the one true way. We just need some sort of way to keep track of what was deployed when. |
Tell Everyone!
You now have a live website! Tell all your friends! Tell your mum, if no one else is interested! Or, tell me! I’m always delighted to see a new reader’s site: [email protected]!
Congratulations again for getting through this block of deployment chapters; I know they can be challenging. I hope you got something out of them—seeing a practical example of how to take these kinds of complex changes and break them down into small, incremental steps, getting frequent feedback from our tests and manual investigations along the way.
| Our next deploy won’t be until [chapter_18_second_deploy], so you can switch off your servers until then if you want to. If you’re using a platform where you only get one month of free hosting, it might run out by then. You might have to shell out a few bucks, or see if there’s some way of getting another free month. |
In the next chapter, it’s back to coding again.
Further Reading
There’s no such thing as the one true way in deployment; I’ve tried to set you off on a reasonably sane path, but there are plenty of things you could do differently—and lots, lots more to learn besides. Here are some resources I used for inspiration, (including a couple I’ve already mentioned):
- The original Twelve-Factor App manifesto from the Heroku team
- The official Django docs' Deployment Checklist
- "How to Write Deployment-friendly Applications" by Hynek Schlawack
- The deployment chapter of Two Scoops of Django by Daniel and Audrey Roy Greenfield
- The PythonSpeed "Docker packaging for Python developers" guide