Making Our App Production-Ready
Our container is working fine but it’s not production-ready. Let’s try to get it there, using the tests to keep us safe.
In a way we’re applying the Red-Green-Refactor cycle to our productionisation process. Our hacky container config got us to Green, and now we’re going to Refactor, working incrementally (just as we would while coding), trying to move from working state to working state, and using the FTs to detect any regressions.
What We Need to Do
What’s wrong with our hacky container image? A few things: first, we need to host our app on the "normal" port 80 so that people can access it using a regular URL.
Perhaps more importantly, we shouldn’t use the Django dev server for production; it’s not designed for real-life workloads. Instead, we’ll use the popular Gunicorn Python WSGI HTTP server.
Django’s runserver is built and optimised for local development and debugging. It’s designed to handle one user at a time, and it handles automatic reloading upon saving of the source code, but it isn’t optimised for performance, nor has it been hardened against security vulnerabilities.
In addition, several options in settings.py are currently unacceptable. DEBUG=True is strongly discouraged for production, we’ll want to set a unique SECRET_KEY, and, as we’ll see, other things will come up.
DEBUG=True is considered a security risk, because the Django debug page will display sensitive information like the values of variables, and most of the settings in settings.py.
Let’s go through and see if we can fix things one by one.
Switching to Gunicorn
Do you know why the Django mascot is a pony? The story is that Django comes with so many things you want: an ORM, all sorts of middleware, the admin site… "What else do you want, a pony?" Well, Gunicorn stands for "Green Unicorn", which I guess is what you’d want next if you already had a pony…
We’ll need to first install Gunicorn into our container, and then use it instead of runserver:
$ python -m pip install gunicorn
Collecting gunicorn
[...]
Successfully installed gunicorn-2[...]
Gunicorn will need to know a path to a "WSGI server"[1], which is usually a function called application. Django provides one in superlists/wsgi.py.
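To make the idea of an application callable concrete, here is a minimal sketch of the kind of function Gunicorn imports, written with only the standard library. It is an illustration rather than what Django generates: Django’s real superlists/wsgi.py builds its callable for you, and the greeting text and the call_app helper are made up for this example.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A tiny WSGI callable: takes the request environ, returns body chunks."""
    body = b"Hello from a WSGI app\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app, path="/"):
    """Hypothetical helper: invoke a WSGI callable the way a server would."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal, valid WSGI environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(response_headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body
```

Saved as a module, say hello.py (a hypothetical name), this could be served with gunicorn hello:application, which is the same module:callable form as superlists.wsgi:application.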
Let’s change the command our image runs:
[...]
RUN pip install "django<6" gunicorn (1)
COPY src /src
WORKDIR /src
CMD gunicorn --bind :8888 superlists.wsgi:application (2)
(1) Installation is a standard pip install.
(2) Gunicorn has its own command line, gunicorn. It needs to know a path to a WSGI server, which is usually a function called application. Django provides one in superlists/wsgi.py.
As in the previous chapter, we can use the docker build && docker run pattern to try out our changes by rebuilding and rerunning our container:
$ docker build -t superlists . && docker run \
    -p 8888:8888 \
    --mount type=bind,source=./src/db.sqlite3,target=/src/db.sqlite3 \
    -it superlists
The FTs catch a problem with static files
As we run the functional tests, you’ll see them warning us of a problem, once again. The test for adding list items passes happily, but the test for layout + styling fails. Good job, tests!
$ TEST_SERVER=localhost:8888 python src/manage.py test functional_tests --failfast
[...]
AssertionError: 102.5 != 512 within 10 delta (409.5 difference)
FAILED (failures=1)
And indeed, if you take a look at the site, you’ll find the CSS is all broken, as in Broken CSS.
The reason that we have no CSS is that although the Django dev server will serve static files magically for you, Gunicorn doesn’t.
One step forward, one step backward, but once again we’ve identified the problem nice and early. Moving on!
Serving Static Files with Whitenoise
Serving static files is very different from serving dynamically rendered content from Python and Django. There are many ways to serve them in production: you can use a web server like Nginx, or a hosted storage service like Amazon S3 (often with a CDN in front of it), but in our case, the most straightforward thing to do is to use Whitenoise, a Python library expressly designed for serving static[2] files from Python.
First we install Whitenoise into our local environment:
pip install whitenoise
Then we tell Django to enable it, in settings.py:
MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",
    "whitenoise.middleware.WhiteNoiseMiddleware",
    "django.contrib.sessions.middleware.SessionMiddleware",
    [...]
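The position in the list matters: the Whitenoise documentation asks for its middleware to sit directly after Django’s SecurityMiddleware and before everything else. A small sanity check along these lines could catch a misplaced entry. The function name is invented for this sketch, and the shortened MIDDLEWARE list is only an example.

```python
SECURITY = "django.middleware.security.SecurityMiddleware"
WHITENOISE = "whitenoise.middleware.WhiteNoiseMiddleware"


def whitenoise_is_well_placed(middleware):
    """Return True if Whitenoise comes directly after SecurityMiddleware.

    If SecurityMiddleware is absent, Whitenoise is expected to be first.
    """
    if WHITENOISE not in middleware:
        return False
    index = middleware.index(WHITENOISE)
    if SECURITY in middleware:
        return index == middleware.index(SECURITY) + 1
    return index == 0


# Shortened example list; a real settings.py has more entries after these.
MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",
    "whitenoise.middleware.WhiteNoiseMiddleware",
    "django.contrib.sessions.middleware.SessionMiddleware",
]
```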
And then we need to add it to our pip installs in the Dockerfile:
RUN pip install "django<6" gunicorn whitenoise
This manual list of pip installs is getting a little fiddly! We’ll come back to that in a moment. First let’s rebuild and rerun our container:
$ docker build -t superlists . && docker run \
    -p 8888:8888 \
    --mount type=bind,source=./src/db.sqlite3,target=/src/db.sqlite3 \
    -it superlists
And if you take another manual look at your site, things should look much healthier. Let’s rerun our FTs to confirm:
$ TEST_SERVER=localhost:8888 python src/manage.py test functional_tests --failfast
[...]
...
 ---------------------------------------------------------------------
Ran 3 tests in 10.718s
OK
Phew. Let’s commit that:
$ git commit -am "Switch to Gunicorn and Whitenoise"
Using requirements.txt
Let’s deal with that fiddly list of pip installs.
To reproduce our local virtualenv, rather than just manually pip installing things one by one, and having to remember to sync things between local dev and docker, we can "save" the list of packages we’re using by creating a requirements.txt file.[3]
The pip freeze command will show us everything that’s installed in our virtualenv at the moment:
$ pip freeze
asgiref==3.8.1
attrs==23.2.0
certifi==2024.2.2
django==5.1.1
gunicorn==21.2.0
h11==0.14.0
idna==3.6
outcome==1.3.0.post0
packaging==24.0
pysocks==1.7.1
selenium==4.18.1
sniffio==1.3.1
sortedcontainers==2.4.0
sqlparse==0.4.4
trio==0.25.0
trio-websocket==0.11.1
typing-extensions==4.10.0
urllib3==2.2.1
whitenoise==6.6.0
wsproto==1.2.0
That shows all the packages in our virtualenv, along with their version numbers. Let’s pull out just the "top-level" dependencies, Django, Gunicorn and Whitenoise:
$ pip freeze | grep -i django
Django==5.1.[...]
$ pip freeze | grep -i django >> requirements.txt
$ pip freeze | grep -i gunicorn >> requirements.txt
$ pip freeze | grep -i whitenoise >> requirements.txt
That should give us a requirements.txt file that looks like this:
django==5.1.1
gunicorn==21.2.0
whitenoise==6.6.0
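The same filtering can be sketched in Python, which avoids one pitfall of grep: a substring match on django would also pick up any other installed package with django in its name. This is an illustrative helper, not part of the project. The function name and the sample input are assumptions.

```python
def pick_top_level(freeze_lines, wanted):
    """Keep only the pinned requirements whose package name is in `wanted`.

    Names are compared case-insensitively, and only exact name matches count.
    """
    wanted_names = {name.lower() for name in wanted}
    picked = []
    for line in freeze_lines:
        line = line.strip()
        if "==" not in line:
            continue  # skip blank lines and anything that is not a simple pin
        name = line.split("==", 1)[0]
        if name.lower() in wanted_names:
            picked.append(line)
    return picked


# A shortened copy of the pip freeze output, used as sample input.
sample_freeze = [
    "asgiref==3.8.1",
    "django==5.1.1",
    "gunicorn==21.2.0",
    "selenium==4.18.1",
    "sqlparse==0.4.4",
    "whitenoise==6.6.0",
]
requirements = pick_top_level(sample_freeze, ["Django", "gunicorn", "whitenoise"])
```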
That’s a good first cut, let’s commit it:
$ git add requirements.txt
$ git commit -m "Add a requirements.txt with Django, gunicorn and whitenoise"
You may be wondering why we didn’t add our other dependency, Selenium, to our requirements, or why we didn’t just add all the dependencies, including the "transitive" ones (e.g., Django has its own dependencies like asgiref and sqlparse).
As always, I have to gloss over some nuance and tradeoffs, but the short answer is that, first, Selenium is only a dependency for the tests, not the application code; we’re never going to run the tests directly on our production servers. As for transitive dependencies, they’re fiddly to manage without bringing in more tools, and I didn’t want to do that for this book.[4]
Itamar Turner-Trauring has a great guide to Docker Packaging for Python Developers, which I cannot recommend highly enough. Read that before you’re too much older.
Now let’s see how we use that requirements file in our Dockerfile:
FROM python:slim
RUN python -m venv /venv
ENV PATH="/venv/bin:$PATH"
COPY requirements.txt requirements.txt (1)
RUN pip install -r requirements.txt (2)
COPY src /src
WORKDIR /src
CMD gunicorn --bind :8888 superlists.wsgi:application
(1) We COPY our requirements file in, just like the src folder.
(2) Now instead of just installing Django, we install all our dependencies by pointing pip at the requirements.txt using the -r flag. Notice the -r.
Forgetting the -r and running pip install requirements.txt is such a common error that I recommend you do it right now and get familiar with the error message (which is thankfully much more helpful than it used to be). It’s a mistake I still make, all the time.
Let’s build & run:
$ docker build -t superlists . && docker run \
    -p 8888:8888 \
    --mount type=bind,source=./src/db.sqlite3,target=/src/db.sqlite3 \
    -it superlists
And then test to check everything still works:
$ TEST_SERVER=localhost:8888 python src/manage.py test functional_tests --failfast
[...]
OK
Using Environment Variables to Adjust Settings for Production
We know there are several things in settings.py that we want to change for production:
- DEBUG mode is all very well for hacking about on your own server, but it isn’t secure. For example, exposing raw tracebacks to the world is a bad idea.
- SECRET_KEY is used by Django for some of its crypto, for things like cookies and CSRF protection. It’s good practice to make sure the secret key in production is different from the one in your source code repo, because that code might be visible to strangers. We’ll want to generate a new, random one but then keep it the same for the foreseeable future (find out more in the Django docs).
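One way to generate such a key is with the standard library’s secrets module, as in this minimal sketch. The length of 50 bytes is an arbitrary choice for the example and the function name is made up. Django also ships its own helper, get_random_secret_key in django.core.management.utils, if you would rather use that.

```python
import secrets


def generate_secret_key(num_bytes=50):
    """Return a random, URL-safe string suitable for use as a secret key."""
    return secrets.token_urlsafe(num_bytes)


# Generate once, store it somewhere outside the repo, and reuse it.
key = generate_secret_key()
```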
Development, staging and production sites always have some differences in their configuration. Environment variables are a good place to store those different settings. See "The 12-Factor App".[5]
Setting DEBUG=True and SECRET_KEY
There are lots of ways you might do this.
Here’s what I propose; it may seem a little fiddly, but I’ll provide a little justification for each choice. Let them be an inspiration (but not a template) for your own choices!
Note that this if statement replaces the DEBUG and SECRET_KEY lines that are included by default in the settings.py file:
import os
[...]
if "DJANGO_DEBUG_FALSE" in os.environ: (1)
    DEBUG = False
    SECRET_KEY = os.environ["DJANGO_SECRET_KEY"] (2)
else:
    DEBUG = True (3)
    SECRET_KEY = "insecure-key-for-dev"
(1) We say we’ll use an environment variable called DJANGO_DEBUG_FALSE to switch debug mode off, and in effect require production settings (it doesn’t matter what we set it to, just that it’s there).
(2) And now we say that, if debug mode is off, we require the SECRET_KEY to be set by a second environment variable.
(3) Otherwise we fall back to the insecure, debug mode settings that are useful for dev.
The end result is that you don’t need to set any env vars for dev, but production needs both to be set explicitly, and it will error if any are missing. I think this gives us a little bit of protection against accidentally forgetting to set one.
Better to fail hard than allow a typo in an environment variable name to leave you running with insecure settings.
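The fail-hard behaviour is easy to check in isolation. The sketch below pulls the same logic into a plain function that takes any mapping instead of reading os.environ directly. The function name and the returned dictionary are inventions for this example, not how settings.py itself is structured.

```python
def load_security_settings(environ):
    """Mirror the settings.py logic: production values or dev fallbacks."""
    if "DJANGO_DEBUG_FALSE" in environ:
        # Production: a missing secret key raises KeyError instead of
        # silently falling back to an insecure default.
        return {"DEBUG": False, "SECRET_KEY": environ["DJANGO_SECRET_KEY"]}
    return {"DEBUG": True, "SECRET_KEY": "insecure-key-for-dev"}


# No env vars at all: dev defaults.
dev = load_security_settings({})

# Both env vars set: production settings.
prod = load_security_settings(
    {"DJANGO_DEBUG_FALSE": "1", "DJANGO_SECRET_KEY": "sekrit"}
)

# Debug switched off but no secret key: we expect a hard failure.
try:
    load_security_settings({"DJANGO_DEBUG_FALSE": "1"})
    failed_hard = False
except KeyError:
    failed_hard = True
```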
Setting environment variables inside the Dockerfile
Now let’s set that environment variable in our Dockerfile using the ENV directive:
WORKDIR /src
ENV DJANGO_DEBUG_FALSE=1
CMD gunicorn --bind :8888 superlists.wsgi:application
And try it out…
$ docker build -t superlists . && docker run \
    -p 8888:8888 \
    --mount type=bind,source=./src/db.sqlite3,target=/src/db.sqlite3 \
    -it superlists
[...]
  File "/src/superlists/settings.py", line 22, in <module>
    SECRET_KEY = os.environ["DJANGO_SECRET_KEY"]
                 ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
[...]
KeyError: 'DJANGO_SECRET_KEY'
Oops. I forgot to set said secret key env var, mere seconds after having dreamt it up!
Setting Environment Variables at the Docker Command Line
We’ve said we can’t keep the secret key in our source code, so the Dockerfile isn’t an option; where else can we put it?
For now, we can set it at the command line using the -e flag for docker run:
$ docker build -t superlists . && docker run \
    -p 8888:8888 \
    --mount type=bind,source=./src/db.sqlite3,target=/src/db.sqlite3 \
    -e DJANGO_SECRET_KEY=sekrit \
    -it superlists
With that running, we can use our FT again to see if we’re back to a working state.
$ TEST_SERVER=localhost:8888 python src/manage.py test functional_tests --failfast
[...]
AssertionError: 'To-Do' not found in 'Bad Request (400)'
ALLOWED_HOSTS is Required When Debug Mode is Turned Off
It’s not quite working yet! Let’s take a look manually, as in An ugly 400 error.
We’ve set our two environment variables but doing so seems to have broken things. But once again, by running our FTs frequently, we’re able to identify the problem early, before we’ve changed too many things at the same time. We’ve only changed two settings—which one might be at fault?
Let’s use the "Googling the error message" technique again, with the search terms "django debug false" and "400 bad request".
Well, the very first link in my search results was Stack Overflow suggesting that a 400 error is usually to do with ALLOWED_HOSTS, and the second was the official Django docs, which takes a bit more scrolling, but confirms it (see Search results for "django debug false 400 bad request").
ALLOWED_HOSTS is a security setting designed to reject requests that are likely to be forged, broken or malicious because they don’t appear to be asking for your site (HTTP requests contain the address they were intended for in a header called "Host"). By default, when DEBUG=True, ALLOWED_HOSTS effectively allows localhost, our own machine, so that’s why it was working OK until now. There’s more information in the Django docs.
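As a rough illustration of the idea, here is a simplified version of that check. It is not Django’s implementation: it ignores ports and the extra localhost allowance in debug mode, and the function name is invented. It only models the matching rules described in the Django docs, where an entry starting with a dot matches a domain and its subdomains, and "*" matches anything.

```python
def host_is_allowed(host, allowed_hosts):
    """Simplified check of a Host header value against an allow-list."""
    host = host.lower()
    for pattern in allowed_hosts:
        pattern = pattern.lower()
        if pattern == "*":
            return True
        if pattern.startswith("."):
            # ".example.com" matches example.com and any subdomain of it
            if host == pattern[1:] or host.endswith(pattern):
                return True
        elif host == pattern:
            return True
    return False
```

With an empty list nothing matches, which corresponds to the 400 response we got once debug mode was off.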
The upshot is that we need to adjust ALLOWED_HOSTS in settings.py. Let’s use another environment variable for that:
if "DJANGO_DEBUG_FALSE" in os.environ:
    DEBUG = False
    SECRET_KEY = os.environ["DJANGO_SECRET_KEY"]
    ALLOWED_HOSTS = [os.environ["DJANGO_ALLOWED_HOST"]]
else:
    DEBUG = True
    SECRET_KEY = "insecure-key-for-dev"
    ALLOWED_HOSTS = []
This is a setting that we want to change, depending on whether our Docker image is running locally, or on a server, so we’ll use the -e flag again:
$ docker build -t superlists . && docker run \
    -p 8888:8888 \
    --mount type=bind,source=./src/db.sqlite3,target=/src/db.sqlite3 \
    -e DJANGO_SECRET_KEY=sekrit \
    -e DJANGO_ALLOWED_HOST=localhost \
    -it superlists
Collectstatic is Required when Debug is Turned Off
An FT run (or just looking at the site) reveals that we’ve had a regression in our static files:
$ TEST_SERVER=localhost:8888 python src/manage.py test functional_tests --failfast
[...]
AssertionError: 102.5 != 512 within 10 delta (409.5 difference)
FAILED (failures=1)
We saw this before when switching from the Django dev server to Gunicorn, so we introduced Whitenoise. Similarly, when we switch DEBUG off, Whitenoise stops automagically finding static files in our code, and instead we need to run collectstatic:
WORKDIR /src
RUN python manage.py collectstatic
ENV DJANGO_DEBUG_FALSE=1
CMD gunicorn --bind :8888 superlists.wsgi:application
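Conceptually, collectstatic copies the static files from wherever they live in the source tree into a single folder that can be served directly. The sketch below imitates that with the standard library only. It is a toy model rather than Django’s implementation, and the directory names, file contents and function name are assumptions made for the example.

```python
import shutil
import tempfile
from pathlib import Path


def collect_static(source_dirs, static_root):
    """Copy every file under each source dir into static_root.

    Relative paths are preserved; the first file found for a path wins.
    Returns the sorted relative paths that were copied.
    """
    static_root = Path(static_root)
    copied = []
    for source in map(Path, source_dirs):
        for path in sorted(source.rglob("*")):
            if not path.is_file():
                continue
            relative = path.relative_to(source)
            target = static_root / relative
            if target.exists():
                continue  # an earlier source already provided this file
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)
            copied.append(relative.as_posix())
    return sorted(copied)


# Build a throwaway app folder with one CSS file, then collect it.
with tempfile.TemporaryDirectory() as tmp:
    app_static = Path(tmp) / "lists" / "static"
    css_dir = app_static / "bootstrap" / "css"
    css_dir.mkdir(parents=True)
    (css_dir / "bootstrap.min.css").write_text("body{}")

    static_root = Path(tmp) / "static"
    collected = collect_static([app_static], static_root)
    exists = (static_root / "bootstrap" / "css" / "bootstrap.min.css").is_file()
```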
Well, it was fiddly, but that should get us to passing tests after we build & run the docker container!
$ docker build -t superlists . && docker run \
    -p 8888:8888 \
    --mount type=bind,source=./src/db.sqlite3,target=/src/db.sqlite3 \
    -e DJANGO_SECRET_KEY=sekrit \
    -e DJANGO_ALLOWED_HOST=localhost \
    -it superlists
and…
$ TEST_SERVER=localhost:8888 python src/manage.py test functional_tests --failfast
[...]
OK
We’re nearly ready to ship to production!
Let’s quickly adjust our gitignore, since the static folder is in a new place:
$ git status  # should show dockerfile and untracked src/static folder
$ echo src/static >> .gitignore
$ git status  # should now be clean
$ git commit -am "Add collectstatic to dockerfile, and new location to gitignore"
Switching to a nonroot user
TODO: apologies, WIP, this is definitely a good idea for security, needs writing up.
Dockerfile should gain some lines a bit like this:
RUN addgroup --system nonroot && adduser --system --no-create-home --disabled-password --group nonroot
USER nonroot
Configuring logging
One last thing we’ll want to do is make sure that we can get logs out of our server. If things go wrong, we want to be able to get to the tracebacks, and as we’ll soon see, switching DEBUG off means that Django’s default logging configuration changes.
Provoking a deliberate error
To test this, we’ll provoke a deliberate error by deleting the database file.
$ rm src/db.sqlite3
$ touch src/db.sqlite3  # otherwise the --mount type=bind will complain
Now if you run the tests, you’ll see they fail:
$ TEST_SERVER=localhost:8888 python src/manage.py test functional_tests --failfast
[...]
selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: [id="id_list_table"]; [...]
And you might spot in the browser that we just see a minimal error page, with no debug info (try it manually if you like).
But if you look in your docker terminal, you’ll see there is no traceback:
[2024-02-28 10:41:53 +0000] [7] [INFO] Starting gunicorn 21.2.0
[2024-02-28 10:41:53 +0000] [7] [INFO] Listening at: http://0.0.0.0:8888 (7)
[2024-02-28 10:41:53 +0000] [7] [INFO] Using worker: sync
[2024-02-28 10:41:53 +0000] [8] [INFO] Booting worker with pid: 8
Where have the tracebacks gone? You might have been expecting that the Django debug page and its tracebacks would disappear from our web browser, but it’s more of a shock to see that they are no longer appearing in the terminal either! If you’re like me, you might find yourself wondering if we really did see them earlier, and starting to doubt your own sanity. But the explanation is that Django’s default logging configuration changes when DEBUG is turned off.
This means we need to interact with the standard library’s logging module, unfortunately one of the most fiddly parts of the Python standard library[6].
Here’s pretty much the simplest possible logging config, which just prints everything to the console (strictly speaking standard error, which is where StreamHandler writes by default). I’ve added this code to the very end of the settings.py file.
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "console": {"class": "logging.StreamHandler"},
    },
    "loggers": {
        "root": {"handlers": ["console"], "level": "INFO"},
    },
}
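To try this config outside Django, the sketch below feeds the same dictionary to logging.config.dictConfig and logs an exception. The logger name django.request is the one Django uses for failed requests, but here it is just a name: nothing in this example imports Django, and the error message and function name are made up.

```python
import logging
import logging.config

# Same dictionary as in settings.py above.
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "console": {"class": "logging.StreamHandler"},
    },
    "loggers": {
        "root": {"handlers": ["console"], "level": "INFO"},
    },
}


def log_deliberate_error():
    """Apply the config, then log an exception with its traceback."""
    logging.config.dictConfig(LOGGING)
    logger = logging.getLogger("django.request")
    try:
        raise RuntimeError("deliberate error for the demo")
    except RuntimeError:
        # logger.exception logs at ERROR level and appends the traceback
        logger.exception("Internal Server Error: /lists/new")


log_deliberate_error()

# Look up the logger we configured so its handlers can be inspected.
configured = logging.getLogger("root")
```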
Rebuild and restart our container…
$ docker build -t superlists . && docker run \
    -p 8888:8888 \
    --mount type=bind,source=./src/db.sqlite3,target=/src/db.sqlite3 \
    -e DJANGO_SECRET_KEY=sekrit \
    -e DJANGO_ALLOWED_HOST=localhost \
    -it superlists
Then try the FT again (or submitting a new list item manually) and we now should see a clear error message:
Internal Server Error: /lists/new
Traceback (most recent call last):
[...]
  File "/src/lists/views.py", line 10, in new_list
    nulist = List.objects.create()
             ^^^^^^^^^^^^^^^^^^^^^
[...]
  File "/venv/lib/python3.12/site-packages/django/db/backends/sqlite3/base.py", line 328, in execute
    return super().execute(query, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
django.db.utils.OperationalError: no such table: lists_list
Re-create the database with ./src/manage.py migrate and we’ll be back to a working state.
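The "no such table" error can be reproduced without Django. As this standard-library sketch suggests, an empty file is a valid but table-less SQLite database, so the first query against it fails. The temporary path and the function name are assumptions for the example.

```python
import os
import sqlite3
import tempfile


def query_empty_database():
    """Query a freshly 'touched' SQLite file and return the error message."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "db.sqlite3")
        open(path, "w").close()  # same effect as `touch`: a zero-byte file
        connection = sqlite3.connect(path)
        try:
            connection.execute("SELECT * FROM lists_list")
        except sqlite3.OperationalError as error:
            return str(error)
        finally:
            connection.close()
    return None


error_message = query_empty_database()
```

Running migrations is what creates the missing tables, which is why migrate gets us back to a working state.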
Don’t forget to commit our changes to settings.py and Dockerfile, and I think we can call it job done! We’ve at least touched on many or most of the things you might need to think about when considering production-readiness, we’ve worked in small steps and used our tests all the way along, and we’re now ready to deploy our container to a real server!
Find out how, in our next exciting instalment…