Splitting Our Tests into Multiple Files, and a Generic Wait Helper

🚧 Warning, chapter update in progress

This chapter is currently being rewritten for the third edition.

The code listings should all be valid, and work with Python 3.12 and Django 4, but I haven’t reviewed the chapter text in detail yet.

Back to local development! The next feature we might like to implement is a little input validation. But as we start writing new tests, we’ll notice that it’s getting hard to find our way around a single functional_tests.py, and tests.py, so we’ll reorganise them into multiple files—a little refactor of our tests, if you will.

We’ll also build a generic explicit wait helper.

Start on a Validation FT: Preventing Blank Items

As our first few users start using the site, we’ve noticed they sometimes make mistakes that mess up their lists, like accidentally submitting blank list items, or inputting two identical items to a list. Computers are meant to help stop us from making silly mistakes, so let’s see if we can get our site to help.

Here’s the outline of an FT:

src/functional_tests/tests.py (ch11l001)

def test_cannot_add_empty_list_items(self):
    # Edith goes to the home page and accidentally tries to submit
    # an empty list item. She hits Enter on the empty input box

    # The home page refreshes, and there is an error message saying
    # that list items cannot be blank

    # She tries again with some text for the item, which now works

    # Perversely, she now decides to submit a second blank list item

    # She receives a similar warning on the list page

    # And she can correct it by filling some text in
    self.fail("write me!")

That’s all very well, but before we go any further, our functional tests file is beginning to get a little crowded. Let’s split it out into several files, each holding a single test class.

Remember that functional tests are closely linked to "user stories" and features. One way of organising your FTs might be to have one per high-level feature.

We’ll also have one base test class which they can all inherit from. Here’s how to get there step by step.

Skipping a Test

We’re back to local development now. Double check that the TEST_SERVER environment variable is unset in your terminal.

It’s always nice, when doing refactoring, to have a fully passing test suite. We’ve just written a test with a deliberate failure. Let’s temporarily switch it off, using a decorator called "skip" from unittest:

src/functional_tests/tests.py (ch11l001-1)

from unittest import skip
[...]

    @skip
    def test_cannot_add_empty_list_items(self):

This tells the test runner to ignore this test. You can see it works—if we rerun the tests, you’ll see it’s a pass, but it explicitly mentions the skipped test:

$ python src/manage.py test functional_tests
[...]
Ran 4 tests in 11.577s
OK (skipped=1)

Skips are dangerous—you need to remember to remove them before you commit your changes back to the repo. This is why line-by-line reviews of each of your diffs are a good idea!
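One way to make a forgotten skip easier to spot is to give it a reason string, which the test runner reports alongside the skipped test. Here’s a minimal sketch using plain unittest rather than our Django FTs; the class name, the second test, and the `run_sketch` helper are hypothetical and exist only for illustration:

```python
import unittest


class ItemValidationSketch(unittest.TestCase):
    # Passing a reason string makes the skip self-documenting in test output.
    @unittest.skip("FT not implemented yet; remove before committing")
    def test_cannot_add_empty_list_items(self):
        self.fail("write me!")

    def test_something_that_already_works(self):
        self.assertEqual(1 + 1, 2)


def run_sketch():
    """Run the sketch class and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ItemValidationSketch)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    result = run_sketch()
    # Each entry in result.skipped is a (test, reason) pair.
    for test, reason in result.skipped:
        print(f"skipped {test.id()}: {reason}")
    print("successful:", result.wasSuccessful())
```

The skipped test never executes its body, so the deliberate `self.fail` doesn’t count against us, and the run is still reported as successful.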

Don’t Forget the "Refactor" in "Red, Green, Refactor"

A criticism that’s sometimes levelled at TDD is that it leads to badly architected code, as the developer just focuses on getting tests to pass rather than stopping to think about how the whole system should be designed. I think it’s slightly unfair.

TDD is no silver bullet. You still have to spend time thinking about good design. But what often happens is that people forget the "Refactor" in "Red, Green, Refactor". The methodology allows you to throw together any old code to get your tests to pass, but it also asks you to then spend some time refactoring it to improve its design. Otherwise, it’s too easy to allow "technical debt" to build up.

Often, however, the best ideas for how to refactor code don’t occur to you straight away. They may occur to you days, weeks, even months after you wrote a piece of code, when you’re working on something totally unrelated and you happen to see some old code again with fresh eyes. But if you’re halfway through something else, should you stop to refactor the old code?

The answer is that it depends. In the case at the beginning of the chapter, we haven’t even started writing our new code. We know we are in a working state, so we can justify putting a skip on our new FT (to get back to fully passing tests) and do a bit of refactoring straight away.

Later in the chapter we’ll spot other bits of code we want to alter. In those cases, rather than taking the risk of refactoring an application that’s not in a working state, we’ll make a note of the thing we want to change on our scratchpad and wait until we’re back to a fully passing test suite before refactoring.

Kent Beck has a book-length exploration of the tradeoffs of refactor-now vs refactor-later, called Tidy First?

Splitting Functional Tests Out into Many Files

We start putting each test into its own class, still in the same file:

src/functional_tests/tests.py (ch11l002)

class FunctionalTest(StaticLiveServerTestCase):
    def setUp(self):
        [...]
    def tearDown(self):
        [...]
    def wait_for_row_in_list_table(self, row_text):
        [...]


class NewVisitorTest(FunctionalTest):
    def test_can_start_a_todo_list(self):
        [...]
    def test_multiple_users_can_start_lists_at_different_urls(self):
        [...]


class LayoutAndStylingTest(FunctionalTest):
    def test_layout_and_styling(self):
        [...]


class ItemValidationTest(FunctionalTest):
    @skip
    def test_cannot_add_empty_list_items(self):
        [...]

At this point we can rerun the FTs and see they all still work:

Ran 4 tests in 11.577s

OK (skipped=1)

That’s labouring it a little bit, and we could probably get away with doing this stuff in fewer steps, but, as I keep saying, practising the step-by-step method on the easy cases makes it that much easier when we have a complex case.

Now we switch from a single tests file to using one for each class, and one "base" file to contain the base class all the tests will inherit from. We’ll make four copies of tests.py, naming them appropriately, and then delete the parts we don’t need from each:

$ git mv src/functional_tests/tests.py src/functional_tests/base.py
$ cp src/functional_tests/base.py src/functional_tests/test_simple_list_creation.py
$ cp src/functional_tests/base.py src/functional_tests/test_layout_and_styling.py
$ cp src/functional_tests/base.py src/functional_tests/test_list_item_validation.py

base.py can be cut down to just the FunctionalTest class. We leave the helper method on the base class, because we suspect we’re about to reuse it in our new FT:

src/functional_tests/base.py (ch11l003)

import os
import time
from django.contrib.staticfiles.testing import StaticLiveServerTestCase
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.common.exceptions import WebDriverException

MAX_WAIT = 5


class FunctionalTest(StaticLiveServerTestCase):
    def setUp(self):
        [...]
    def tearDown(self):
        [...]
    def wait_for_row_in_list_table(self, row_text):
        [...]

Keeping helper methods in a base FunctionalTest class is one useful way of preventing duplication in FTs. Later in the book (in [chapter_page_pattern]) we’ll use the "Page pattern", which is related, but prefers composition over inheritance—always a good thing.
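If you’re wondering whether a base class full of helpers adds extra tests to the run, it doesn’t, as long as it has no test_ methods of its own. The following is a rough sketch with plain unittest.TestCase standing in for StaticLiveServerTestCase; all the class names, the `assert_row_present` helper, and `count_tests` are made up for this illustration:

```python
import unittest


class BaseSketch(unittest.TestCase):
    """Hypothetical stand-in for FunctionalTest: helpers only, no test_* methods."""

    def setUp(self):
        # In the real base class this is where the browser would be started.
        self.rows = ["1: Buy milk"]

    def assert_row_present(self, row_text):
        # A shared helper, available to every subclass.
        self.assertIn(row_text, self.rows)


class FeatureOneSketch(BaseSketch):
    def test_uses_inherited_helper(self):
        self.assert_row_present("1: Buy milk")


class FeatureTwoSketch(BaseSketch):
    def test_also_uses_inherited_helper(self):
        self.rows.append("2: Make tea")
        self.assert_row_present("2: Make tea")


def count_tests(*classes):
    """Return how many test methods the loader finds across the given classes."""
    loader = unittest.defaultTestLoader
    return sum(loader.loadTestsFromTestCase(c).countTestCases() for c in classes)


if __name__ == "__main__":
    print("tests in base class:", count_tests(BaseSketch))
    print("tests in subclasses:", count_tests(FeatureOneSketch, FeatureTwoSketch))
```

Each subclass gets its own fresh setUp, so the list built in one test doesn’t leak into another.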

Our first FT is now in its own file, and should be just one class, holding our two original test methods:

src/functional_tests/test_simple_list_creation.py (ch11l004)

from .base import FunctionalTest
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys


class NewVisitorTest(FunctionalTest):
    def test_can_start_a_todo_list(self):
        [...]
    def test_multiple_users_can_start_lists_at_different_urls(self):
        [...]

I used a relative import (from .base). Some people like to use them a lot in Django code (e.g., your views might import models using from .models import List, instead of from lists.models). Ultimately this is a matter of personal preference. I prefer to use relative imports only when I’m super-super sure that the relative position of the thing I’m importing won’t change. That applies in this case because I know for sure all the tests will sit next to base.py, which they inherit from.

The layout and styling FT should now be one file and one class:

src/functional_tests/test_layout_and_styling.py (ch11l005)

from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from .base import FunctionalTest


class LayoutAndStylingTest(FunctionalTest):
        [...]

Lastly our new validation test is in a file of its own too:

src/functional_tests/test_list_item_validation.py (ch11l006)

from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from unittest import skip
from .base import FunctionalTest


class ItemValidationTest(FunctionalTest):
    @skip
    def test_cannot_add_empty_list_items(self):
        [...]

And we can test that everything worked by rerunning manage.py test functional_tests, and checking once again that all four tests are run:

Ran 4 tests in 11.577s

OK (skipped=1)

Now we can remove our skip:

src/functional_tests/test_list_item_validation.py (ch11l007)

class ItemValidationTest(FunctionalTest):
    def test_cannot_add_empty_list_items(self):
        [...]

Running a Single Test File

As a side bonus, we’re now able to run an individual test file, like this:

$ python src/manage.py test functional_tests.test_list_item_validation
[...]
AssertionError: write me!

Brilliant—no need to sit around waiting for all the FTs when we’re only interested in a single one. Although we need to remember to run all of them now and again, to check for regressions. Later in the book we’ll see how to give that task over to an automated Continuous Integration loop. For now, let’s commit!

$ git status
$ git add src/functional_tests
$ git commit -m "Moved FTs into their own individual files"

Great. We’ve split our functional tests nicely out into different files. Next we’ll start writing our FT, but before long, as you may be guessing, we’ll do something similar to our unit test files.

A New Functional Test Tool: A Generic Explicit Wait Helper

First let’s start implementing the test, or at least the beginning of it:

src/functional_tests/test_list_item_validation.py (ch11l008)

def test_cannot_add_empty_list_items(self):
    # Edith goes to the home page and accidentally tries to submit
    # an empty list item. She hits Enter on the empty input box
    self.browser.get(self.live_server_url)
    self.browser.find_element(By.ID, "id_new_item").send_keys(Keys.ENTER)

    # The home page refreshes, and there is an error message saying
    # that list items cannot be blank
    self.assertEqual(
        self.browser.find_element(By.CSS_SELECTOR, ".invalid-feedback").text,  (1)
        "You can't have an empty list item",  (2)
    )

    # She tries again with some text for the item, which now works
    self.fail("finish this test!")
    [...]

This is how we might write the test naively:

1	We specify we’re going to use a CSS class called `.invalid-feedback` to mark our error text. We’ll see that Bootstrap has some useful styling for those.
2	And we can check that our error displays the message we want.

But can you guess what the potential problem is with the test as it’s written now?

OK, I gave it away in the section header, but whenever we do something that causes a page refresh, we need an explicit wait; otherwise, Selenium might go looking for the .invalid-feedback element before the page has had a chance to load.

Whenever you submit a form with Keys.ENTER or click something that is going to cause a page to load, you probably want an explicit wait for your next assertion.

Our first explicit wait was built into a helper method. For this one, we might decide that building a specific helper method is overkill at this stage, but it might be nice to have some generic way of saying, in our tests, "wait until this assertion passes". Something like this:

src/functional_tests/test_list_item_validation.py (ch11l009)

[...]
    # The home page refreshes, and there is an error message saying
    # that list items cannot be blank
    self.wait_for(
        lambda: self.assertEqual(  (1)
            self.browser.find_element(By.CSS_SELECTOR, ".invalid-feedback").text,
            "You can't have an empty list item",
        )
    )

1	Rather than calling the assertion directly, we wrap it in a lambda function, and we pass it to a new helper method we imagine called `wait_for`.

If you’ve never seen lambda functions in Python before, see Lambda Functions.

So how would this magical wait_for method work? Let’s head over to base.py, make a copy of our existing wait_for_row_in_list_table method, and we’ll adapt it slightly:

src/functional_tests/base.py (ch11l010)

    def wait_for(self, fn):  (1)
        start_time = time.time()
        while True:
            try:
                table = self.browser.find_element(By.ID, "id_list_table")  (2)
                rows = table.find_elements(By.TAG_NAME, "tr")
                self.assertIn(row_text, [row.text for row in rows])
                return
            except (AssertionError, WebDriverException):
                if time.time() - start_time > MAX_WAIT:
                    raise
                time.sleep(0.5)

1	We make a copy of the method, but we name it `wait_for`, and we change its argument. It is expecting to be passed a function.
2	For now we’ve still got the old code that’s checking table rows. How to transform this into something that works for any generic `fn` that’s been passed in?

Like this:

src/functional_tests/base.py (ch11l011)

    def wait_for(self, fn):
        start_time = time.time()
        while True:
            try:
                return fn()  (1)
            except (AssertionError, WebDriverException):
                if time.time() - start_time > MAX_WAIT:
                    raise
                time.sleep(0.5)

1	The body of our try/except, instead of being the specific code for examining table rows, just becomes a call to the function we passed in. We also `return` its result, to be able to exit the loop immediately if no exception is raised.
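To get a feel for the retry loop without firing up a browser, here’s a standalone sketch of the same idea. It is not the helper from base.py: it is a plain function rather than a method, it only catches AssertionError (there’s no Selenium here, so no WebDriverException), and `FakePage`, `check_error`, the shortened timeout, and the `poll_interval` parameter are all assumptions made for the demo:

```python
import time

MAX_WAIT = 1  # seconds; shorter than the chapter's value to keep the demo quick


def wait_for(fn, max_wait=MAX_WAIT, poll_interval=0.05):
    """Call fn repeatedly until it stops raising AssertionError, or time runs out."""
    start_time = time.time()
    while True:
        try:
            return fn()
        except AssertionError:
            if time.time() - start_time > max_wait:
                raise
            time.sleep(poll_interval)


class FakePage:
    """Hypothetical stand-in for a page whose content appears after a delay."""

    def __init__(self, ready_after_calls):
        self.calls = 0
        self.ready_after_calls = ready_after_calls

    def error_text(self):
        self.calls += 1
        if self.calls < self.ready_after_calls:
            return ""  # the "page" has not loaded yet
        return "You can't have an empty list item"


def check_error(page):
    assert page.error_text() == "You can't have an empty list item"


if __name__ == "__main__":
    page = FakePage(ready_after_calls=3)
    wait_for(lambda: check_error(page))
    print("passed after", page.calls, "attempts")
```

The first two attempts fail and are swallowed by the loop; the third passes and the helper returns straight away. If the assertion never passes, the last AssertionError is re-raised once the timeout is exceeded, so we still get a useful failure message.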

Lambda Functions

lambda in Python is the syntax for making a one-line, throwaway function—it saves you from having to use def..(): and an indented block:

>>> myfn = lambda x: x+1
>>> myfn(2)
3
>>> myfn(5)
6
>>> adder = lambda x, y: x + y
>>> adder(3, 2)
5

In our case, we’re using it to transform a bit of code that would otherwise be executed immediately into a function that we can pass as an argument, and that can be executed later, and multiple times:

>>> def addthree(x):
...     return x + 3
...
>>> addthree(2)
5
>>> myfn = lambda: addthree(2)  # note addthree is not called immediately here
>>> myfn
<function <lambda> at 0x7f3b140339d8>
>>> myfn()
5
>>> myfn()
5
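The "executed later, and multiple times" part is what matters for a retry helper. As a small illustration, the script below compares calling a function directly with wrapping it in a lambda and handing it to a helper; `check_page` and `call_three_times` are invented names for this sketch:

```python
attempts = []


def check_page():
    """Hypothetical check that records each time it is actually run."""
    attempts.append("ran")
    return len(attempts)


def call_three_times(fn):
    # Like wait_for, this helper decides when (and how often) to call fn.
    return [fn() for _ in range(3)]


# Without lambda, check_page() runs once, right here, before any helper sees it.
value = check_page()

# With lambda, nothing runs until the helper calls it, and it runs afresh each time.
results = call_three_times(lambda: check_page())

print(value, results, len(attempts))
```

If we passed `check_page()` (with the parentheses) to a helper, the helper would only ever receive the result of a single call, and would have nothing to retry.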

Let’s see our funky wait_for helper in action:

$ python src/manage.py test functional_tests.test_list_item_validation
[...]

======================================================================
ERROR: test_cannot_add_empty_list_items (functional_tests.test_list_item_valida
tion.ItemValidationTest.test_cannot_add_empty_list_items)
 ---------------------------------------------------------------------
[...]
Traceback (most recent call last):
  File "...goat-book/src/functional_tests/test_list_item_validation.py", line
15, in test_cannot_add_empty_list_items
    self.wait_for(  (1)
  File "...goat-book/src/functional_tests/base.py", line 25, in wait_for
    return fn()  (2)
           ^^^^
  File "...goat-book/src/functional_tests/test_list_item_validation.py", line
17, in <lambda>  (3)
    self.browser.find_element(By.CSS_SELECTOR, ".invalid-feedback").text,  (3)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[...]
selenium.common.exceptions.NoSuchElementException: Message: Unable to locate
element: .invalid-feedback; [...]

 ---------------------------------------------------------------------
Ran 1 test in 10.575s

FAILED (errors=1)

The order of the traceback is a little confusing, but we can more or less follow through what happened:

1	At line 15 in our FT, we go into our `self.wait_for` helper, passing it the `lambda`-ified version of the `assertEqual`.
2	We go into `self.wait_for` in base.py, where we can see that we’re inside `fn()`, which is the name for our lambda inside the helper.
3	To explain where the exception has actually come from, the traceback takes us back into test_list_item_validation.py and inside the body of the `lambda` function, and tells us that it was the attempt to find the `.invalid-feedback` element that failed.

We’re into the realm of functional programming now, passing functions as arguments to other functions, and it can be a little mind-bending. I know it took me a little while to get used to! Have a couple of read-throughs of this code, and the code back in the FT, to let it sink in; and if you’re still confused, don’t worry about it too much, and let your confidence grow from working with it. We’ll use it a few more times in this book and make it even more functionally fun, you’ll see.

Finishing Off the FT

We’ll finish off the FT like this:

src/functional_tests/test_list_item_validation.py (ch11l012)

    # The home page refreshes, and there is an error message saying
    # that list items cannot be blank
    self.wait_for(
        lambda: self.assertEqual(
            self.browser.find_element(By.CSS_SELECTOR, ".invalid-feedback").text,
            "You can't have an empty list item",
        )
    )

    # She tries again with some text for the item, which now works
    self.browser.find_element(By.ID, "id_new_item").send_keys("Buy milk")
    self.browser.find_element(By.ID, "id_new_item").send_keys(Keys.ENTER)
    self.wait_for_row_in_list_table("1: Buy milk")

    # Perversely, she now decides to submit a second blank list item
    self.browser.find_element(By.ID, "id_new_item").send_keys(Keys.ENTER)

    # She receives a similar warning on the list page
    self.wait_for(
        lambda: self.assertEqual(
            self.browser.find_element(By.CSS_SELECTOR, ".invalid-feedback").text,
            "You can't have an empty list item",
        )
    )

    # And she can correct it by filling some text in
    self.browser.find_element(By.ID, "id_new_item").send_keys("Make tea")
    self.browser.find_element(By.ID, "id_new_item").send_keys(Keys.ENTER)
    self.wait_for_row_in_list_table("1: Buy milk")
    self.wait_for_row_in_list_table("2: Make tea")

Helper Methods in FTs

We’ve got two helper methods now, our generic self.wait_for helper, and wait_for_row_in_list_table. The former is a general utility—any of our FTs might need to do a wait.

The second also helps prevent duplication across your functional test code. The day we decide to change the implementation of how our list table works, we want to make sure we only have to change our FT code in one place, not in dozens of places across loads of FTs…

See also [chapter_page_pattern] and [appendix_bdd] for more on structuring your FT code.

I’ll let you do your own "first-cut FT" commit.

Refactoring Unit Tests into Several Files

When we (finally!) start coding our solution, we’re going to want to add another test for our models.py. Before we do so, it’s time to tidy up our unit tests in a similar way to the functional tests.

A difference will be that, because the lists app contains real application code as well as tests, we’ll separate out the tests into their own folder:

$ mkdir src/lists/tests
$ touch src/lists/tests/__init__.py
$ git mv src/lists/tests.py src/lists/tests/test_all.py
$ git status
$ git add src/lists/tests
$ python src/manage.py test lists
[...]
Ran 9 tests in 0.034s

OK
$ git commit -m "Move unit tests into a folder with single file"

If you get a message saying "Ran 0 tests", you probably forgot to add the dunderinit—it needs to be there or else the tests folder isn’t a valid Python package…^[1]

Now we turn test_all.py into two files: one called test_views.py, which will contain only view tests, and one called test_models.py. I’ll start by making two copies:

$ git mv src/lists/tests/test_all.py src/lists/tests/test_views.py
$ cp src/lists/tests/test_views.py src/lists/tests/test_models.py

And strip test_models.py down to being just the one test—it means it needs far fewer imports:

src/lists/tests/test_models.py (ch11l016)

from django.test import TestCase
from lists.models import Item, List


class ListAndItemModelsTest(TestCase):
        [...]

Whereas test_views.py just loses one class:

src/lists/tests/test_views.py (ch11l017)

--- a/src/lists/tests/test_views.py
+++ b/src/lists/tests/test_views.py
@@ -103,34 +104,3 @@ class ListViewTest(TestCase):
         self.assertNotContains(response, "other list item 1")
         self.assertNotContains(response, "other list item 2")

-
-
-class ListAndItemModelsTest(TestCase):
-    def test_saving_and_retrieving_items(self):
[...]

We rerun the tests to check that everything is still there:

$ python src/manage.py test lists
[...]
Ran 9 tests in 0.040s

OK

Great! That’s another small, working step:

$ git add src/lists/tests
$ git commit -m "Split out unit tests into two files"

Some people like to put their unit tests into a tests folder straight away, as soon as they start a project. That’s a perfectly good idea; I just thought I’d wait until it became necessary, to avoid doing too much housekeeping all in the first chapter!

Well, that’s our FTs and unit tests nicely reorganised. In the next chapter we’ll get down to some validation proper.

Tips on Organising Tests and Refactoring

Use a tests folder

Just as you use multiple files to hold your application code, you should split your tests out into multiple files.

For functional tests, group them into tests for a particular feature or user story.
For unit tests, use a folder called tests, with a __init__.py.
You probably want a separate test file for each tested source code file. For Django, that’s typically test_models.py, test_views.py, and test_forms.py.
Have at least a placeholder test for every function and class.

Don’t forget the "Refactor" in "Red, Green, Refactor"

The whole point of having tests is to allow you to refactor your code! Use them, and make your code (including your tests) as clean as you can.

Don’t refactor against failing tests

In general!
But the FT you’re currently working on doesn’t count.
You can occasionally put a skip on a test which is testing something you haven’t written yet.
More commonly, make a note of the refactor you want to do, finish what you’re working on, and do the refactor a little later, when you’re back to a working state.
Don’t forget to remove any skips before you commit your code! You should always review your diffs line by line to catch things like this.

Try a generic wait_for helper

Having specific helper methods that do explicit waits is great, and it helps to make your tests readable. But you’ll also often need an ad-hoc one-line assertion or Selenium interaction that you’ll want to add a wait to. self.wait_for does the job well for me, but you might find a slightly different pattern works for you.
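As one example of a "slightly different pattern", the same retry loop could be packaged as a decorator, so that any helper gains the wait just by being decorated. This is only a sketch under assumptions: the `wait` name, the `FakeTable` class, and the short timeout are hypothetical, and a Selenium-based version would presumably also want to catch WebDriverException, as our wait_for does:

```python
import functools
import time

MAX_WAIT = 1  # seconds; an assumed value for this sketch


def wait(fn):
    """Decorator: retry the wrapped function until it stops raising AssertionError."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start_time = time.time()
        while True:
            try:
                return fn(*args, **kwargs)
            except AssertionError:
                if time.time() - start_time > MAX_WAIT:
                    raise
                time.sleep(0.05)

    return wrapper


class FakeTable:
    """Hypothetical stand-in for a list table that fills in after a few polls."""

    def __init__(self):
        self.polls = 0

    def rows(self):
        self.polls += 1
        return ["1: Buy milk"] if self.polls >= 3 else []


@wait
def assert_row_in_table(table, row_text):
    assert row_text in table.rows()


if __name__ == "__main__":
    table = FakeTable()
    assert_row_in_table(table, "1: Buy milk")
    print("row found after", table.polls, "polls")
```

The tradeoff is that the waiting behaviour is fixed at definition time, whereas passing a lambda to wait_for lets us wrap any ad-hoc assertion on the spot.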

1. "Dunder" is shorthand for double-underscore, so "dunderinit" means __init__.py.