Unit Testing

Python examples, and general unit testing strategies.

If I've learned anything over the years in my career as a software engineer, it's the value of the unit test. Or rather, it's the value of a large battery of unit tests, covering many different parts of the application. With such a collection, you can be pretty confident when it comes to refactoring and bug fixing. And when you need to add new functionality, unit testing gives you a chance to methodically verify both normal and boundary conditions.

But every good thing comes with a price: writing good unit tests is not always easy. Sometimes it's hard, and it's a skill one learns over time. On this page I offer some of the techniques I’ve learned, techniques which have helped me make sure that, for every project I'm working on, I have a nice suite of unit tests I can depend on.

Python Examples

Expected Exceptions

In a Nutshell

Here's how to assert, in a unit test, that a ValueError is raised when you call some imaginary function Output() with a parameter set to None.

with self.assertRaises(ValueError):
    Output(None)

Details

Typically we think of unit testing as verifying correct behavior in "normal" application processing. But unit tests also need to verify correct behavior when users or programmers make a mistake.

For example, suppose a user unintentionally tries to create an object with one of its properties set to None. How should your program behave? In a circumstance like this, correct error handling will involve checking for valid parameters and, if the validation fails, throwing an exception. But how do you know you haven't made a mistake in the error handling code?

The answer is, by writing a unit test that verifies that a snippet of code throws a specific type of exception.

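Here's a class that throws a ValueError if a user tries to initialize it with an output directory that is None or does not exist.

import os


class Output(object):
    """
    A file output object with a given path and a list of lines to be output.
    """
    def __init__(self, outputDir, lines):
        # A None path would otherwise end in a TypeError (from os.path.exists
        # or from the string concatenation below), so check for it explicitly.
        if outputDir is None or not os.path.exists(outputDir):
            raise ValueError("Output directory does not exist: " + str(outputDir))
        self.outputDir = outputDir
        self.lines = lines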

To verify that this works, write a unit test that uses assertRaises() as a context manager, thus:

def testInit(self):
    lines = ''
    # Verify an invalid init throws an exception.
    with self.assertRaises(ValueError):
        Output(None, lines)

If you want to test the error message as well, you have to name the context manager; the exception that was raised is then available as its exception attribute. Convert the exception to a string before searching it. Testing the message against context.exception directly does not perform a substring match, and on Python 3 it raises a TypeError because exception objects are not iterable. The str() form below works on Python 2.7 as well as on Python 3:

def testInit(self):
    lines = ''
    # Verify an invalid init throws an exception.
    with self.assertRaises(ValueError) as context:
        Output(None, lines)
    # Verify the message, too.
    self.assertIn("Output directory does not exist", str(context.exception))
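
As an alternative to inspecting context.exception by hand, unittest can check the exception type and the message in one step with assertRaisesRegex() (Python 3.2 and later; Python 2.7 spells it assertRaisesRegexp()). The following is a minimal sketch, not the only way to do it. The Output class here is a simplified stand-in with an explicit None check, and the test class and method names are made up for illustration.

```python
import os
import unittest


class Output(object):
    """Simplified stand-in for the Output class discussed above (an assumption)."""
    def __init__(self, outputDir, lines):
        if outputDir is None or not os.path.exists(outputDir):
            raise ValueError("Output directory does not exist: " + str(outputDir))
        self.outputDir = outputDir
        self.lines = lines


class TestOutputInit(unittest.TestCase):
    def testInitRejectsMissingDirectory(self):
        # The second argument is a regular expression that is searched for
        # in str(exception). The test fails if the exception type is wrong
        # or if the pattern is not found in the message.
        with self.assertRaisesRegex(ValueError, "Output directory does not exist"):
            Output(None, [])
```

Because the pattern is a regular expression, escape any special characters (for example with re.escape()) if you want to match a message literally.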

Ignoring Unit Tests

Here's an example of how to ignore a unit test in Python for an entire test class.

@unittest.skip("Write some message here.")
class TestRunJavaExtractorsWithHttpConnection(unittest.TestCase):

Here's how to ignore a class of tests—but only if the programmer's environment is Windows.

@unittest.skipIf(platform.system() == 'Windows',
            "Write some message here.")
class TestSubprocessUtil(unittest.TestCase):
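
The two decorators above are shown without their imports or class bodies. Here is a small self-contained sketch of both in action. The class names, test bodies, and the run_skip_demo() helper are invented for this illustration; the helper only exists so the result can be inspected programmatically.

```python
import platform
import unittest


@unittest.skip("Demonstration: this whole class is skipped.")
class TestAlwaysSkipped(unittest.TestCase):
    def test_never_runs(self):
        self.fail("This body is never executed.")


@unittest.skipIf(platform.system() == "Windows", "Not supported on Windows.")
class TestSkippedOnWindows(unittest.TestCase):
    def test_runs_elsewhere(self):
        self.assertEqual(1 + 1, 2)


def run_skip_demo():
    """Run both classes and return the unittest result object."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite([
        loader.loadTestsFromTestCase(TestAlwaysSkipped),
        loader.loadTestsFromTestCase(TestSkippedOnWindows),
    ])
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Skipped tests are reported separately from passes and failures: result.skipped holds one (test, reason) pair for each skipped test, so the message you pass to the decorator shows up in the test report.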

Temporary Directories and Files

Generally your unit tests should avoid writing to the file system. But sometimes you have no choice—e.g. if you are writing code that is supposed to delete files, you probably need to create a file before you can verify that your code has correctly deleted it.

The best way I've found to work with temporary files and directories is to use the tempfile module. This module provides ways of letting temporary files and directories delete themselves automatically at the end of the test.

Temp Directory for a Single Test

Here's how to create and use a temporary directory and file for a single test. The test also shows how to create temporary files inside temporary directories, and it uses assertions to verify that the directory and file are deleted outside of the context provided by the with statement.

import tempfile
from pathlib import Path
(...)
class TestTempDir(TestCase):
(...)
    def test_temp(self):
        print("\ntest_temp")
        dirPath = None
        filePath = None
        with tempfile.TemporaryDirectory() as tempDir:
            # Print the temp dir's full name. Inside the `with` block,
            # `tempDir` is the path itself, so there is no `.name` to read.
            print(f"  dir: {tempDir}")
            # Show that `tempDir` is a string, not a TemporaryDirectory object.
            print(f"  dir type: {type(tempDir)}")
            # Create a Path with the temporary path and show it exists.
            dirPath = Path(tempDir)
            self.assertTrue(dirPath.exists())
            # Create a temp file in the temporary directory and show it exists.
            filePath = dirPath / 'file.txt'
            filePath.touch()
            self.assertTrue(filePath.exists())

        # Show that the temp file and directory do not exist outside of
        # the `with` context.
        self.assertFalse(filePath.exists())
        self.assertFalse(dirPath.exists())

Here is the output. (Note that the directory name is randomly generated.)

test_temp
  dir: /tmp/tmpadua6cqp
  dir type: <class 'str'>

Verify deletion of directory:

$ ls /tmp/tmpadua6cqp
ls: cannot access '/tmp/tmpadua6cqp': No such file or directory

If for some reason you don't care for the with clause, you could take the same approach for a single unit test that I recommend for an entire unit test class. See next.
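
For a single test, going without the with clause might look like the sketch below. It assumes plain unittest and registers the directory's cleanup() method with addCleanup(), so the directory is removed even if an assertion fails. The class name and the last_dir attribute are invented for this illustration; last_dir only exists so that the deletion can be checked after the test has run.

```python
import tempfile
import unittest
from pathlib import Path


class TestTempDirNoWith(unittest.TestCase):
    # Illustration only: remembers the path so deletion can be checked later.
    last_dir = None

    def test_temp(self):
        # Create the directory object directly instead of using `with`.
        temp_directory = tempfile.TemporaryDirectory()
        # Register the cleanup so it runs even if an assertion below fails.
        self.addCleanup(temp_directory.cleanup)

        # Without `with`, the path is in the object's `name` attribute.
        dirPath = Path(temp_directory.name)
        filePath = dirPath / "file.txt"
        filePath.touch()
        self.assertTrue(filePath.exists())
        type(self).last_dir = dirPath
```

unittest runs registered cleanups after tearDown(), so by the time the test has finished the directory is gone.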

For Every Test in a Class

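If I want to use a temporary directory in more than one unit test, the with context block in the example above seems a little heavy-handed to me. Instead, I create a single TemporaryDirectory object in setUpClass() and store it in a class attribute that every test in the class can use.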

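import tempfile
(...)
class TestSubmittedFile(TestCase):
    temp_directory = None

    @classmethod
    def setUpClass(cls):
        super().setUpClass()
        # One temporary directory, shared by every test in the class.
        cls.temp_directory = tempfile.TemporaryDirectory()

    @classmethod
    def tearDownClass(cls):
        # Delete the directory explicitly rather than waiting for the
        # TemporaryDirectory object to be garbage collected.
        cls.temp_directory.cleanup()
        super().tearDownClass()

    def test_temp(self):
        print("\ntest_temp")
        print(f"   TestSubmittedFile.temp_directory: {TestSubmittedFile.temp_directory}")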

# Output
test_temp
   TestSubmittedFile.temp_directory: <TemporaryDirectory '/tmp/tmpxipaidz3'>

Verify deletion of directory:

$ ls /tmp/tmpxipaidz3
ls: cannot access '/tmp/tmpxipaidz3': No such file or directory

Replacing Django's MEDIA_ROOT

In Django, MEDIA_ROOT is the setting that specifies the directory where user-uploaded files are stored. You don't want unit tests to use this directory, because they could easily interfere with the real files there. But how do you use some other location for unit testing?

I have found several online solutions to this problem, and have found them wanting either because they set MEDIA_ROOT to /tmp instead of a directory inside /tmp, or because they create a temporary directory which is not automatically deleted immediately after the test.

Here's the solution I like best:

import tempfile
from pathlib import Path

from django.conf import settings
(...)
class TestSubmittedFile(TestCase):

    temp_directory = None
    _original_media_root = None

    @classmethod
    def setUpClass(cls):
        super().setUpClass()
        # Remember the real MEDIA_ROOT, then point it at a fresh temp dir.
        cls._original_media_root = settings.MEDIA_ROOT
        cls.temp_directory = tempfile.TemporaryDirectory()
        settings.MEDIA_ROOT = cls.temp_directory.name

    @classmethod
    def tearDownClass(cls):
        print("\ntearDownClass before cleaning up")
        print(f"   settings.MEDIA_ROOT: {settings.MEDIA_ROOT}")
        print(f"   temp_directory: {cls.temp_directory}")
        # Restore the real MEDIA_ROOT and delete the temporary directory.
        settings.MEDIA_ROOT = cls._original_media_root
        cls.temp_directory.cleanup()
        print("\ntearDownClass after cleaning up")
        print(f"   settings.MEDIA_ROOT: {settings.MEDIA_ROOT}")
        print(f"   temp_directory: {cls.temp_directory}")
        super().tearDownClass()

    def test_test(self):
        print("\ntest_test")
        print(f"   settings.MEDIA_ROOT: {settings.MEDIA_ROOT}")
        print(f"   TestSubmittedFile.temp_directory: {TestSubmittedFile.temp_directory}")
        # The temporary MEDIA_ROOT exists and can be written to.
        dirPath = Path(settings.MEDIA_ROOT)
        self.assertTrue(dirPath.exists())
        filePath = dirPath / 'aaabbbqqq.txt'
        self.assertFalse(filePath.exists())
        filePath.touch()
        self.assertTrue(filePath.exists())

Running the test class creates this output:

test_test
   settings.MEDIA_ROOT: /tmp/tmp1grvlmyj
   TestSubmittedFile.temp_directory: <TemporaryDirectory '/tmp/tmp1grvlmyj'>

tearDownClass before cleaning up
   settings.MEDIA_ROOT: /tmp/tmp1grvlmyj
   temp_directory: <TemporaryDirectory '/tmp/tmp1grvlmyj'>

tearDownClass after cleaning up
   settings.MEDIA_ROOT: /data
   temp_directory: <TemporaryDirectory '/tmp/tmp1grvlmyj'>

Finally, let's verify that the temporary directory is deleted:

$ ls /tmp/tmp1grvlmyj
ls: cannot access '/tmp/tmp1grvlmyj': No such file or directory

The unittest.mock Library

I hope to have more on the unittest.mock library eventually, but for starters let me link you to Akshar Raaj's pretty good Medium blog.

General Testing Strategies

Make Sure Your Unit Test Fails

Often I've seen unit tests that don't actually test what their authors think they're testing. The application code had a bug in it, but so did the unit test which was meant to validate that code. Therefore the test always passed, failing to detect the bug.

You can avoid this trap by following the advice in this heading—always make sure your unit test fails. Only once it has failed should you modify the test, or your code, so that it passes. Over the years I've seen many cases where failing to follow this guideline resulted in egg in the face—either my own, or that of another programmer.

There are an infinite number of ways to write a bad unit test. So many, that it's hardly worth trying to list them. But the gotcha which has more than once bitten me is the classic cut-and-paste error. A typical scenario goes like this:

1. Start writing a unit test by copying an existing test that passes.
2. Paste the existing test.
3. Find yourself interrupted by a colleague, a Slack message, a fire drill, etc.
4. Run your unit tests, see that they all pass, move on to the next coding task.

In the scenario above, the unit test was not renamed. In Python, a second test method with the same name as an earlier one in the same class silently replaces it, so only one of the two is ever run. If you're lucky you'll notice that the count of tests run in the file is not what you expected, which will lead you to investigate. But it's even worse if the interruption takes place after you've renamed the test. In this situation you'll get the expected number of passing tests, with no hint that your new test is otherwise identical to some other test in the same file.
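
The first case, the pasted test that was never renamed, is easy to demonstrate. In this sketch (class, method, and helper names are invented for illustration) the class body defines test_addition twice, but the test loader only finds one test.

```python
import unittest


class TestDuplicateNames(unittest.TestCase):
    def test_addition(self):
        self.assertEqual(1 + 1, 2)

    # Pasted copy that was never renamed. Python rebinds the name, so this
    # definition silently replaces the one above.
    def test_addition(self):  # noqa: F811
        self.assertEqual(2 + 2, 4)


def count_tests(test_case_class):
    """Return how many tests unittest discovers in the given class."""
    loader = unittest.TestLoader()
    return loader.loadTestsFromTestCase(test_case_class).countTestCases()
```

Calling count_tests(TestDuplicateNames) returns 1, not 2. No warning is printed by Python or by unittest, although many linters do flag the redefinition.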

Keep Your Tests Short

Tests often require a lot of setup before they can be run. Maybe you want to test a boundary case, which means you have to create the boundary case first. While you're writing the setup code everything makes sense, but to another person—or to yourself, a couple of months later—the setup code can be difficult to follow and therefore it can obscure what is actually being tested.

Here are some ways to avoid this trap:

- Use mock objects. Don't create a full-fledged object. Instead, use a mock library, or your own mock object, to build an object that stands in for a dependency of the code you're testing, with just enough properties for the test to pass or fail appropriately.
- Write separate test functionality. Write methods or even test classes that can be invoked from within a test. This outside functionality keeps your tests short and readable. For example, the code that reads in an input file, if longer than a single line, should not be in the unit test itself.
- Create a test data factory. This is actually a specific example of the tip above, but I want to give this idea special emphasis. At the same time, I have to admit I haven't tried it myself. But Jamie Morris's "Stop Filling Your Tests With Test Data" has inspired me to see if this technique can help me keep my unit tests short and to the point.
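
As a rough illustration of the first and third tips, here is a sketch that combines a mock object with a tiny test data factory. All of the names (make_user, send_welcome, the mailer and its send() method) are hypothetical, and this is only one simple way a factory might look, not the approach from the article mentioned above.

```python
import unittest
from unittest.mock import Mock


def make_user(**overrides):
    """Test data factory: sensible defaults, overridden only where a test cares."""
    user = {"name": "Test User", "email": "user@example.com", "age": 30}
    user.update(overrides)
    return user


def send_welcome(user, mailer):
    """Hypothetical code under test: emails a welcome message to adults only."""
    if user["age"] < 18:
        return False
    mailer.send(user["email"], "Welcome, " + user["name"] + "!")
    return True


class TestSendWelcome(unittest.TestCase):
    def test_adult_gets_email(self):
        mailer = Mock()  # Stands in for a real mail client.
        self.assertTrue(send_welcome(make_user(), mailer))
        mailer.send.assert_called_once_with("user@example.com", "Welcome, Test User!")

    def test_minor_gets_no_email(self):
        mailer = Mock()
        # Only the one property that matters to this test is spelled out.
        self.assertFalse(send_welcome(make_user(age=17), mailer))
        mailer.send.assert_not_called()
```

Each test is three lines long, and the only data visible in a test is the data that test is about.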

Unit Tests from the Very Beginning

Test-driven development is good programming practice. While I don't follow its precepts dogmatically, I strongly agree that the earlier you introduce unit tests into a new project, or into new functionality within an existing project, the better off you'll be.

When you write a program you often have a choice over how to implement something, and sometimes one option is better than another as far as unit tests are concerned. For example, suppose you have a class that takes text input, and in the application the input will come from a file. So it makes perfect sense to write the class to take a file path on instantiation, and in the constructor read in the file to get the text, setting the text to a property of the class. Suppose now you decide to write the unit tests. Creating test files for unit testing can be very laborious, but this is the only way to test the class as you've created it. It might have been better if you'd created your class to allow instantiation with a string—the text itself—rather than the path of the file containing the text.

Compound this example by several decisions made over the course of a couple of weeks, all oblivious to the necessity of making the program amenable to unit testing, and what you often end up with is a program that is very, very difficult to unit test at all.
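
The file-versus-string example above could be sketched like this. WordCounter and its methods are hypothetical names chosen for illustration: the constructor takes the text itself, and a separate class method handles the file case that the application needs.

```python
from pathlib import Path


class WordCounter:
    """Hypothetical class that works on text, however the text was obtained."""

    def __init__(self, text):
        self.text = text

    @classmethod
    def from_file(cls, path):
        """Convenience constructor for the application, which reads files."""
        return cls(Path(path).read_text(encoding="utf-8"))

    def count(self):
        """Return the number of whitespace-separated words in the text."""
        return len(self.text.split())
```

With this design most unit tests need no files at all: WordCounter("one two three").count() can be checked directly, and only from_file() needs a test that touches the file system.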

Close to the Machine