Python Testing#

Overview

Questions:

How is a Python module tested?

Objectives:

Explain the overall structure of testing.
Explain the reasons why testing is important.
Understand how to write tests using the pytest framework.

Follow Along with This Lesson

To follow along with this lesson, you can complete the previous lessons, or you can download a pre-made workshop repository that is at the starting point.

You will need to make sure that you have git installed and configured, as described in the set-up instructions.

SHELL

git clone https://github.com/MolSSI-Education/molecool.git
cd molecool
git checkout python-testing-start
git switch -c main

You can also download the pre-made workshop repository as a zip file. If downloading as a zip file, you will need to initialize git in the repository and make an initial commit in order to use git.

Until now, we have been writing functions and checking their behavior using an interactive Python interpreter and manually inspecting the output. While this seems to work, it can be tedious, and prone to error. In this lesson, we’ll discuss how to write tests and run them using the pytest testing framework.

Using a testing framework will allow us to easily define tests for all of our functions and modules, and to test these each time we make a change. This will ensure that our code is behaving the way we expect and that we do not break any features existing in the code by making new changes.

This episode explains the importance of code testing and demonstrates the possible capabilities.

Why testing#

Software should be tested regularly throughout the development cycle to ensure correct operation. Thorough testing is typically an afterthought, but for larger projects, it is essential to ensure changes in some parts of the code do not negatively affect other parts.

Software testing is checking the behavior of part of the code (such as a method, class, or module) by comparing its expected output or behavior with the observed one. We will explain this in more detail shortly.

Levels of Testing#

There are three main levels of testing:

Unit tests: The purpose is to verify that each part of the code is functioning as expected. Unit testing is done on smaller units (such as single functions or classes) as you work on your code. This is helpful for catching errors in uncommonly-used parts of the code. Unit tests should be added as new features are added, resulting in better code coverage. In unit tests, you are testing a part of your code independent of any other factors; therefore, you should avoid using the file system, databases, network, or any other resources unless you are testing a function directly related to that resource.
Integration tests: This is a more holistic approach where you test the interface between modules, and how they combine and integrate together.
System tests: Where you test your system as a whole to check if it meets all the requirements.

Another important type of testing is Regression tests. In Regression tests, the software is checked to ensure that it consistently returns correct values given known inputs. This kind of testing can catch problems in previously working code that may have been broken by new changes or new features.

It is highly encouraged to have Unit tests that cover most of your code. It is also helpful to have some Integration and System tests.

In this lesson, we are focusing on unit testing. The same concepts can be utilized to conduct integration tests throughout various modules.

The pytest testing framework#

MolSSI recommends using the pytest testing framework. Other testing frameworks are available (such as unittest and nose tests); however, the combination of easy implementation, parametrization of tests, fixtures, and test marking make pytest an ideal testing framework.

If you don’t have pytest installed, or it’s not updated to version 3, install it using:

SHELL

pip install -U pytest-cov

Testing Expected Exceptions#

If you expect your code to raise exceptions, you can test this behavior with pytest. For example in our calculate_angle function, our inputs must be numpy arrays, or the function will give an error.

Consider our build_bond_list function. We may want to raise a ValueError if min_bond is set to be less than zero. We can add a type check to the function so that a more informative message is given to the user.

Add the following to your build_bond_list function right below the docstring.

PYTHON

if min_bond < 0:
        raise ValueError("Invalid minimum bond distance entered! Minimum bond distance must be greater than zero!")

We can test that this ValueError is raised in our testing functions.

In the test_molecule.py file, copy and modify your first test to check for this. Note that you must alter your imports to also import pytest.

TEST_MOLECULE.PY

import pytest

def test_build_bond_failure():

    coordinates = np.array([
        [1, 1, 1],
        [2.4, 1, 1],
        [-0.4, 1, 1],
        [1, 1, 2.4],
        [1, 1, -0.4],
    ])

    with pytest.raises(ValueError):
        bonds = molecool.build_bond_list(coordinates, min_bond=-1)

The test will pass if the build_bond_list method raises a ValueError, otherwise, the test will fail.

Exercise - Test Driven Development#

Sometimes, tests are written before writing the code. This is called “Test Driven Development” or TDD. In this case, you would first write tests that define the behavior of your code, then run these tests to see if they pass, and finally write code to pass each of these tests. TDD is a common approach when developing a library with well-defined interfaces and features.

TDD has another benefit of never having false positives. If you ensure that your tests first fail and THEN pass, you know that you have written a function that works and that your test is not just passing by default.

Exercise 1

For this homework assignment, you are given a function specification and a test. You should write a function to make the test pass.

Add the following function definition and docstring to molecule.py.

PYTHON

def calculate_molecular_mass(symbols):
    """Calculate the mass of a molecule.
    
    Parameters
    ----------
    symbols : list
        A list of elements.
    
    Returns
    -------
    mass : float
        The mass of the molecule
    """
    pass

This defines a function, its inputs, and what the function should return. Next, add the following test into test_molecule.py.

TEST_MOLECULE.PY

def test_molecular_mass():
    symbols = ['C', 'H', 'H', 'H', 'H']
    calculated_mass = molecool.calculate_molecular_mass(symbols)

    actual_mass = 16.04

    assert pytest.approx(actual_mass, abs=1e-2) == calculated_mass

If you run pytest, this test should fail. Your assignment is to write the function to make the tests pass. You should use the atomic_weights data in the atom_data module.

Solution

Here is a potential solution.

from .atom_data import atomic_weights


def calculate_molecular_mass(symbols):
    """Calculate the mass of a molecule.
    
    Parameters
    ----------
    symbols : list
        A list of elements.
    
    Returns
    -------
    mass : float
        The mass of the molecule
    """

    mass = 0
    for atom in symbols:
        mass += atomic_weights[atom]
    
    return mass

Also, don’t forget to add the calculate_molecular_mass function to __init__.py

__INIT__.PY

# Add imports here
from .measure import calculate_distance, calculate_angle
from .molecule import build_bond_list, calculate_molecular_mass
from .visualize import draw_molecule, bond_histogram
from .io import open_pdb, open_xyz, write_xyz

Exercise 2

Consider the following function definition inside the molecule module.

MOLECULE.PY

import numpy as np

def calculate_center_of_mass(symbols, coordinates):
    """Calculate the center of mass of a molecule.
    
    The center of mass is weighted by each atom's weight.
    
    Parameters
    ----------
    symbols : list
        A list of elements for the molecule
    coordinates : np.ndarray
        The coordinates of the molecule.
    
    Returns
    -------
    center_of_mass: np.ndarray
        The center of mass of the molecule.
    
    Notes
    -----
    The center of mass is calculated with the formula
    
    .. math:: \\vec{R}=\\frac{1}{M} \\sum_{i=1}^{n} m_{i}\\vec{r_{}i}
    
    """
    
    return np.array([])

Don’t forget to add the calculate_center_of_mass function to __init__.py

__INIT__.PY

# Add imports here
from .measure import calculate_distance, calculate_angle
from .molecule import build_bond_list, calculate_molecular_mass, calculate_center_of_mass
from .visualize import draw_molecule, bond_histogram
from .io import open_pdb, open_xyz, write_xyz

And the test for the function above.

TEST_MOLECULE.PY

def test_center_of_mass():
    symbols = np.array(['C', 'H', 'H', 'H', 'H'])
    coordinates = np.array([[1,1,1], [2.4,1,1], [-0.4, 1, 1], [1, 1, 2.4], [1, 1, -0.4]])

    center_of_mass = molecool.calculate_center_of_mass(symbols, coordinates)

    expected_center = np.array([1,1,1])

    assert center_of_mass.all() == expected_center.all()

Notice that this test always passes. Even if we were to write our function, we would not know it was right.

Fix this test so that it fails. Hint - You will have to compare two arrays (look into numpy functions which compare two arrays.)

Solution

The problem with .all() is that it does not compare arrays element-wise, it simply evaluates as True if all values in the array are True, or False if not. The numpy function array_equal returns True if two arrays have the same shape and elements.

def test_center_of_mass():
    symbols = np.array(['C', 'H', 'H', 'H', 'H'])
    coordinates = np.array([[1,1,1], [2.4,1,1], [-0.4, 1, 1], [1, 1, 2.4], [1, 1, -0.4]])

    center_of_mass = molecool.calculate_center_of_mass(symbols, coordinates)

    expected_center = np.array([1,1,1])

    assert np.array_equal(center_of_mass, expected_center)

Below is an implementation of the function which meets the specification outlined by the test.

def calculate_center_of_mass(symbols, coordinates):
    """Calculate the center of mass of a molecule.
    
    The center of mass is weighted by each atom's weight.
    
    Parameters
    ----------
    symbols : list
        A list of elements for the molecule
    coordinates : np.ndarray
        The coordinates of the molecule.
    
    Returns
    -------
    center_of_mass: np.ndarray
        The center of mass of the molecule.
    
    Notes
    -----
    The center of mass is calculated with the formula
    
    .. math:: \\vec{R}=\\frac{1}{M} \\sum_{i=1}^{n} m_{i}\\vec{r_{}i}
    
    """
    
    total_mass = calculate_molecular_mass(symbols)
    
    mass_array = np.zeros([len(symbols), 1])
    
    for i in range(len(symbols)):
        mass_array[i] = atomic_weights[symbols[i]]
    
    center_of_mass = sum(coordinates * mass_array) / total_mass
    
    return center_of_mass

Code Coverage Part I#

Now that we have a set of modules and associated tests, we want to see how much of our package is “covered” by our tests. We’ll measure this by counting the lines of our packages that are touched, i.e. used, during our tests.

We already have everything we need for this since we installed pytest-cov earlier which includes the coverage tools on top of the pytest package.

We can assess our code coverage as follows:

SHELL

pytest --cov=molecool

OUTPUT

=========================================================================== test session starts ===========================================================================
platform darwin -- Python 3.11.6, pytest-7.4.2, pluggy-1.3.0
rootdir: /Users/jessica/lessons/molecool
plugins: cov-2.8.1
collected 10 items                                                                                                                                                        

molecool/tests/test_measure.py ......                                                                                                                               [ 60%]
molecool/tests/test_molecule.py ....                                                                                                                                [100%]

---------- coverage: platform darwin, Python 3.11.6-final-0 -----------
Name                      Stmts   Miss  Cover
---------------------------------------------
molecool/__init__.py          9      0   100%
molecool/atom_data.py         2      0   100%
molecool/io/__init__.py       2      0   100%
molecool/io/pdb.py           14     12    14%
molecool/io/xyz.py           14     11    21%
molecool/measure.py          12      1    92%
molecool/molecule.py         28      1    96%
molecool/visualize.py        33     28    15%
---------------------------------------------
TOTAL                       114     53    54%


=========================================================================== 10 passed in 0.70s ============================================================================

The output shows how many statements (i.e. not comments) are in a file, how many weren’t executed during testing, and the percentage of statements that were.

To improve our coverage, we also want to see exactly which lines we missed and we can determine this using the .coverage file produced by pytest. Unfortunately, this strategy becomes impractical when we are working with anything larger than our test package because the .coverage file becomes too convoluted to read. We will need more tools to help us determine how to improve our tests and that will be the subject of Code Coverage Part II, which we will cover later in the workshop.

Do we need to get 100% coverage?

Short answer: no. Code coverage is a useful tool to assess how comprehensive are our set of tests, and in general the higher our code coverage the better. However, trying to achieve 100% coverage on packages any larger than this sample package is a bit unrealistic and would require more time than that last bit of coverage is worth.

Final Repository State

You can see the final state of the repository after this section here.

You can also download a zip of the repository here.

Key Points#

Key Points

A good set of tests covers individual functions/features and behavior of the software as a whole.
It’s better to write tests during development so you can check your progress along the way.

Python Testing#

Why testing#

Levels of Testing#

The pytest testing framework#

Running our first test#

Adding tests to our package#

Failing tests#

Exercise - `calculate_angle` test#

Testing Expected Exceptions#

Exercise - Test Driven Development#

Advanced features of pytest#

Pytest Marks#

Pytest Fixtures#

Fixture Scope#

Pytest Parametrize#

Edge cases

Corner cases

Combining multiple parameters#

Testing Documentation Examples#

Code Coverage Part I#

Key Points#

Python Testing#

Why testing#

Levels of Testing#

The pytest testing framework#

Running our first test#

Adding tests to our package#

Failing tests#

Exercise - calculate_angle test#

Testing Expected Exceptions#

Exercise - Test Driven Development#

Advanced features of pytest#

Pytest Marks#

Pytest Fixtures#

Fixture Scope#

Pytest Parametrize#

Edge cases

Corner cases

Combining multiple parameters#

Testing Documentation Examples#

Code Coverage Part I#

Key Points#

Exercise - `calculate_angle` test#