2019-10-11

Pytest and Testcontainers

Intro

I’ve been a Java programmer for a very long time, but nowadays I’m involved in a Python project at work. One of the things I missed from the Java ecosystem was Testcontainers. Long story short, Testcontainers is a project that lets you run Docker containers for integration testing. It turns out there’s a port for Python, and I’d like to figure out how to use it.

The challenge

I’m creating a persistence layer with SQLAlchemy and I’d like to use TestContainers:

to start up a PostgreSQL container
to create the schema
to run the tests against the database
to truncate all tables before running each test
to stop the database at the end of the test suite (end of pytest session)

The source code of this entry can be found on GitHub.

Here’s my Pipfile:

Dependencies

[[source]]
name = "pytest_testcontainers"
url = 'https://pypi.python.org/simple'
version = "0.1.0"
description = ""

[packages]
sqlalchemy = ">=1.3"
pg8000 = ">=1.13"
psycopg2-binary = "*"

[dev-packages]
pytest = ">=3.0"
pytest-cov = ">=2.7.1"
flake8 = ">=3.7"
testcontainers = ">=2.5"

[requires]
python = ">=3.7"

Notice that although I’ve added both pg8000 and psycopg2, I’m only using the latter in this example. There’s a note explaining why at the end of this blog entry.

Repository

I’m creating a repository which wraps the calls to the SQLAlchemy session. In a Python project, or at least in some Python projects I’ve seen lately, the SQLAlchemy session is created statically somewhere from the configuration file, and repositories import that reference and use it. The problem is that TestContainers by default starts the database container on a random port, meaning that you won’t know all the connection details until the container is running. Bottom line, the cleanest solution I could think of was dependency injection through the class constructor.

Although TestContainers picks a random port by default, you can specify a port explicitly.

This is what the repository may look like:

repository with injected session

from blog.models import BlogEntry

class BlogEntryRepository:
    def __init__(self, session):
        self.session = session

    def find_by_id(self, id):
        return self.session.query(BlogEntry).filter_by(id=id).first()

    def find_all_by_title_starts_with(self, title_starts_with):
        by_startswith = BlogEntry.title.startswith(title_starts_with)

        return self.session.query(BlogEntry).filter(by_startswith).all()

    def save(self, blog_entry):
        self.session.add(blog_entry)
        self.session.flush()

        return blog_entry
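
One side effect of constructor injection is that the repository no longer cares where the session comes from. As a minimal sketch, the snippet below passes a hypothetical RecordingSession stand-in (not part of SQLAlchemy) to a trimmed-down copy of the repository that keeps only save, so it runs without a database:

```python
class RecordingSession:
    """Hypothetical stand-in that mimics only the two session methods save() uses."""

    def __init__(self):
        self.added = []
        self.flush_count = 0

    def add(self, instance):
        self.added.append(instance)

    def flush(self):
        self.flush_count += 1


class BlogEntryRepository:
    """Trimmed copy of the repository above, keeping only save()."""

    def __init__(self, session):
        self.session = session

    def save(self, blog_entry):
        self.session.add(blog_entry)
        self.session.flush()

        return blog_entry


# The session is handed in from outside, exactly as a pytest fixture would do it.
session = RecordingSession()
repository = BlogEntryRepository(session)
entry = repository.save({"title": "Pytest and Testcontainers"})
```

In the real tests the stand-in is replaced by the SQLAlchemy session that is built once the container is running.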

Pytest fixtures (conftest.py)

The next step is to optimize the creation and disposal of the database. I don’t want to create a database instance for each test function; that would be such a waste. Instead, I’d like one database instance spanning the whole pytest session, shut down when the session ends. Apart from that, once the database is up and running I need to create a session and pass it to every test that may need it, for instance to inject it into a repository. So the idea is:

to start the container
to create the session with the container details
to inject the session in every test so that it can be used to initialize the repository

To do that I’m using pytest fixtures, which help me reuse some parts across my tests (the SQLAlchemy session) and run the Docker container more efficiently (one database instance for all my tests). There are several places where you can put your fixtures so that tests are aware of them. I’m putting all of mine in the conftest.py file.

Although one conftest.py file at the top of your project modules can be visible to all your tests, you can also use one conftest.py file per directory.

These are the required imports to create my database session fixture:

imports

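import logging

import pytest

from sqlalchemy import create_engine
from sqlalchemy.orm import (scoped_session, sessionmaker)

from testcontainers.postgres import PostgresContainer

# Assumption: the declarative Base lives next to BlogEntry in blog.models
from blog.models import Base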

Session fixtures

Session fixtures are shared among all tests during a Pytest execution. So it makes sense to create a fixture starting up the Docker container, and shutting it down once the session ends.

fixtures for pytest session and per function

log = logging.getLogger()


@pytest.fixture(scope="session")
def session(request):
    log.info("[fixture] starting db container")

    postgres = PostgresContainer("postgres:9.5")  # (1)
    postgres.start()

    log.info("[fixture] connecting to: {}".format(postgres.get_connection_url()))

    # create session with db container information
    engine = create_engine(postgres.get_connection_url())  # (2)
    session = scoped_session(
        sessionmaker(autocommit=False, autoflush=False, bind=engine)
    )

    # create schema in database
    Base.metadata.create_all(engine)  # (3)

    def stop_db():  # (4)
        log.info("[fixture] stopping db container")
        postgres.stop()

    request.addfinalizer(stop_db)  # (5)

    return session  # (6)

1	Create and start the PostgreSQL container (9.5)
2	Create the database session from the container’s connection details
3	Create the database schema
4	Create a finalizer to stop the container once the pytest session ends
5	Register the finalizer with pytest
6	Return the database session so pytest can inject it into any test or fixture that requests it

An alternative option for executing teardown code is to make use of the addfinalizer method of the request-context object to register finalization functions. That’s why we declare the request parameter in our fixtures, to be able to get the request-context object.
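
For comparison, the approach the pytest documentation recommends for teardown code is a yield fixture: everything before the yield is setup and everything after it is teardown. The sketch below uses a hypothetical FakeContainer in place of PostgresContainer so it runs without Docker; the fixture name container is also an assumption:

```python
import pytest


class FakeContainer:
    """Hypothetical stand-in for PostgresContainer; it only tracks its state."""

    def __init__(self):
        self.running = False

    def start(self):
        self.running = True

    def stop(self):
        self.running = False


def container_lifecycle():
    container = FakeContainer()
    container.start()  # setup: runs before the first test that needs it
    yield container  # the value pytest injects into tests
    container.stop()  # teardown: runs once the pytest session ends


# Register the generator as a session-scoped fixture named "container".
container = pytest.fixture(scope="session", name="container")(container_lifecycle)
```

Both styles end up doing the same thing here; addfinalizer is mostly useful when the teardown has to be registered conditionally or from a helper function.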

Dependency injection also works between fixtures themselves. If you’d like to provide a pytest fixture that requires a previously configured session fixture, you only have to declare the dependency as a fixture parameter:

injecting fixtures as parameters in another fixture

@pytest.fixture(scope="session")
def factories(request, session):
    return Factories(session)

Here I need the configured database session to build the factories that create domain objects in my tests. Because the dependency is declared as a parameter, pytest runs the session fixture first and then injects its result into this fixture.
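
The Factories class itself is not shown in this entry. Here is a minimal sketch of what it might look like, assuming a create_blog_entry method that accepts an optional title, as the tests further down suggest. The BlogEntry dataclass and StubSession are hypothetical stand-ins for the real model and the SQLAlchemy session, so the snippet runs on its own:

```python
import itertools
from dataclasses import dataclass


@dataclass
class BlogEntry:
    """Hypothetical stand-in for the real SQLAlchemy model in blog.models."""

    title: str
    id: int = None


class StubSession:
    """Hypothetical stand-in: assigns ids on flush, roughly as a database would."""

    def __init__(self):
        self.pending = []
        self._ids = itertools.count(1)

    def add(self, instance):
        self.pending.append(instance)

    def flush(self):
        for instance in self.pending:
            if instance.id is None:
                instance.id = next(self._ids)


class Factories:
    """Creates persisted domain objects for tests through the injected session."""

    def __init__(self, session):
        self.session = session

    def create_blog_entry(self, title="Some title"):
        entry = BlogEntry(title=title)
        self.session.add(entry)
        self.session.flush()  # makes the generated id available to the test

        return entry


factories = Factories(StubSession())
first = factories.create_blog_entry()
second = factories.create_blog_entry(title="Pytest introduction 0")
```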

Function fixtures

Before each test runs, I’d like to make sure that all data from other tests has been erased. Therefore it seems like a good idea to create a function fixture that truncates all tables before each test execution.

fixtures for pytest session and per function

@pytest.fixture(scope="function", autouse=True)
def cleanup(request, session):  # (1)
    log.info("[fixture] truncating all tables")

    # truncating all tables
    for table in reversed(Base.metadata.sorted_tables):  # (2)
        session.execute(table.delete())

    def function_ends():  # (3)
        log.info("[fixture] closing db session")
        session.commit()
        session.close()

    request.addfinalizer(function_ends)  # (4)

1	injecting database session from previous pytest session fixture
2	truncating all tables from schema
3	create a finalizer function to commit and close session after each test
4	add the finalizer function to pytest lifecycle to happen at the end of each test
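
Base.metadata.sorted_tables lists tables in foreign-key dependency order, parents first, so iterating it in reverse deletes child rows before the rows they reference. The sketch below illustrates why that order matters. It uses the standard library’s sqlite3 in place of PostgreSQL and two hypothetical tables (author and blog_entry) that are not part of this project:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE blog_entry ("
    "id INTEGER PRIMARY KEY, "
    "author_id INTEGER NOT NULL REFERENCES author(id))"
)
conn.execute("INSERT INTO author (id) VALUES (1)")
conn.execute("INSERT INTO blog_entry (id, author_id) VALUES (1, 1)")

# Parents first, which is the order sorted_tables would report.
dependency_order = ["author", "blog_entry"]


def delete_all(connection, tables):
    for table in tables:
        connection.execute("DELETE FROM {}".format(table))


# Deleting the parent first violates the foreign key held by blog_entry.
try:
    delete_all(conn, dependency_order)
    parent_first_failed = False
except sqlite3.IntegrityError:
    parent_first_failed = True

# Deleting in reverse order removes the children first and succeeds.
delete_all(conn, reversed(dependency_order))
remaining_authors = conn.execute("SELECT COUNT(*) FROM author").fetchone()[0]
remaining_entries = conn.execute("SELECT COUNT(*) FROM blog_entry").fetchone()[0]
```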

Tests

Now that the database is running and the database session has been created by the pytest fixtures above, we can use both in our tests simply by declaring them as test parameters.

tests using db session and factories

from blog.repositories import BlogEntryRepository


def test_find_by_id(session, factories):
    # given: a new blog entry
    saved_entry = factories.create_blog_entry()

    # and: a blog entry repository
    repository = BlogEntryRepository(session)

    # when: trying to find it by its id
    blog_entry = repository.find_by_id(saved_entry.id)

    # then: I should be able to get it back
    assert str(blog_entry.id) == str(saved_entry.id)


def test_find_all_by_startswith(session, factories):
    # given: two blog entries whose titles start with "Pytest"
    for i in range(2):
        factories.create_blog_entry(
            title="Pytest introduction {}".format(i)
        )

    # and: a blog entry repository
    repository = BlogEntryRepository(session)

    # when: looking for entries starting with Pytest
    results = repository.find_all_by_title_starts_with("Pytest")

    # then: both entries should be returned
    assert len(results) == 2

Improvements

Well, maybe to be coherent with the rest of the application, I could create a test configuration file and fill in the container startup configuration with the values there. Most of the connection parameters are available programmatically when bootstrapping a PostgreSQL container.
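
As a rough sketch of that idea, the container image could be read from a test configuration file with the standard library’s configparser. The section and key names below (test.database, image) are hypothetical, and the file content is inlined so the snippet runs on its own:

```python
import configparser

# Hypothetical test configuration; in a project this would live in its own file.
TEST_CONFIG = """
[test.database]
image = postgres:9.5
"""


def load_container_image(config_text, default="postgres:latest"):
    """Return the configured image name, or the default when it is not set."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)

    return parser.get("test.database", "image", fallback=default)


image = load_container_image(TEST_CONFIG)
```

The resulting value could then be passed to PostgresContainer(image) in the session fixture instead of the hardcoded string.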

In TestContainers 2.5 the PostgreSQL connection dialect is hardcoded to postgresql+psycopg2. If you’d like to change it to, for example, pg8000, you’ll have to inherit from PostgresContainer and override the get_connection_url function.
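
Here is a minimal sketch of that override, written as a mixin so it can be shown without starting a container. Pg8000UrlMixin and FakeContainer are hypothetical names. The sketch assumes the URL returned by get_connection_url starts with postgresql+psycopg2://, which is worth checking against the installed TestContainers version:

```python
class Pg8000UrlMixin:
    """Rewrites the driver part of the URL returned by the parent class."""

    PSYCOPG2_PREFIX = "postgresql+psycopg2://"  # assumed default prefix
    PG8000_PREFIX = "postgresql+pg8000://"

    def get_connection_url(self):
        url = super().get_connection_url()
        if url.startswith(self.PSYCOPG2_PREFIX):
            return self.PG8000_PREFIX + url[len(self.PSYCOPG2_PREFIX):]

        return url  # unknown prefix: leave the URL untouched


class FakeContainer:
    """Hypothetical stand-in for PostgresContainer with a fixed URL."""

    def get_connection_url(self):
        return "postgresql+psycopg2://test:test@localhost:5432/test"


class FakePg8000Container(Pg8000UrlMixin, FakeContainer):
    pass


url = FakePg8000Container().get_connection_url()
```

With the real library the last class would presumably become class Pg8000PostgresContainer(Pg8000UrlMixin, PostgresContainer), with the mixin listed first so its get_connection_url wins.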