Pytest and Testcontainers
Intro
I’ve been a Java programmer for a very long time, but nowadays I’m involved in a Python project at work. One of the things I was missing from the Java ecosystem was Testcontainers. Long story short, Testcontainers is a project that lets your project run Docker containers for integration testing purposes. It turns out there’s a port for Python, and I’d like to figure out how to use it.
The challenge
I’m creating a persistence layer with SQLAlchemy and I’d like to use Testcontainers:
- to start up a PostgreSQL container
- to create the schema
- to run the tests against the database
- to truncate all tables before running each test
- to stop the database at the end of the test suite (end of pytest session)
The source code of this entry can be found at Github |
Here’s my Pipfile:
[[source]]
name = "pytest_testcontainers"
url = "https://pypi.python.org/simple"
version = "0.1.0"
description = ""

[packages]
sqlalchemy = ">=1.3"
pg8000 = ">=1.13"
psycopg2-binary = "*"

[dev-packages]
pytest = ">=3.0"
pytest-cov = ">=2.7.1"
flake8 = ">=3.7"
testcontainers = ">=2.5"

[requires]
python = ">=3.7"
Notice that although I’ve added both pg8000 and psycopg2, I’m only using the latter in this example. There’s a note about why at the end of the blog entry.
Repository
I’m creating a repository which wraps the calls to the SQLAlchemy session. In a Python project, or at least what I’m seeing lately in some Python projects, the SQLAlchemy session is created statically somewhere from the configuration file and repositories import that reference and use it. The problem with that is that TestContainers by default creates a Docker container with your database with a random port meaning that you won’t know all the connection details until the container is running. Bottom line, the cleanest solution I could think of was dependency injection via class constructor injection.
Although Testcontainers picks a random port by default, you can specify one explicitly. |
This is what the repository might look like:
from blog.models import BlogEntry

class BlogEntryRepository:
    def __init__(self, session):
        self.session = session

    def find_by_id(self, id):
        return self.session.query(BlogEntry).filter_by(id=id).first()

    def find_all_by_title_starts_with(self, title_starts_with):
        by_startswith = BlogEntry.title.startswith(title_starts_with)
        return self.session.query(BlogEntry).filter(by_startswith).all()

    def save(self, blog_entry):
        self.session.add(blog_entry)
        self.session.flush()
        return blog_entry
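Because the session arrives through the constructor, the repository can be exercised with any object exposing the session API. A minimal sketch, using a hypothetical FakeSession stand-in instead of a real SQLAlchemy session:

```python
class FakeSession:
    """Toy stand-in for a SQLAlchemy session (illustration only)."""
    def __init__(self):
        self.added = []

    def add(self, obj):
        self.added.append(obj)

    def flush(self):
        pass


class BlogEntryRepository:
    """Same constructor-injection shape as the repository above."""
    def __init__(self, session):
        self.session = session

    def save(self, blog_entry):
        self.session.add(blog_entry)
        self.session.flush()
        return blog_entry


fake = FakeSession()
repository = BlogEntryRepository(fake)
saved = repository.save({"title": "Pytest and Testcontainers"})
print(len(fake.added))  # 1
```

This is the whole point of injecting the session instead of importing it statically: the collaborator can be swapped without touching the repository code.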
Pytest fixtures (conftest.py)
The next step is to optimize the creation and disposal of the database. Basically, I don’t want to create an instance of the database for each test function; that would be such a waste. Instead, I would like to create one database instance spanning the whole pytest session and eventually shut it down. Apart from that, once the database is up and running I need to create a session and pass it to every test in case they need it, for instance to inject it into a repository. So the idea is:
- to start the container
- to create the session with the container details
- to inject the session in every test so that it can be used to initialize the repository
In order to do that I’m using pytest fixtures, which will help me reuse some parts among my tests (the SQLAlchemy session) and run the Docker container in a more efficient way (one database instance for all my tests). There are several places where you can put your fixtures so that tests can be aware of them. I’m putting all my fixtures in the conftest.py file.
Although one conftest.py file at the top of your project modules can be visible to all your tests, you can also use one conftest.py file per directory |
These are the required imports to create my database session fixture:
import logging
import pytest
from sqlalchemy import create_engine
from sqlalchemy.orm import (scoped_session, sessionmaker)
from testcontainers.postgres import PostgresContainer
from blog.models import Base  # declarative base used below to create the schema
Session fixtures
Session fixtures are shared among all tests during a Pytest execution. So it makes sense to create a fixture starting up the Docker container, and shutting it down once the session ends.
log = logging.getLogger()

@pytest.fixture(scope="session")
def session(request):
    log.info("[fixture] starting db container")
    postgres = PostgresContainer("postgres:9.5") (1)
    postgres.start()
    log.info("[fixture] connecting to: {}".format(postgres.get_connection_url()))

    # create session with db container information
    engine = create_engine(postgres.get_connection_url()) (2)
    session = scoped_session(sessionmaker(autocommit=False, autoflush=False, bind=engine))

    # create schema in database
    Base.metadata.create_all(engine) (3)

    def stop_db(): (4)
        log.info("[fixture] stopping db container")
        postgres.stop()

    request.addfinalizer(stop_db) (5)
    return session (6)
1 | Create and start the PostgreSQL container (9.5) |
2 | Create the database session from the container’s details |
3 | Create the database schema |
4 | Create a session finalizer to stop the container once the session ends |
5 | Add the finalizer to pytest |
6 | Return the database session. That will inject the database session into whatever function demands it |
An alternative option for executing teardown code is to make use of the addfinalizer method of the
request-context object to register finalization functions. That’s why we declare the request parameter
in our fixtures, to be able to get the request-context object.
|
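The finalizer mechanics can be illustrated without Docker at all. Here is a small sketch with a hypothetical FakeRequest stand-in, showing that pytest runs finalizers in reverse registration order:

```python
class FakeRequest:
    """Toy stand-in for pytest's request-context object (illustration only)."""
    def __init__(self):
        self._finalizers = []

    def addfinalizer(self, fn):
        self._finalizers.append(fn)

    def teardown(self):
        # pytest invokes registered finalizers in reverse registration order
        while self._finalizers:
            self._finalizers.pop()()


events = []
request = FakeRequest()
request.addfinalizer(lambda: events.append("close session"))
request.addfinalizer(lambda: events.append("stop container"))
request.teardown()
print(events)  # ['stop container', 'close session']
```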
Dependency injection also works among session fixtures themselves. If you would like to provide a pytest fixture that requires a previously configured pytest session fixture, you only have to declare the dependency as a fixture parameter:
@pytest.fixture(scope="session")
def factories(request, session):
    return Factories(session)
Here, to build the factories that create domain objects in my tests, I need the configured database session. Declaring the dependency as a parameter makes pytest execute the dependency first and then inject it into this fixture.
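The Factories class itself isn’t shown in this entry. Here is a hypothetical sketch of what it could look like, with plain stand-ins for the model and the session so it runs on its own:

```python
import itertools


class BlogEntry:
    """Plain stand-in for the real SQLAlchemy model (illustration only)."""
    _ids = itertools.count(1)

    def __init__(self, title):
        self.id = next(self._ids)
        self.title = title


class Factories:
    """Hypothetical sketch of the Factories helper used by the tests."""
    def __init__(self, session):
        self.session = session

    def create_blog_entry(self, title="A blog entry"):
        entry = BlogEntry(title)
        self.session.add(entry)   # persist through the injected session
        self.session.flush()      # make generated values available
        return entry


class RecordingSession:
    """Minimal session stand-in that records added objects."""
    def __init__(self):
        self.added = []

    def add(self, obj):
        self.added.append(obj)

    def flush(self):
        pass


factories = Factories(RecordingSession())
entry = factories.create_blog_entry(title="Pytest introduction 0")
print(entry.title)  # Pytest introduction 0
```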
Function fixtures
Something I’d like to happen before running a new test is to make sure all data from other tests has been erased. Therefore it seems like a good idea to create a function fixture that truncates all tables before each test execution.
@pytest.fixture(scope="function", autouse=True)
def cleanup(request, session): (1)
    log.info("[fixture] truncating all tables")

    # truncating all tables
    for table in reversed(Base.metadata.sorted_tables): (2)
        session.execute(table.delete())

    def function_ends(): (3)
        log.info("[fixture] closing db session")
        session.commit()
        session.close()

    request.addfinalizer(function_ends) (4)
1 | Inject the database session from the previous pytest session fixture |
2 | Truncate all tables from the schema |
3 | Create a finalizer function to commit and close the session after each test |
4 | Add the finalizer function to the pytest lifecycle so it runs at the end of each test |
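Why reversed(...)? sorted_tables is ordered by foreign-key dependency, parents first, so deleting in reverse clears child rows before the parent rows they reference. A tiny sketch of the idea with plain strings (the table names are made up):

```python
# sorted_tables lists parents before the children that reference them
sorted_tables = ["authors", "blog_entries", "comments"]

# deleting in reverse order avoids foreign-key violations
delete_order = list(reversed(sorted_tables))
print(delete_order)  # ['comments', 'blog_entries', 'authors']
```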
Tests
Now that the database is running and the database session is created thanks to the pytest fixtures above, we can use them in our tests just by declaring them as test parameters.
from blog.repositories import BlogEntryRepository

def test_find_by_id(session, factories):
    # given: a new blog entry
    saved_entry = factories.create_blog_entry()

    # and: a blog entry repository
    repository = BlogEntryRepository(session)

    # when: trying to find it by its id
    blog_entry = repository.find_by_id(saved_entry.id)

    # then: I should be able to get it back
    assert str(blog_entry.id) == str(saved_entry.id)

def test_find_all_by_startswith(session, factories):
    # given: a couple of new blog entries
    for i in range(2):
        factories.create_blog_entry(
            title="Pytest introduction {}".format(i)
        )

    # and: a blog entry repository
    repository = BlogEntryRepository(session)

    # when: looking for entries starting with Pytest
    results = repository.find_all_by_title_starts_with("Pytest")

    # then: there should be two of them
    assert len(results) == 2
Improvements
Well, maybe to be coherent with the rest of the application, I could create a test configuration file and fill in the container startup configuration with values from there. Most of the connection parameters are available programmatically when bootstrapping a PostgreSQL container.
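A minimal sketch of that idea, assuming a hypothetical [database] section in an INI-style test configuration file (stdlib configparser only):

```python
import configparser

# hypothetical test configuration; in practice this would live in a file
TEST_CONFIG = """
[database]
image = postgres:9.5
user = test
password = test
dbname = test
"""

config = configparser.ConfigParser()
config.read_string(TEST_CONFIG)

db = config["database"]
print(db["image"])  # postgres:9.5
# these values could then be passed along when starting the container
```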
In TestContainers 2.5 the PostgreSQL connection dialect is
hardcoded to postgresql+psycopg2. If you’d like to change it to, for example,
pg8000, you’ll have to inherit from PostgresContainer and
override the get_connection_url function.
|
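Here is a self-contained sketch of that override, using a stand-in parent class so it runs without Docker; the real version would inherit from testcontainers’ PostgresContainer instead:

```python
class FakePostgresContainer:
    """Stand-in for testcontainers' PostgresContainer (illustration only)."""
    def get_connection_url(self):
        # mimics the hardcoded psycopg2 dialect of the real container
        return "postgresql+psycopg2://test:test@localhost:5432/test"


class Pg8000Container(FakePostgresContainer):
    def get_connection_url(self):
        # swap the hardcoded driver for pg8000 in the SQLAlchemy URL
        url = super().get_connection_url()
        return url.replace("postgresql+psycopg2", "postgresql+pg8000")


print(Pg8000Container().get_connection_url())
# postgresql+pg8000://test:test@localhost:5432/test
```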