Tutorial
In this section we present examples of setting up tests for workflow managers. The examples start from a simple setting and add complexity with each subsection. The tests and example workflow files can be found in the tests directory that ships with the package distribution.
Snakemake workflow
This section describes how to set up tests of a snakemake workflow. In order to use the test fixtures with the snakemake workflow manager, we need to set up a Snakefile and a test file.
The Snakefile defines rules that declare what to do with the input data. In a real-life scenario, we would run various bioinformatics applications to transform the input into some meaningful output. Here, we perform operations using basic shell commands, but the same principle applies to a bioinformatics workflow. The Snakefile in this example looks as follows:
# -*- snakemake -*-
rule all:
    input: ["results.txt"]

rule results:
    input: ["s1_1.fastq.gz", "s1_2.fastq.gz"]
    output: "results.txt"
    shell: "echo {input} > {output}"
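To make the expected result concrete, the following sketch mimics the shell command of the results rule in plain Python. It assumes only that snakemake expands {input} to the input file names separated by single spaces. The function name emulate_results_rule is hypothetical and is not part of snakemake or pytest-ngsfixtures.

```python
import os
import tempfile


def emulate_results_rule(workdir, inputs, output="results.txt"):
    """Write the space-separated input names to the output file,
    mimicking `echo {input} > {output}` (hypothetical helper)."""
    path = os.path.join(workdir, output)
    with open(path, "w") as fh:
        # echo terminates its output with a newline
        fh.write(" ".join(inputs) + "\n")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        result = emulate_results_rule(
            workdir, ["s1_1.fastq.gz", "s1_2.fastq.gz"]
        )
        with open(result) as fh:
            print(fh.read(), end="")
```

Unlike snakemake, this helper does not check that the input files exist; it only shows what the output file is expected to contain.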
Setting up a workflow with a predefined sample layout
The test file test_workflow_simple.py defines a test, test_workflow, that depends on the default samples fixture. As the Snakefile resides in the same directory, the snakefile fixture automatically detects it and sets up the file. If no Snakefile is present, the full path has to be passed with the snakefile argument to pytest.mark.snakefile.
By default, fixtures are set up to copy files to the test directory. By passing copy=False to the pytest.mark helpers, we use symlinks instead. In addition, we pass the option numbered=True to generate numbered output directories.
# -*- coding: utf-8 -*-
import pytest
from pytest_ngsfixtures.wm.snakemake import snakefile, run as snakemake_run


@pytest.mark.samples(copy=False, numbered=True)
@pytest.mark.snakefile(copy=False, dirname="snakefile", numbered=True)
def test_workflow(samples, snakefile):
    # Run snakemake with the samples directory as working directory
    snakemake_run(snakefile, options=["-d", str(samples)])
    assert samples.join("results.txt").exists()
The test_workflow() function requires the two fixtures samples and snakefile, and the workflow is run with the run() wrapper. Finally, we check that the workflow has run to completion by asserting the existence of the output file results.txt. Now, the tests can be run with the command:
pytest -v -s tests/test_workflow_simple.py
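The difference between the default copy behaviour and copy=False can be pictured with a small standard-library sketch. This is not the plugin's implementation; stage_file is a hypothetical helper, and the example assumes a POSIX system where symlinks can be created without special privileges.

```python
import os
import shutil
import tempfile


def stage_file(src, dest_dir, copy=True):
    """Place src in dest_dir as a copy or as a symlink (hypothetical helper)."""
    dest = os.path.join(dest_dir, os.path.basename(src))
    if copy:
        # An independent copy: changes do not affect the source file
        shutil.copy(src, dest)
    else:
        # A symlink: cheap to create, but points back at the source file
        os.symlink(os.path.abspath(src), dest)
    return dest


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "s1_1.fastq.gz")
        with open(src, "w") as fh:
            fh.write("placeholder\n")
        for copy in (True, False):
            dest_dir = tempfile.mkdtemp(dir=tmp)
            dest = stage_file(src, dest_dir, copy=copy)
            print("copy={}: symlink={}".format(copy, os.path.islink(dest)))
```

Symlinking avoids duplicating large sequence files, at the cost that the test directory is no longer isolated from the source data.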
Setting up a docker container fixture
In the following test script (test_workflow_docker.py), we add functionality to deal with container-based fixtures.
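First, we import the docker module to interact with docker, along with os and pytest. The snakemake helpers are imported as in the previous example:
import os
import docker
import pytest
from pytest_ngsfixtures.wm.snakemake import snakefile, run as snakemake_run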
Then, we add a container fixture that sets up a container based on the snakemake docker image quay.io/biocontainers/snakemake:X.Y.Z--py36_0, where X.Y.Z corresponds to the installed snakemake version:
@pytest.fixture(scope="session")
def container(request):
    def rm():
        # Finalizer: force-remove the container when the session ends
        print("Removing container ", container.name)
        container.remove(force=True)

    client = docker.from_env()
    try:
        image = client.images.get(pytest.snakemake_image)
    except docker.errors.ImageNotFound:
        # Pull the image if it is not available locally
        print("docker image '{}' not found; pulling".format(pytest.snakemake_image))
        client.images.pull(pytest.snakemake_image)
        image = client.images.get(pytest.snakemake_image)
    # Run as the current user and share /tmp with the host
    container = client.containers.create(image, tty=True,
                                         user="{}:{}".format(os.getuid(), os.getgid()),
                                         volumes={'/tmp': {'bind': '/tmp', 'mode': 'rw'}},
                                         working_dir="/tmp")
    # Register the finalizer only once the container exists
    request.addfinalizer(rm)
    return container
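The image lookup in the fixture follows a "get, else pull" pattern. The sketch below isolates that pattern so that it can be exercised without a docker daemon. FakeImages and the local ImageNotFound class are stand-ins written for this example; snakemake_image and get_or_pull are hypothetical helpers, the version 4.5.0 in the usage is an arbitrary example value, and the version--build tag layout is an assumption based on the image name quoted above.

```python
class ImageNotFound(Exception):
    """Stand-in for docker.errors.ImageNotFound."""


class FakeImages:
    """In-memory stand-in for the `client.images` collection."""

    def __init__(self, remote):
        self.remote = set(remote)  # images that can be pulled
        self.local = set()         # images already present locally
        self.pulled = []           # record of pull calls

    def get(self, name):
        if name not in self.local:
            raise ImageNotFound(name)
        return name

    def pull(self, name):
        if name not in self.remote:
            raise ImageNotFound(name)
        self.pulled.append(name)
        self.local.add(name)
        return name


def snakemake_image(version, build="py36_0"):
    """Build the image name for a snakemake version (assumed tag layout)."""
    return "quay.io/biocontainers/snakemake:{}--{}".format(version, build)


def get_or_pull(images, name, not_found=ImageNotFound):
    """Return a local image, pulling it first if it is missing."""
    try:
        return images.get(name)
    except not_found:
        images.pull(name)
        return images.get(name)


if __name__ == "__main__":
    name = snakemake_image("4.5.0")  # example version only
    images = FakeImages(remote=[name])
    get_or_pull(images, name)  # first call pulls the image
    get_or_pull(images, name)  # second call finds it locally
    print(images.pulled)
```

With the real docker SDK, the same helper could presumably be called as get_or_pull(client.images, name, not_found=docker.errors.ImageNotFound).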
Finally, we add the container fixture to the test function call:
@pytest.mark.samples(numbered=True)
@pytest.mark.snakefile(dirname="snakefile", numbered=True)
def test_workflow(snakefile, samples, container):
    container.start()
    for r in snakemake_run(snakefile,
                           options=["-d", str(samples), "-s", str(snakefile)],
                           container=container,
                           read=True, iterable=True):
        print(r)
    assert samples.join("results.txt").exists()
and run the test as
pytest -v -s tests/test_workflow_docker.py
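The container test passes host paths (str(samples) and str(snakefile)) to snakemake running inside the container. This can only work if those paths lie below a directory that is bind-mounted into the container, which is presumably why the fixture binds /tmp to /tmp. The sketch below checks this for a given volumes mapping; container_path is a hypothetical helper, the example paths are made up, and POSIX paths are assumed.

```python
import os

# Same mapping as in the container fixture
VOLUMES = {"/tmp": {"bind": "/tmp", "mode": "rw"}}


def container_path(host_path, volumes=VOLUMES):
    """Translate a host path to its location inside the container.

    Returns None if the path is not below any bind-mounted directory.
    """
    host_path = os.path.abspath(host_path)
    for host_dir, spec in volumes.items():
        host_dir = os.path.abspath(host_dir)
        # The path is visible if it lies below the mounted host directory
        if os.path.commonpath([host_path, host_dir]) == host_dir:
            rel = os.path.relpath(host_path, host_dir)
            return os.path.normpath(os.path.join(spec["bind"], rel))
    return None


if __name__ == "__main__":
    print(container_path("/tmp/pytest-of-user/samples0"))
    print(container_path("/home/user/data"))
```

If the test directories were created outside /tmp, the volumes mapping in the fixture would need to be extended accordingly.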
Parametrized tests on running environment
We have now written tests for local execution and execution in a container. By parametrizing the tests, we can combine the two cases in one test function. To do this we modify the test function as follows (see test_workflow_parametrize.py):
@pytest.mark.samples(numbered=True)
@pytest.mark.snakefile(numbered=True, dirname="snakefile")
@pytest.mark.parametrize("container", ["local", "docker"], indirect=["container"])
def test_workflow(snakefile, samples, container):
    # container is None for the local run
    if container is not None:
        container.start()
    for r in snakemake_run(snakefile,
                           options=["-d", str(samples), "-s", str(snakefile)],
                           container=container,
                           read=True, iterable=True):
        print(r)
    assert samples.join("results.txt").exists()
The parametrization is done indirectly via the container fixture. We modify this fixture to return None if the request parameter equals local, and if the parameter equals docker we return the container:
@pytest.fixture(scope="function")
def container(request):
    def rm():
        # Finalizer: force-remove the container after the test
        print("Removing container ", container.name)
        container.remove(force=True)

    # Local execution needs no container
    if request.param == "local":
        return None
    client = docker.from_env()
    try:
        image = client.images.get(pytest.snakemake_image)
    except docker.errors.ImageNotFound:
        # Pull the image if it is not available locally
        print("docker image {} not found; pulling".format(pytest.snakemake_image))
        client.images.pull(pytest.snakemake_image)
        image = client.images.get(pytest.snakemake_image)
    # Run as the current user and share /tmp with the host
    container = client.containers.create(image, tty=True,
                                         user="{}:{}".format(os.getuid(), os.getgid()),
                                         volumes={'/tmp': {'bind': '/tmp', 'mode': 'rw'}},
                                         working_dir="/tmp")
    # Register the finalizer only once the container exists
    request.addfinalizer(rm)
    return container
Now, running
pytest -v -s tests/test_workflow_parametrize.py
will execute two tests, one for the local environment, one in the docker container.
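The dispatch on request.param in the container fixture can be factored into a plain function, which makes it easy to check without pytest or docker. In the sketch below, resolve_environment and make_container are hypothetical names, and SimpleNamespace stands in for pytest's request object. Note that request.param only exists when the fixture is parametrized, so the sketch falls back to "local" when the attribute is missing; the fixture shown above does not do this.

```python
from types import SimpleNamespace


def resolve_environment(request, make_container):
    """Return None for a local run, or a container for a docker run.

    `make_container` is a zero-argument callable that creates the
    container; it is only called when the parameter is "docker".
    """
    # Fall back to "local" when the fixture is used without parametrization
    param = getattr(request, "param", "local")
    if param == "local":
        return None
    if param == "docker":
        return make_container()
    raise ValueError("unknown environment: {!r}".format(param))


if __name__ == "__main__":
    for value in ("local", "docker"):
        request = SimpleNamespace(param=value)
        print(value, resolve_environment(request, lambda: "fake-container"))
```

Rejecting unknown parameter values explicitly means that a typo in the parametrize list fails loudly instead of silently attempting to start a container.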