Usage¶
This section gives examples of general usage. For a tutorial on setting up workflow tests, jump to the Tutorial section. The Module Index contains more code examples related to each module and function.
Next-generation sequencing fixtures¶
One of the main purposes of pytest_ngsfixtures is to provide
functionality for setting up fixtures that can be used to test
applications, such as workflows. The predefined test fixtures consist
of a test path (formally a py._path.local.LocalPath object) in which
test files have been set up following a file organization scheme,
henceforth referred to as the test layout or simply the layout. A
layout is a set of links to (or copies of) the test data files.
Currently there are predefined test fixtures for sequence data and
reference data, intended mainly for testing analysis workflows from
scratch.
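The layout idea can be illustrated with the standard library alone. The following is a minimal sketch, not the implementation used by pytest_ngsfixtures: apply_layout is a hypothetical helper, and the file names are made up for the example. It assumes a layout is a mapping from a destination path (relative to the test path) to an existing data file, and it links each file into place, falling back to a copy where links are unavailable.

```python
import shutil
import tempfile
from pathlib import Path


def apply_layout(test_path, layout):
    """Link each source file into test_path under its layout key.

    Hypothetical helper: `layout` maps a destination path (relative to
    test_path) to the path of an existing test data file.
    """
    created = []
    for relative_name, source in layout.items():
        target = Path(test_path) / relative_name
        target.parent.mkdir(parents=True, exist_ok=True)
        try:
            target.symlink_to(Path(source).resolve())
        except OSError:
            # Fall back to a copy where symlinks are unavailable.
            shutil.copy(source, target)
        created.append(target)
    return created


with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    # A tiny stand-in for a real test data file.
    source = tmp / "source" / "reads.fastq"
    source.parent.mkdir()
    source.write_text("@read1\nACGT\n+\nIIII\n")

    # Two layout entries that point at the same data file.
    layout = {"data/s1.fastq": source, "data/s2.fastq": source}
    test_path = tmp / "test_path"
    files = apply_layout(test_path, layout)
    layout_names = sorted(p.relative_to(test_path).as_posix() for p in files)
    layout_content = files[0].read_text()
```

Because the layout only maps names to files, the same data file can appear under several sample names without being duplicated on disk.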
Fixtures¶
There are three main fixtures that can be configured with the pytest.mark helper. In general, the test data files are defined as a dictionary of key:value pairs that is passed via the data option (or a similar option) to the pytest.mark.testdata helper. Some fixtures predefine output directories, which can be configured with the dirname option. Each key corresponds to the test fixture file path relative to the pytest root directory, whereas the value is the path to the test data file. In addition, there is a testunit option that allows grouping fixtures in the same test directory.
Under the hood, the fixtures call the Fixture class to set up the
fixture. See Creating fixtures with the Fixture class for more
information.
pytest_ngsfixtures.plugin.testdata()¶
A generic fixture for setting up test data.
pytest_ngsfixtures.plugin.samples()¶
A fixture for setting up sequence read data. Data files are defined via the layout option and are placed in the data directory. The layout and dirname options can also be configured via pytest.mark.parametrize, which enables the parametrization over different sample layouts:
import pytest

@pytest.mark.parametrize("layout", [{'s1.fastq.gz': '/path/to/foo.fastq.gz'},
                                    {'s2.fastq.gz': '/path/to/foo.fastq.gz'}])
def test_samples(samples, layout):
    # samples is the test path; list the files set up by the layout
    print(samples.listdir())
A number of predefined layouts are available in the
pytest_ngsfixtures.config.layout dictionary.
pytest_ngsfixtures.plugin.ref()¶
A fixture for setting up reference data, by default in the data directory.
Files¶
Fixture files live in subdirectories of the pytest_ngsfixtures/data
directory:
ref/
    Reference data files that are used by default by the ref fixture.
seq/
    Sequence files.
The sequence directory consists of the following files:
File name                 Sample ID      Type               Population
------------------------  -------------  -----------------  -----------
CHS.HG00512_1.fastq.gz    CHS.HG00512    Individual         Han-Chinese
CHS.HG00513_1.fastq.gz    CHS.HG00513    Individual         Han-Chinese
CHS_1.fastq.gz            CHS            Pool               Han-Chinese
PUR.HG00731.A_1.fastq.gz  PUR.HG00731.A  Individual, run A  Puerto Rico
PUR.HG00731.B_1.fastq.gz  PUR.HG00731.B  Individual, run B  Puerto Rico
PUR.HG00733.A_1.fastq.gz  PUR.HG00733.A  Individual, run A  Puerto Rico
PUR.HG00733.B_1.fastq.gz  PUR.HG00733.B  Individual, run B  Puerto Rico
PUR_1.fastq.gz            PUR            Pool, run A        Puerto Rico
YRI.NA19238_1.fastq.gz    YRI.NA19238    Individual         Yoruban
YRI.NA19239_1.fastq.gz    YRI.NA19239    Individual         Yoruban
YRI_1.fastq.gz            YRI            Pool               Yoruban
Corresponding files exist for read 2. The sequence files were generated from 1000 Genomes Project data, with two individuals each from the populations CHS (Han-Chinese), PUR (Puerto Rico) and YRI (Yoruban). The reads were selected based on mappings to a variable region on chromosome 6, so that running variant callers on the different data sets generates differing variant call sets. The pools are simply concatenated versions of the individual files, with a ploidy of 4.
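To make the pooling step concrete, the sketch below builds a pool by concatenating two gzip-compressed FASTQ files. It is an illustration under assumptions, not the procedure actually used to create the shipped files: the helper names, file names and reads are made up, and concatenating at the byte level is only one way to combine the files. It works because a file made of several gzip members is itself a valid gzip file, which Python's gzip module reads as one continuous stream.

```python
import gzip
import tempfile
from pathlib import Path


def write_fastq_gz(path, records):
    """Write (name, sequence) pairs as a gzip-compressed FASTQ file."""
    with gzip.open(path, "wt") as handle:
        for name, sequence in records:
            handle.write(f"@{name}\n{sequence}\n+\n{'I' * len(sequence)}\n")


def concatenate(sources, destination):
    """Append the raw bytes of each source file to destination."""
    with open(destination, "wb") as out:
        for source in sources:
            out.write(Path(source).read_bytes())


def count_reads(path):
    """Count FASTQ records (four lines each) in a gzip-compressed file."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for _ in handle) // 4


with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    # Hypothetical file names; the reads are toy data.
    first = tmp / "individual1_1.fastq.gz"
    second = tmp / "individual2_1.fastq.gz"
    pool = tmp / "pool_1.fastq.gz"
    write_fastq_gz(first, [("r1", "ACGT"), ("r2", "GGCA")])
    write_fastq_gz(second, [("r3", "TTAG")])

    # The pool holds the reads of both individuals.
    concatenate([first, second], pool)
    pool_reads = count_reads(pool)
```

A pool of two diploid individuals carries four chromosome copies, which matches the ploidy of 4 stated above.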
Advanced usage¶
Parametrizing existing sample layouts¶
pytest supports parametrizing tests over fixtures. The following code example shows how to parametrize over the predefined layouts:
import pytest

@pytest.fixture(scope="function", autouse=False)
def data(request):
    # Look up the layout fixture whose name was passed as the parameter
    return request.getfixturevalue(request.param)

# Note: the pytest.config global was removed in pytest 5.0
@pytest.mark.parametrize("data", pytest.config.getoption("ngs_layout", ["sample"]), indirect=["data"])
def test_run(data):
    # Do something with data
    pass
Here, we define an indirect fixture that calls one of the predefined
layout fixtures by means of the request.getfixturevalue function.
Grouping fixtures in test directories¶
When parametrizing fixtures over several conditions, it may be of interest to group fixtures in separate parametrized test directories. This can be achieved by using the testunit fixture option, as the following example shows:
import pytest

@pytest.mark.parametrize("testunit", ["context1", "context2"])
def test_with_context(samples, ref, testunit):
    # Do something with data.
    # Sample data will end up in context1/data and reference data in
    # context1/ref for context1, and so on.
    pass
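The resulting directory structure can be sketched without pytest. The helper below is hypothetical and only mirrors the comment in the example above: it assumes that a fixture's directory name is nested under the test unit when one is given, and that the directory names are data and ref as in that comment.

```python
from pathlib import PurePosixPath


def fixture_directory(dirname, testunit=None):
    """Return the directory of a fixture relative to the test root.

    Hypothetical helper: nests dirname under testunit when one is given.
    """
    if testunit is None:
        return PurePosixPath(dirname)
    return PurePosixPath(testunit) / dirname


# One group of fixture directories per parametrized test unit.
grouped = {
    testunit: [str(fixture_directory(d, testunit)) for d in ("data", "ref")]
    for testunit in ("context1", "context2")
}
ungrouped = str(fixture_directory("data"))
```

Each test unit gets its own copy of the fixture directories, so fixtures from different parametrizations do not overwrite each other.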
Creating fixtures with the Fixture class¶
In addition to using and configuring the predefined fixtures, you can
set up fixtures by calling the Fixture class directly. The path option
can be used to override the invocation of tmpdir_factory that is
otherwise made at fixture setup. This feature is primarily useful when
fixtures have to take parametrized values into account.
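import pytest
from pytest_ngsfixtures.plugin import Fixture

@pytest.fixture
def metadata(request):
    # Reuse the test path of the samples fixture instead of a new tmpdir
    p = Fixture(request, path=request.getfixturevalue("samples"))
    return p

# layout1 and layout2 are layout dictionaries defined elsewhere
@pytest.mark.parametrize("layout", [layout1, layout2])
def test_layout(samples, layout, metadata):
    # Do something with data
    pass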