Testing in Splink

Tests in Splink make use of the pytest framework. You can find the tests themselves in the tests folder.

Splink tests can be broadly categorised into three sets:

  • 'Core' tests - these are tests which test some specific bit of functionality which does not depend on any specific SQL dialect. They are usually unit tests - examples are testing InputColumn and testing the latitude-longitude distance calculation.
  • Backend-agnostic tests - these are tests which run against some SQL backend, but which are written in such a way that they can run against many backends by making use of the backend-agnostic testing framework. The majority of tests are of this type.
  • Backend-specific tests - these are tests which run against a specific SQL backend, and test some feature particular to this backend. There are not many of these, as Splink is designed to run very similarly independent of the backend used.

Running tests

Running tests locally

To run tests locally, simply run:

python3 -m pytest tests/

or alternatively

pytest tests/

To run a single test file, append the filename to the tests/ folder call, for example:

pytest tests/test_u_train.py

or for a single test, additionally append the test name after a pair of colons, as:

pytest tests/test_u_train.py::test_u_train_multilink

Further useful pytest options

Library dependencies and other sources may emit many warnings that clutter your output. If so, you can use --disable-pytest-warnings or -W ignore so that these are not displayed. Some additional command-line options that may be useful:

  • -s to disable output capture, so that test output is displayed in the terminal in all cases
  • -v for verbose mode, where each test instance will be displayed on a separate line with status
  • -q for quiet mode, where output is extremely minimal
  • -x to fail on first error/failure rather than continuing to run all selected tests
  • -m some_mark run only those tests marked with some_mark - see below for useful options here

For instance usage might be:

# ignore warnings, display output
pytest -W ignore -s tests/

or

# ignore warnings, verbose output, fail on first error/failure
pytest -W ignore -v -x tests/

You can find a host of other available options using pytest's in-built help:

pytest -h
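If you drive test runs from a script, the options above can be assembled into an argument list. The following is only an illustrative sketch: the helper build_pytest_args and its parameters are hypothetical and not part of Splink. The resulting list could be handed to subprocess.run.

```python
import sys


def build_pytest_args(target="tests/", mark=None, ignore_warnings=False,
                      verbose=False, fail_fast=False):
    """Assemble a pytest command line as a list (hypothetical helper, not part of Splink)."""
    # Run pytest through the current interpreter, as in `python3 -m pytest`
    args = [sys.executable, "-m", "pytest"]
    if ignore_warnings:
        args += ["-W", "ignore"]   # hide warnings
    if verbose:
        args.append("-v")          # one line per test instance
    if fail_fast:
        args.append("-x")          # stop at the first error/failure
    if mark is not None:
        args += ["-m", mark]       # run only tests carrying this mark
    args.append(target)
    return args


# Ignore warnings, verbose output, fail on first error, for a single test file
cmd = build_pytest_args(
    "tests/test_u_train.py", ignore_warnings=True, verbose=True, fail_fast=True
)
```

Here cmd[0] is the path of the running Python interpreter, and the remaining items are the pytest arguments in the order they would be typed on the command line.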

Running tests for specific backends or backend groups

You may wish to run tests relating to specific backends, tests which are backend-independent, or any combination of these. Splink allows for various combinations by making use of pytest's mark feature.

If you pass no marks explicitly when you invoke pytest, there will be an implicit mark of default, as per the pytest ini options configured in pyproject.toml.

The available options are:

Run core tests

Option for running only the backend-independent 'core' tests:

  • pytest tests/ -m core - run only the 'core' tests, meaning those without dialect-dependence. In practice this means any test that hasn't been decorated using mark_with_dialects_excluding or mark_with_dialects_including.

Run tests on a specific backend

Options for running tests on one backend only - this includes tests written specifically for that backend, as well as backend-agnostic tests supported for that backend.

  • pytest tests/ -m duckdb - run all duckdb tests, and all core tests
    • and similarly for other dialects
  • pytest tests/ -m duckdb_only - run all duckdb tests only, and not the core tests
    • and similarly for other dialects

Run tests across multiple backends

Options for running tests on multiple backends (including all backends) - this includes tests written specifically for those backends, as well as backend-agnostic tests supported for those backends.

  • pytest tests/ -m default or equivalently pytest tests/ - run all tests in the default group. The default group consists of the core tests, and those dialects in the default group - currently spark and duckdb.
    • Other groups of dialects can be added and will similarly run with pytest tests/ -m new_dialect_group. Dialects within the current scope of testing and the groups they belong to are defined in the dialect_groups dictionary in tests/decorator.py
  • pytest tests/ -m all - run all tests for all available dialects
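The selection rules above can be summarised with a small toy model. This is not Splink's implementation (the real groups live in the dialect_groups dictionary in tests/decorator.py). The dialect list, the contents of the "all" group, and the assumption that core tests are also selected by -m all are illustrative guesses; only "default" containing duckdb and spark is stated in the text.

```python
# Toy model of which marks select which tests. NOT Splink's code.
DIALECT_GROUPS = {
    "default": ["duckdb", "spark"],                     # as stated above
    "all": ["duckdb", "spark", "sqlite", "postgres"],   # assumed dialect list
}


def marks_for_dialect_test(dialect):
    """Marks carried by a test when it runs against one dialect."""
    marks = {dialect, f"{dialect}_only"}
    # A dialect test also belongs to every group that contains its dialect
    marks |= {group for group, members in DIALECT_GROUPS.items() if dialect in members}
    return marks


def marks_for_core_test():
    """Core tests: selected by -m core, by each group, and by each plain dialect mark."""
    marks = {"core"} | set(DIALECT_GROUPS)
    marks |= set(DIALECT_GROUPS["all"])  # `-m duckdb` also runs core tests
    return marks


def is_selected(marks, requested_mark):
    """Mimic `pytest -m requested_mark` for a single mark name."""
    return requested_mark in marks
```

In this model a core test is picked up by -m duckdb but not by -m duckdb_only, and a sqlite test is picked up by -m all but not by -m default.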

These all work alongside all the other pytest options, so for instance to run the tests for training probability_two_random_records_match for only duckdb, ignoring warnings, with quiet output, and exiting on the first failure/error:

pytest -W ignore -q -x -m duckdb tests/test_estimate_prob_two_rr_match.py

Running tests against a specific version of Python

Testing Splink against a specific version of Python, especially newer versions not included in our GitHub Actions, is vital for identifying compatibility issues early and reviewing errors reported by users.

If you're a conda user, you can create an isolated environment according to the instructions in the development quickstart.

Another method is to use Docker 🐳.

A pre-built Dockerfile for running tests against Python version 3.9.10 can be found at scripts/run_tests.Dockerfile.

To run, simply use the following docker command from within a terminal and the root folder of a Splink clone:

docker build -t run_tests:testing -f scripts/run_tests.Dockerfile . && docker run --rm --name splink-test run_tests:testing

This will both build the image and run the test suite inside it.

Feel free to replace run_tests:testing with an image name and tag you're happy with.

Reusing the same image and tag will overwrite your existing image.

You can also overwrite the default CMD if you want a different set of pytest command-line options, for example

docker run --rm --name splink-test run_tests:testing pytest -W ignore -m spark tests/test_u_train.py

Running with a pre-existing Postgres database

If you have a pre-existing Postgres server you wish to use to run the tests against, you will need to specify environment variables for the credentials where they differ from default (in parentheses):

  • SPLINKTEST_PG_USER (splinkognito)
  • SPLINKTEST_PG_PASSWORD (splink123!)
  • SPLINKTEST_PG_HOST (localhost)
  • SPLINKTEST_PG_PORT (5432)
  • SPLINKTEST_PG_DB (splink_db) - tests will not actually run against this, but it is from a connection to this that the temporary test database + user will be created
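To preview which connection values would apply on your machine, the defaults above can be combined with any overrides from the environment. The sketch below is an illustration only: resolve_pg_settings is a hypothetical helper, and the test suite itself may read these variables differently.

```python
import os

# Defaults as listed above; each can be overridden by an environment variable
PG_DEFAULTS = {
    "SPLINKTEST_PG_USER": "splinkognito",
    "SPLINKTEST_PG_PASSWORD": "splink123!",
    "SPLINKTEST_PG_HOST": "localhost",
    "SPLINKTEST_PG_PORT": "5432",
    "SPLINKTEST_PG_DB": "splink_db",
}


def resolve_pg_settings(env=None):
    """Return the effective settings (hypothetical helper, not part of Splink)."""
    # Fall back to the real process environment when no mapping is supplied
    env = os.environ if env is None else env
    return {name: env.get(name, default) for name, default in PG_DEFAULTS.items()}


# Example: only host and port differ from the defaults
settings = resolve_pg_settings(
    {"SPLINKTEST_PG_HOST": "db.internal", "SPLINKTEST_PG_PORT": "5433"}
)
```

Calling resolve_pg_settings() with no argument reads from os.environ, so it reflects whatever you have exported in your shell.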

While care has been taken to ensure that tests are run using minimal permissions, and are cleaned up after, it is probably wise to run tests connected to a non-important database, in case anything goes wrong. In addition to the standard privileges for Splink usage, in order to run the tests you will need:

  • CREATE DATABASE to create a temporary testing database
  • CREATEROLE to create a temporary user role with limited privileges, which will be actually used for all the SQL execution in the tests

Tests in CI

Splink utilises GitHub Actions to run tests for each pull request. This consists of a few independent checks:

  • The full test suite is run separately against several different Python versions
  • The example notebooks are checked to ensure they run without error
  • The tutorial notebooks are checked to ensure they run without error

Writing tests

Core tests

Core tests are treated the same way as ordinary pytest tests. Any test is marked as core by default, and will only be excluded from being a core test if it is decorated using either:

  • mark_with_dialects_excluding
  • mark_with_dialects_including

from the test decorator file.

Backend-agnostic testing

The majority of tests should be written using the backend-agnostic testing framework. This just provides some small tools which allow tests to be written in a backend-independent way. This means the tests can then be run against all available SQL backends (or a subset, if some lack necessary features for the test).

As an example, let's consider a test that will run on all dialects, and then break down the various parts to see what each is doing.

from tests.decorator import mark_with_dialects_excluding

@mark_with_dialects_excluding()
def test_feature_that_works_for_all_backends(test_helpers, dialect, some_other_test_fixture):
    helper = test_helpers[dialect]

    df = helper.load_frame_from_csv("./tests/datasets/fake_1000_from_splink_demos.csv")

    settings_dict = {
        "link_type": "dedupe_only",
        "blocking_rules_to_generate_predictions": ["l.city = r.city", "l.surname = r.surname", "l.dob = r.dob"],
        "comparisons": [
            helper.cl.exact_match("city"),
            helper.cl.levenshtein_at_thresholds("first_name", [1, 2]),
            helper.cl.levenshtein_at_thresholds("surname"),
            {
                "output_column_name": "email",
                "comparison_description": "Email",
                "comparison_levels": [
                    helper.cll.null_level("email"),
                    helper.cll.exact_match_level("email"),
                    helper.cll.levenshtein_level("email", 2),
                    {
                        "sql_condition": "substr(email_l, 1, 1) = substr(email_r, 1, 1)",
                        "label_for_charts": "email first character matches",
                    },
                    helper.cll.else_level(),
                ]
            }
        ]
    }

    linker = helper.Linker(df, settings_dict, **helper.extra_linker_args())

    # and then some actual testing logic

Firstly you should import the decorator-factory mark_with_dialects_excluding, which will decorate each test function:

from tests.decorator import mark_with_dialects_excluding

Then we define the function, and pass parameters:

@mark_with_dialects_excluding()
def test_feature_that_works_for_all_backends(test_helpers, dialect, some_other_test_fixture):

The decorator @mark_with_dialects_excluding() will do two things:

  • marks the test it decorates with the appropriate custom pytest marks. This ensures that it will be run with tests for each dialect, excluding any that are passed as arguments; in this case it will be run for all dialects, as we have passed no arguments.
  • parameterises the test with a string parameter dialect, which will be used to configure the test for that dialect. The test will run for each value of dialect possible, excluding any passed to the decorator (none in this case).

You should aim to exclude as few dialects as possible - consider if you really need to exclude any. Dialects should only be excluded if the test doesn't make sense for them due to features they lack. The default choice should be the decorator with no arguments @mark_with_dialects_excluding(), meaning the test runs for all dialects.

@mark_with_dialects_excluding()
def test_feature_that_works_for_all_backends(test_helpers, dialect, some_other_test_fixture):

As well as the parameter dialect (which is provided by the decorator), we must also pass the helper-factory fixture test_helpers. We can additionally pass further fixtures if needed - in this case some_other_test_fixture. We could similarly provide an explicit parameterisation to the test, in which case we would also pass these parameters - see the pytest docs on parameterisation for more information.

    helper = test_helpers[dialect]

The fixture test_helpers is simply a dictionary of the specific-dialect test helpers - here we pick the appropriate one for our test.

Each helper has the same set of methods and properties, which encapsulate all of the dialect-dependencies. You can find the full set of properties and methods by examining the source for the base class TestHelper.
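The pattern of one shared interface with per-dialect implementations can be sketched as follows. This is a simplified illustration, not the real TestHelper: the classes ending in Sketch, the tuple return values, and the example_option argument are hypothetical placeholders.

```python
from abc import ABC, abstractmethod


class HelperSketch(ABC):
    """Hypothetical, much-reduced stand-in for the TestHelper base class."""

    @abstractmethod
    def load_frame_from_csv(self, path):
        """Load a CSV into the backend and return something a linker accepts."""

    def extra_linker_args(self):
        # Most dialects need no extra arguments
        return {}


class DuckDBHelperSketch(HelperSketch):
    def load_frame_from_csv(self, path):
        return ("duckdb-frame", path)  # placeholder for a real table object


class SparkHelperSketch(HelperSketch):
    def load_frame_from_csv(self, path):
        return ("spark-frame", path)  # placeholder for a real table object

    def extra_linker_args(self):
        # Some dialects require extra linker arguments; this one is made up
        return {"example_option": True}


# The fixture is simply a dictionary keyed by dialect name
test_helpers = {"duckdb": DuckDBHelperSketch(), "spark": SparkHelperSketch()}


def load_for(dialect, path):
    """The test body is identical whichever dialect is chosen."""
    helper = test_helpers[dialect]
    return helper.load_frame_from_csv(path), helper.extra_linker_args()
```

Because every helper exposes the same methods, the test body never needs to branch on the dialect; all differences stay inside the helper classes.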

    df = helper.load_frame_from_csv("./tests/datasets/fake_1000_from_splink_demos.csv")

Here we are now actually using a method of the test helper - in this case we are loading a table from a csv to the database and returning it in a form suitable for passing to a Splink linker.

    "comparisons": [
        helper.cl.exact_match("city"),
        helper.cl.levenshtein_at_thresholds("first_name", [1, 2]),
        helper.cl.levenshtein_at_thresholds("surname"),
        {
            "output_column_name": "email",

We reference the dialect-specific comparison library as helper.cl,

    {
        "output_column_name": "email",
        "comparison_description": "Email",
        "comparison_levels": [
            helper.cll.null_level("email"),
            helper.cll.exact_match_level("email"),
            helper.cll.levenshtein_level("email", 2),
            {
                "sql_condition": "substr(email_l, 1, 1) = substr(email_r, 1, 1)",
                "label_for_charts": "email first character matches",
            },
            helper.cll.else_level(),
        ]
    }

and the dialect-specific comparison level library as helper.cll.

    {
        "sql_condition": "substr(email_l, 1, 1) = substr(email_r, 1, 1)",
        "label_for_charts": "email first character matches",
    },

We can include raw SQL statements, but we must ensure they are valid for all dialects we are considering, so we should avoid any unusual functions that are not likely to be universal.

    linker = helper.Linker(df, settings_dict, **helper.extra_linker_args())

Finally we instantiate the linker, passing any default set of extra arguments provided by the helper, which some dialects require.

From this point onwards we will be working with the instantiated linker, and so will not need to refer to helper any more - the rest of the test can be written as usual.

Excluding some backends

Now let's take a brief look at a similar example - only this time we are going to exclude the sqlite backend, as the test relies on features not directly available for that backend. In this example that will be the SQL function split_part, which does not exist in the sqlite dialect.

Warning

Tests should be made available to the widest range of backends possible. Only exclude backends if features not shared by all backends are crucial to the test-logic - otherwise consider rewriting things so that all backends are covered.

from tests.decorator import mark_with_dialects_excluding

@mark_with_dialects_excluding("sqlite")
def test_feature_that_doesnt_work_with_sqlite(test_helpers, dialect, some_other_test_fixture):
    helper = test_helpers[dialect]

    df = helper.load_frame_from_csv("./tests/datasets/fake_1000_from_splink_demos.csv")

    settings_dict = {
        "link_type": "dedupe_only",
        "blocking_rules_to_generate_predictions": ["l.city = r.city", "l.surname = r.surname", "l.dob = r.dob"],
        "comparisons": [
            helper.cl.exact_match("city"),
            helper.cl.levenshtein_at_thresholds("first_name", [1, 2]),
            helper.cl.levenshtein_at_thresholds("surname"),
            {
                "output_column_name": "email",
                "comparison_description": "Email",
                "comparison_levels": [
                    helper.cll.null_level("email"),
                    helper.cll.exact_match_level("email"),
                    helper.cll.levenshtein_level("email", 2),
                    {
                        "sql_condition": "split_part(email_l, '@', 1) = split_part(email_r, '@', 1)",
                        "label_for_charts": "email local-part matches",
                    },
                    helper.cll.else_level(),
                ]
            }
        ]
    }

    linker = helper.Linker(df, settings_dict, **helper.extra_linker_args())

    # and then some actual testing logic

The key difference is the argument we pass to the decorator:

@mark_with_dialects_excluding("sqlite")
def test_feature_that_doesnt_work_with_sqlite(test_helpers, dialect, some_other_test_fixture):

As above this marks the test it decorates with the appropriate custom pytest marks, but in this case it ensures that it will be run with tests for each dialect excluding sqlite. Again dialect is passed as a parameter, and the test will run in turn for each value of dialect except for sqlite.

    {
        "sql_condition": "split_part(email_l, '@', 1) = split_part(email_r, '@', 1)",
        "label_for_charts": "email local-part matches",
    },

This line is why we cannot allow sqlite for this test - we make use of the function split_part which is not available in the sqlite dialect, hence its exclusion. We suppose that this particular comparison level is crucial for the test to make sense, otherwise we would rewrite this line to make it run universally. When you come to run the tests, this test will not run on the sqlite backend.
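When you are unsure whether a raw SQL condition is portable, a quick check against an in-memory SQLite database (available in Python's standard library) can help. The sketch below is not part of the Splink test suite; the helper name runs_on_sqlite and the one-row table are made up for illustration.

```python
import sqlite3


def runs_on_sqlite(sql_condition):
    """Return True if SQLite can evaluate the condition (illustrative helper)."""
    conn = sqlite3.connect(":memory:")
    try:
        # A tiny table with the two columns the conditions refer to
        conn.execute("CREATE TABLE t (email_l TEXT, email_r TEXT)")
        conn.execute("INSERT INTO t VALUES ('a@x.com', 'a@y.com')")
        conn.execute(f"SELECT {sql_condition} FROM t").fetchall()
        return True
    except sqlite3.OperationalError:
        # Raised, for example, when the SQL uses a function SQLite lacks
        return False
    finally:
        conn.close()


substr_ok = runs_on_sqlite("substr(email_l, 1, 1) = substr(email_r, 1, 1)")
split_part_ok = runs_on_sqlite(
    "split_part(email_l, '@', 1) = split_part(email_r, '@', 1)"
)
```

Here substr_ok is True while split_part_ok is False, which is the situation that justifies excluding sqlite from this test. A check like this covers SQLite only; other backends would need their own connection.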

If you need to exclude multiple dialects this is also possible - just pass each as an argument. For example, to decorate a test that is not supported on spark or sqlite, use the decorator @mark_with_dialects_excluding("sqlite", "spark").

Backend-specific tests

If you intend to write a test for a specific backend, first consider whether it is definitely specific to that backend - if not then a backend-agnostic test would be preferable, as then your test will be run against many backends. If you really do need to test features peculiar to one backend, then you can write it simply as you would an ordinary pytest test. The only difference is that you should decorate it with @mark_with_dialects_including (from tests/decorator.py) - for example:

from tests.decorator import mark_with_dialects_including


@mark_with_dialects_including("duckdb")
def test_some_specific_duckdb_feature():
    ...


@mark_with_dialects_including("spark")
def test_some_specific_spark_feature():
    ...


@mark_with_dialects_including("sqlite")
def test_some_specific_sqlite_feature():
    ...

This ensures that the test gets marked appropriately for running when the tests for the named backend should be run, and excludes it from the set of core tests.

Note that unlike the exclusive mark_with_dialects_excluding, this decorator will not parameterise the test with the dialect argument. This is because usage of the inclusive form is largely designed for single-dialect tests. If you wish to override this behaviour and parameterise the test you can use the argument pass_dialect, for example @mark_with_dialects_including("spark", "sqlite", pass_dialect=True), in which case you would need to write the test in a backend-independent manner.