Methods in Linker.evaluation¶

Evaluate the performance of a Splink model. Accessed via linker.evaluation

prediction_errors_from_labels_table(labels_splinkdataframe_or_table_name, include_false_positives=True, include_false_negatives=True, threshold_match_probability=0.5) ¶

Find false positives and false negatives by comparing the clerical_match_score in the labels table with the Splink predicted match probability.

The table of labels should be in the following format, and should be registered as a table with your database using

labels_table = linker.table_management.register_labels_table(my_df)

source_dataset_l  unique_id_l  source_dataset_r  unique_id_r  clerical_match_score
df_1              1            df_2              2            0.99
df_1              1            df_2              3            0.2
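
For example, a labels table in this format can be built and registered as follows (a minimal sketch using pandas; the data values are illustrative):

import pandas as pd

# Clerical labels in the format shown above (illustrative values)
my_df = pd.DataFrame({
    "source_dataset_l": ["df_1", "df_1"],
    "unique_id_l": [1, 1],
    "source_dataset_r": ["df_2", "df_2"],
    "unique_id_r": [2, 3],
    "clerical_match_score": [0.99, 0.2],
})

# Register the labels with the database the linker is connected to
labels_table = linker.table_management.register_labels_table(my_df)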

Parameters:

• labels_splinkdataframe_or_table_name (str | SplinkDataFrame, required): Name of the table containing labels in the database (or the corresponding SplinkDataFrame).
• include_false_positives (bool): Whether to include false positives in the output. Defaults to True.
• include_false_negatives (bool): Whether to include false negatives in the output. Defaults to True.
• threshold_match_probability (float): Threshold probability above which a prediction is considered to be a match. Defaults to 0.5.

Examples:

labels_table = linker.table_management.register_labels_table(df_labels)

linker.evaluation.prediction_errors_from_labels_table(
    labels_table, include_false_negatives=True, include_false_positives=False
).as_pandas_dataframe()

Returns:

SplinkDataFrame: Table containing false positives and false negatives.

accuracy_analysis_from_labels_column(labels_column_name, *, threshold_match_probability=0.5, match_weight_round_to_nearest=0.1, output_type='threshold_selection', add_metrics=[], positives_not_captured_by_blocking_rules_scored_as_zero=True) ¶

Generate an accuracy chart or table from ground truth data, where the ground truth is in a column in the input dataset called labels_column_name

Parameters:

• labels_column_name (str, required): Column name containing labels in the input table.
• threshold_match_probability (float): Where the clerical_match_score provided by the user is a probability rather than binary, this value is used as the threshold to classify clerical_match_scores as binary matches or non-matches. Defaults to 0.5.
• match_weight_round_to_nearest (float): When provided, thresholds are rounded. When large numbers of labels are provided, this is sometimes necessary to reduce the size of the ROC table, and therefore the number of points plotted on the chart. Defaults to 0.1.
• add_metrics (list[str]): Precision and recall metrics are always included. Where provided, add_metrics specifies additional metrics to show. Defaults to []. The options are as follows (the F-scores are defined in the sketch after this list):
  • "specificity": specificity, selectivity, true negative rate (TNR)
  • "npv": negative predictive value (NPV)
  • "accuracy": overall accuracy (TP+TN)/(P+N)
  • "f1"/"f2"/"f0_5": F-scores for β=1 (balanced), β=2 (emphasis on recall) and β=0.5 (emphasis on precision)
  • "p4": an extended F1 score with specificity and NPV included
  • "phi": φ coefficient or Matthews correlation coefficient (MCC)
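
For reference, the F-scores above combine precision and recall as shown in this small illustrative helper (not part of the Splink API):

def f_beta(precision, recall, beta):
    # beta > 1 weights recall more heavily; beta < 1 weights precision more heavily.
    # "f1" uses beta=1, "f2" uses beta=2, "f0_5" uses beta=0.5.
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)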

Examples:

linker.evaluation.accuracy_analysis_from_labels_column("ground_truth", add_metrics=["f1"])

Returns:

chart (Union[ChartReturnType, SplinkDataFrame]): An altair chart, or a SplinkDataFrame, depending on the output_type requested.

accuracy_analysis_from_labels_table(labels_splinkdataframe_or_table_name, *, threshold_match_probability=0.5, match_weight_round_to_nearest=0.1, output_type='threshold_selection', add_metrics=[]) ¶

Generate an accuracy chart or table from labelled (ground truth) data.

The table of labels should be in the following format, and should be registered as a table with your database using labels_table = linker.table_management.register_labels_table(my_df)

source_dataset_l  unique_id_l  source_dataset_r  unique_id_r  clerical_match_score
df_1              1            df_2              2            0.99
df_1              1            df_2              3            0.2

Note that source_dataset and unique_id should correspond to the values specified in the settings dict, and the input_table_aliases passed to the linker object.

For dedupe_only links, the source_dataset columns can be omitted.

Parameters:

• labels_splinkdataframe_or_table_name (str | SplinkDataFrame, required): Name of the table containing labels in the database (or the corresponding SplinkDataFrame).
• threshold_match_probability (float): Where the clerical_match_score provided by the user is a probability rather than binary, this value is used as the threshold to classify clerical_match_scores as binary matches or non-matches. Defaults to 0.5.
• match_weight_round_to_nearest (float): When provided, thresholds are rounded. When large numbers of labels are provided, this is sometimes necessary to reduce the size of the ROC table, and therefore the number of points plotted on the chart. Defaults to 0.1.
• add_metrics (list[str]): Precision and recall metrics are always included. Where provided, add_metrics specifies additional metrics to show. Defaults to []. The options are:
  • "specificity": specificity, selectivity, true negative rate (TNR)
  • "npv": negative predictive value (NPV)
  • "accuracy": overall accuracy (TP+TN)/(P+N)
  • "f1"/"f2"/"f0_5": F-scores for β=1 (balanced), β=2 (emphasis on recall) and β=0.5 (emphasis on precision)
  • "p4": an extended F1 score with specificity and NPV included
  • "phi": φ coefficient or Matthews correlation coefficient (MCC)

Returns:

Union[ChartReturnType, SplinkDataFrame]: An altair chart, or a SplinkDataFrame, depending on the output_type requested.

Examples:

linker.evaluation.accuracy_analysis_from_labels_table("ground_truth", add_metrics=["f1"])
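
Or, registering a pandas DataFrame of labels first (df_labels is assumed to already exist in the format shown above):

labels_table = linker.table_management.register_labels_table(df_labels)
linker.evaluation.accuracy_analysis_from_labels_table(labels_table, add_metrics=["f1", "phi"])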

prediction_errors_from_labels_column(label_colname, include_false_positives=True, include_false_negatives=True, threshold_match_probability=0.5) ¶

Generate a dataframe containing false positives and false negatives based on the comparison between the Splink match probability and the labels column. A labels column is a column in the input dataset containing the 'ground truth' cluster to which the record belongs.
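
For illustration (hypothetical data and column name), records that share a value in the labels column are known to refer to the same entity:

import pandas as pd

# Records 1 and 2 belong to the same ground-truth cluster; record 3 does not
df = pd.DataFrame({
    "unique_id": [1, 2, 3],
    "first_name": ["Ann", "Anne", "Bob"],
    "ground_truth_cluster": ["c1", "c1", "c2"],
})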

Parameters:

• label_colname (str, required): Name of the labels column in the input data.
• include_false_positives (bool): Whether to include false positives in the output. Defaults to True.
• include_false_negatives (bool): Whether to include false negatives in the output. Defaults to True.
• threshold_match_probability (float): Threshold above which a score is considered to be a match. Defaults to 0.5.

Returns:

SplinkDataFrame: Table containing false positives and false negatives.

Examples:

linker.evaluation.prediction_errors_from_labels_column(
    "ground_truth_cluster",
    include_false_negatives=True,
    include_false_positives=False
).as_pandas_dataframe()

unlinkables_chart(x_col='match_weight', name_of_data_in_title=None, as_dict=False) ¶

Generate an interactive chart displaying the proportion of records that are "unlinkable" for a given Splink score threshold and model parameters.

Unlinkable records are those that, even when compared with themselves, do not contain enough information to confirm a match.

Parameters:

• x_col (str): Column to use for the x-axis. Defaults to "match_weight".
• name_of_data_in_title (str): Name of the source dataset to use for the title of the output chart. Defaults to None.
• as_dict (bool): If True, return a dict version of the chart. Defaults to False.

Returns:

ChartReturnType: An altair chart.

Examples:

After estimating the parameters of the model, run:

linker.evaluation.unlinkables_chart()
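
To label the chart with the name of the input data (the dataset name shown here is illustrative):

linker.evaluation.unlinkables_chart(name_of_data_in_title="df_1")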

labelling_tool_for_specific_record(unique_id, source_dataset=None, out_path='labelling_tool.html', overwrite=False, match_weight_threshold=-4, view_in_jupyter=False, show_splink_predictions_in_interface=True) ¶

Create a standalone, offline labelling dashboard for a specific record, as identified by its unique_id.

Parameters:

• unique_id (str, required): The unique id of the record for which to create the labelling tool.
• source_dataset (str): If there are multiple datasets, to identify the record you must also specify the source_dataset. Defaults to None.
• out_path (str): The output path for the labelling tool. Defaults to "labelling_tool.html".
• overwrite (bool): If True, overwrite files at the output path if they exist. Defaults to False.
• match_weight_threshold (int): Include possible matches in the output which score above this threshold. Defaults to -4.
• view_in_jupyter (bool): If you're viewing in the Jupyter html viewer, set this to True to extract your labels. Defaults to False.
• show_splink_predictions_in_interface (bool): Whether to show information about the Splink model's predictions that could potentially bias the decision of the clerical labeller. Defaults to True.
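
Examples:

A minimal usage sketch (the unique_id value shown is hypothetical):

linker.evaluation.labelling_tool_for_specific_record(
    unique_id="abc-123",
    out_path="labelling_tool.html",
    overwrite=True,
)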