Methods in Linker.evaluation¶

Evaluate the performance of a Splink model. Accessed via linker.evaluation

prediction_errors_from_labels_table(labels_splinkdataframe_or_table_name, include_false_positives=True, include_false_negatives=True, threshold_match_probability=0.5) ¶

Find false positives and false negatives by comparing the clerical_match_score in the labels table with the Splink predicted match probability.

The table of labels should be in the following format, and should be registered as a table with your database using

labels_table = linker.table_management.register_labels_table(my_df)

source_dataset_l  unique_id_l  source_dataset_r  unique_id_r  clerical_match_score
df_1              1            df_2              2            0.99
df_1              1            df_2              3            0.2
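
For example, a labels table in this format can be built and registered as follows (a minimal sketch using pandas; the data values are illustrative):

import pandas as pd

# Clerical labels in the format shown above (illustrative values)
my_df = pd.DataFrame({
    "source_dataset_l": ["df_1", "df_1"],
    "unique_id_l": [1, 1],
    "source_dataset_r": ["df_2", "df_2"],
    "unique_id_r": [2, 3],
    "clerical_match_score": [0.99, 0.2],
})

# Register the labels with the database the linker is connected to
labels_table = linker.table_management.register_labels_table(my_df)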

Parameters:

• labels_splinkdataframe_or_table_name (str | SplinkDataFrame, required): Name of the table containing labels in the database (or the corresponding SplinkDataFrame).
• include_false_positives (bool): Whether to include false positives in the output. Defaults to True.
• include_false_negatives (bool): Whether to include false negatives in the output. Defaults to True.
• threshold_match_probability (float): Threshold probability above which a prediction is considered to be a match. Defaults to 0.5.

Examples:

labels_table = linker.table_management.register_labels_table(df_labels)

linker.evaluation.prediction_errors_from_labels_table(
    labels_table, include_false_negatives=True, include_false_positives=False
).as_pandas_dataframe()

Returns:

SplinkDataFrame: Table containing false positives and false negatives.

accuracy_analysis_from_labels_column(labels_column_name, *, threshold_match_probability=0.5, match_weight_round_to_nearest=0.1, output_type='threshold_selection', add_metrics=[], positives_not_captured_by_blocking_rules_scored_as_zero=True) ¶

Generate an accuracy chart or table from ground truth data, where the ground truth is in a column in the input dataset called labels_column_name

Parameters:

• labels_column_name (str, required): Column name containing labels in the input table.
• threshold_match_probability (float): Where the clerical_match_score provided by the user is a probability rather than binary, this value is used as the threshold to classify clerical_match_scores as binary matches or non-matches. Defaults to 0.5.
• match_weight_round_to_nearest (float): When provided, thresholds are rounded. When large numbers of labels are provided, this is sometimes necessary to reduce the size of the ROC table, and therefore the number of points plotted on the chart. Defaults to 0.1.
• add_metrics (list[str]): Precision and recall metrics are always included. Where provided, add_metrics specifies additional metrics to show. Defaults to []. The options are as follows (the F-scores are defined in the sketch after this list):
  • "specificity": specificity, selectivity, true negative rate (TNR)
  • "npv": negative predictive value (NPV)
  • "accuracy": overall accuracy (TP+TN)/(P+N)
  • "f1"/"f2"/"f0_5": F-scores for β=1 (balanced), β=2 (emphasis on recall) and β=0.5 (emphasis on precision)
  • "p4": an extended F1 score with specificity and NPV included
  • "phi": φ coefficient or Matthews correlation coefficient (MCC)
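
For reference, the F-scores above combine precision and recall as shown in this small illustrative helper (not part of the Splink API):

def f_beta(precision, recall, beta):
    # beta > 1 weights recall more heavily; beta < 1 weights precision more heavily.
    # "f1" uses beta=1, "f2" uses beta=2, "f0_5" uses beta=0.5.
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)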

Examples:

linker.evaluation.accuracy_analysis_from_labels_column("ground_truth", add_metrics=["f1"])

Returns:

chart (Union[ChartReturnType, SplinkDataFrame]): An altair chart, or a SplinkDataFrame, depending on the output_type requested.

accuracy_analysis_from_labels_table(labels_splinkdataframe_or_table_name, *, threshold_match_probability=0.5, match_weight_round_to_nearest=0.1, output_type='threshold_selection', add_metrics=[]) ¶

Generate an accuracy chart or table from labelled (ground truth) data.

The table of labels should be in the following format, and should be registered as a table with your database using labels_table = linker.table_management.register_labels_table(my_df)

source_dataset_l  unique_id_l  source_dataset_r  unique_id_r  clerical_match_score
df_1              1            df_2              2            0.99
df_1              1            df_2              3            0.2

Note that source_dataset and unique_id should correspond to the values specified in the settings dict, and the input_table_aliases passed to the linker object.

For dedupe_only links, the source_dataset columns can be omitted.

Parameters:

• labels_splinkdataframe_or_table_name (str | SplinkDataFrame, required): Name of the table containing labels in the database (or the corresponding SplinkDataFrame).
• threshold_match_probability (float): Where the clerical_match_score provided by the user is a probability rather than binary, this value is used as the threshold to classify clerical_match_scores as binary matches or non-matches. Defaults to 0.5.
• match_weight_round_to_nearest (float): When provided, thresholds are rounded. When large numbers of labels are provided, this is sometimes necessary to reduce the size of the ROC table, and therefore the number of points plotted on the chart. Defaults to 0.1.
• add_metrics (list[str]): Precision and recall metrics are always included. Where provided, add_metrics specifies additional metrics to show. Defaults to []. The options are:
  • "specificity": specificity, selectivity, true negative rate (TNR)
  • "npv": negative predictive value (NPV)
  • "accuracy": overall accuracy (TP+TN)/(P+N)
  • "f1"/"f2"/"f0_5": F-scores for β=1 (balanced), β=2 (emphasis on recall) and β=0.5 (emphasis on precision)
  • "p4": an extended F1 score with specificity and NPV included
  • "phi": φ coefficient or Matthews correlation coefficient (MCC)

Returns:

Union[ChartReturnType, SplinkDataFrame]: An altair chart, or a SplinkDataFrame, depending on the output_type requested.

Examples:

linker.evaluation.accuracy_analysis_from_labels_table("ground_truth", add_metrics=["f1"])
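
Or, registering a pandas DataFrame of labels first (df_labels is assumed to already exist in the format shown above):

labels_table = linker.table_management.register_labels_table(df_labels)
linker.evaluation.accuracy_analysis_from_labels_table(labels_table, add_metrics=["f1", "phi"])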

prediction_errors_from_labels_column(label_colname, include_false_positives=True, include_false_negatives=True, threshold_match_probability=0.5) ¶

Generate a dataframe containing false positives and false negatives based on the comparison between the Splink match probability and the labels column. A labels column is a column in the input dataset containing the 'ground truth' cluster to which the record belongs.
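
For illustration (hypothetical data and column name), records that share a value in the labels column are known to refer to the same entity:

import pandas as pd

# Records 1 and 2 belong to the same ground-truth cluster; record 3 does not
df = pd.DataFrame({
    "unique_id": [1, 2, 3],
    "first_name": ["Ann", "Anne", "Bob"],
    "ground_truth_cluster": ["c1", "c1", "c2"],
})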

Parameters:

• label_colname (str, required): Name of the labels column in the input data.
• include_false_positives (bool): Whether to include false positives in the output. Defaults to True.
• include_false_negatives (bool): Whether to include false negatives in the output. Defaults to True.
• threshold_match_probability (float): Threshold above which a score is considered to be a match. Defaults to 0.5.

Returns:

SplinkDataFrame: Table containing false positives and false negatives.

Examples:

linker.evaluation.prediction_errors_from_labels_column(
    "ground_truth_cluster",
    include_false_negatives=True,
    include_false_positives=False
).as_pandas_dataframe()

unlinkables_chart(x_col='match_weight', name_of_data_in_title=None, as_dict=False) ¶

Generate an interactive chart displaying the proportion of records that are "unlinkable" for a given Splink score threshold and model parameters.

Unlinkable records are those that, even when compared with themselves, do not contain enough information to confirm a match.

Parameters:

• x_col (str): Column to use for the x-axis. Defaults to "match_weight".
• name_of_data_in_title (str): Name of the source dataset to use for the title of the output chart. Defaults to None.
• as_dict (bool): If True, return a dict version of the chart. Defaults to False.

Returns:

ChartReturnType: An altair chart.

Examples:

After estimating the parameters of the model, run:

linker.evaluation.unlinkables_chart()
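
To label the chart with the name of the input data (the dataset name shown here is illustrative):

linker.evaluation.unlinkables_chart(name_of_data_in_title="df_1")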

labelling_tool_for_specific_record(unique_id, source_dataset=None, out_path='labelling_tool.html', overwrite=False, match_weight_threshold=-4, view_in_jupyter=False, show_splink_predictions_in_interface=True) ¶

Create a standalone, offline labelling dashboard for a specific record, as identified by its unique_id.

Parameters:

• unique_id (str, required): The unique id of the record for which to create the labelling tool.
• source_dataset (str): If there are multiple datasets, to identify the record you must also specify the source_dataset. Defaults to None.
• out_path (str): The output path for the labelling tool. Defaults to "labelling_tool.html".
• overwrite (bool): If True, overwrite files at the output path if they exist. Defaults to False.
• match_weight_threshold (int): Include possible matches in the output which score above this threshold. Defaults to -4.
• view_in_jupyter (bool): If you're viewing in the Jupyter html viewer, set this to True to extract your labels. Defaults to False.
• show_splink_predictions_in_interface (bool): Whether to show information about the Splink model's predictions that could potentially bias the decision of the clerical labeller. Defaults to True.
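
Examples:

A minimal usage sketch (the unique_id value shown is hypothetical):

linker.evaluation.labelling_tool_for_specific_record(
    unique_id="abc-123",
    out_path="labelling_tool.html",
    overwrite=True,
)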