
Charts in Splink

Interactive charts are a key tool when linking data with Splink. To see all of the charts available, check out the Splink Charts Gallery.

Charts in Splink are built with Altair.

For a given chart, there is usually:

The Vega-Lite Editor

By far the best feature of Vega-Lite is the online editor, where the JSON specification and the rendered chart are shown side by side. The chart updates in real time as you edit, and the editor helps you navigate the API.

Vega-Lite editor
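To have something to experiment with in the editor, a minimal heatmap specification can be built as a Python dictionary and printed as JSON. This is only a sketch: the field names follow the chart discussed on this page, but the rows and scores are made-up illustrative values, not real comparator output.

```python
import json

# Illustrative rows only: the strings and scores below are invented
# for this example and are not real comparator results.
rows = [
    {"strings_to_compare": "Richard, Rickard", "comparator": "jaro_similarity", "score": 0.9},
    {"strings_to_compare": "Richard, Rickard", "comparator": "jaccard_similarity", "score": 0.7},
    {"strings_to_compare": "Richard, Bob", "comparator": "jaro_similarity", "score": 0.1},
    {"strings_to_compare": "Richard, Bob", "comparator": "jaccard_similarity", "score": 0.0},
]

# A single-view Vega-Lite spec: one rectangle per (comparator, string pair),
# coloured by score.
spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v4.17.0.json",
    "data": {"values": rows},
    "mark": "rect",
    "encoding": {
        "x": {"field": "comparator", "type": "ordinal"},
        "y": {"field": "strings_to_compare", "type": "ordinal"},
        "color": {"field": "score", "type": "quantitative"},
    },
}

# The printed JSON can be pasted into the left-hand pane of the editor.
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

Because the data is embedded under `data.values`, the specification is self-contained and needs no external file when pasted into the editor.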

Editing existing charts

If you take any Altair chart in HTML format, you should be able to make changes pretty easily with the Vega-Lite Editor.

For example, consider the comparator_score_chart from the comparison_helpers library:

Before and after: [images of the chart before and after the changes listed below]

Desired changes

  • Titles (shared title)
  • Axis titles
  • Shared y-axis
  • Colour scales!! 🤮 (see the Vega colour schemes docs)
      • red-green is an accessibility no-no
      • shared colour scheme for different metrics
      • unpleasant and unclear to look at
  • legends not necessary (especially when using text labels)
  • Text size encoding (larger text for similar strings)
  • Remove "_similarity" and "_distance" from column labels
  • Fixed column width (rather than chart width)
  • Row highlighting (on click/hover)
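Some of the changes listed above (a named colour scheme, no legend, fixed cell size) can also be applied to a spec programmatically. The following is a minimal sketch, not how Splink itself does it: `restyle_heatmap` is a hypothetical helper, and the input is a cut-down version of one heatmap layer. The scheme name and step sizes are taken from the diff further down this page.

```python
import copy


def restyle_heatmap(layer, scheme, reverse=False):
    """Return a restyled copy of a Vega-Lite heatmap layer (hypothetical helper)."""
    new = copy.deepcopy(layer)  # leave the caller's spec untouched
    color = new["encoding"]["color"]
    scale = color.setdefault("scale", {})
    scale.pop("range", None)    # drop the hand-picked red/green range
    scale["scheme"] = scheme    # use a named Vega colour scheme instead
    if reverse:
        scale["reverse"] = True
    color["legend"] = None      # serialised as JSON null, which hides the legend
    # Size each cell by a fixed step rather than fixing the whole chart size.
    new["width"] = {"step": 40}
    new["height"] = {"step": 30}
    return new


# Cut-down version of the original similarity heatmap layer.
old_layer = {
    "mark": "rect",
    "encoding": {
        "color": {
            "field": "score",
            "scale": {"domain": [0, 1], "range": ["red", "green"]},
            "type": "quantitative",
        },
        "x": {"field": "comparator", "type": "ordinal"},
        "y": {"field": "strings_to_compare", "type": "ordinal"},
    },
    "height": 300,
    "width": 300,
}

new_layer = restyle_heatmap(old_layer, "greenblue")
print(new_layer["encoding"]["color"])
```

The helper keeps any existing `domain` and only swaps the colour `range` for a scheme. A distance metric, where low values are "good", would pass `reverse=True`.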

The old spec can be pasted into the Vega-Lite editor and edited as shown in the video below:

Check out the final, improved version of the chart specification.
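To review exactly what changed between two versions of a specification, one option is a unified diff of the pretty-printed JSON. The sketch below uses Python's standard `difflib`. The `spec_diff` helper and the two tiny specs are illustrative assumptions, and this is not necessarily how the diff on this page was produced.

```python
import difflib
import json


def spec_diff(before, after):
    """Unified diff of two Vega-Lite specs given as dicts (hypothetical helper)."""
    # Pretty-print with a fixed indent so the diff is line-by-line readable.
    before_lines = json.dumps(before, indent=2).splitlines()
    after_lines = json.dumps(after, indent=2).splitlines()
    diff = difflib.unified_diff(
        before_lines,
        after_lines,
        fromfile="before.json",
        tofile="after.json",
        lineterm="",
    )
    return "\n".join(diff)


# Tiny stand-ins for the old and new chart specifications.
before = {"title": "Heatmap of Similarity Scores", "width": 300}
after = {"title": "Similarity", "width": {"step": 40}}

diff_text = spec_diff(before, after)
print(diff_text)
```

Key order is deliberately left as-is so the diff follows the layout of the spec. Passing `sort_keys=True` to `json.dumps` would instead make the diff insensitive to key reordering.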

Before-After diff
@@ -1,9 +1,8 @@
{
-  "config": {
-    "view": {
-      "continuousWidth": 400,
-      "continuousHeight": 300
-    }
+  "title": {
+    "text": "Heatmaps of string comparison metrics",
+    "anchor": "middle",
+    "fontSize": 16
  },
  "hconcat": [
    {
@@ -18,25 +17,32 @@
                  0,
                  1
                ],
-                "range": [
-                  "red",
-                  "green"
-                ]
+                "scheme": "greenblue"
              },
-              "type": "quantitative"
+              "type": "quantitative",
+              "legend": null
            },
            "x": {
              "field": "comparator",
-              "type": "ordinal"
+              "type": "ordinal",
+              "title": null
            },
            "y": {
              "field": "strings_to_compare",
-              "type": "ordinal"
+              "type": "ordinal",
+              "title": "String comparison",
+              "axis": {
+                "titleFontSize": 14
+              }
            }
          },
-          "height": 300,
-          "title": "Heatmap of Similarity Scores",
-          "width": 300
+          "title": "Similarity",
+          "width": {
+            "step": 40
+          },
+          "height": {
+            "step": 30
+          }
        },
        {
          "mark": {
@@ -44,6 +50,16 @@
            "baseline": "middle"
          },
          "encoding": {
+            "size": {
+              "field": "score",
+              "scale": {
+                "range": [
+                  8,
+                  14
+                ]
+              },
+              "legend": null
+            },
            "text": {
              "field": "score",
              "format": ".2f",
@@ -51,7 +67,10 @@
            },
            "x": {
              "field": "comparator",
-              "type": "ordinal"
+              "type": "ordinal",
+              "axis": {
+                "labelFontSize": 12
+              }
            },
            "y": {
              "field": "strings_to_compare",
@@ -72,29 +91,33 @@
            "color": {
              "field": "score",
              "scale": {
-                "domain": [
-                  0,
-                  5
-                ],
-                "range": [
-                  "green",
-                  "red"
-                ]
+                "scheme": "yelloworangered",
+                "reverse": true
              },
-              "type": "quantitative"
+              "type": "quantitative",
+              "legend": null
            },
            "x": {
              "field": "comparator",
-              "type": "ordinal"
+              "type": "ordinal",
+              "title": null,
+              "axis": {
+                "labelFontSize": 12
+              }
            },
            "y": {
              "field": "strings_to_compare",
-              "type": "ordinal"
+              "type": "ordinal",
+              "axis": null
            }
          },
-          "height": 300,
-          "title": "Heatmap of Distance Scores",
-          "width": 200
+          "title": "Distance",
+          "width": {
+            "step": 40
+          },
+          "height": {
+            "step": 30
+          }
        },
        {
          "mark": {
@@ -102,6 +125,17 @@
            "baseline": "middle"
          },
          "encoding": {
+            "size": {
+              "field": "score",
+              "scale": {
+                "range": [
+                  8,
+                  14
+                ],
+                "reverse": true
+              },
+              "legend": null
+            },
            "text": {
              "field": "score",
              "type": "quantitative"
@@ -124,7 +158,9 @@
  ],
  "resolve": {
    "scale": {
-      "color": "independent"
+      "color": "independent",
+      "y": "shared",
+      "size": "independent"
    }
  },
  "$schema": "https://vega.github.io/schema/vega-lite/v4.17.0.json",