<?xml version="1.0" encoding="UTF-8" ?> <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/"> <channel> <title>Splink</title><link>https://moj-analytical-services.github.io/splink/</link><atom:link href="https://moj-analytical-services.github.io/splink/feed_rss_created.xml" rel="self" type="application/rss+xml" /> <docs>https://github.com/moj-analytical-services/splink</docs><language>en</language> <pubDate>Fri, 13 Mar 2026 10:51:41 -0000</pubDate> <lastBuildDate>Fri, 13 Mar 2026 10:51:41 -0000</lastBuildDate> <ttl>1440</ttl> <generator>MkDocs RSS plugin - v1.17.9</generator> <image> <url>None</url> <title>Splink</title> <link>https://moj-analytical-services.github.io/splink/</link> </image> <item> <title>Running Splink in Production at the Ministry of Justice</title> <author>Tom Hepworth</author> <author>Sam Lindsay</author> <description>&lt;h1&gt;Running Splink in Production at the Ministry of Justice&lt;/h1&gt; &lt;p&gt;We have published plenty on record linkage theory, Splink&#39;s capabilities, and how to build a model. What we have not covered in depth is the engineering side of running linkage as a repeatable data product at scale. Splink gives you the statistical machinery, but it is intentionally unopinionated about how you productionise it.&lt;/p&gt; &lt;p&gt;Getting Splink to run once is usually straightforward. 
Getting it to run every week, across multiple datasets, while keeping outputs auditable and recoverable is the harder part.&lt;/p&gt; &lt;p&gt;This post sets out how we do that at the Ministry of Justice, how we keep the pipeline modular rather than fragile, and how we catch issues early enough to recover safely.&lt;/p&gt;</description> <link>https://moj-analytical-services.github.io/splink/blog/2026/01/29/running-splink-in-production.html</link> <pubDate>Thu, 29 Jan 2026 00:00:00 +0000</pubDate> <source url="https://moj-analytical-services.github.io/splink/feed_rss_created.xml">Splink</source><guid isPermaLink="true">https://moj-analytical-services.github.io/splink/blog/2026/01/29/running-splink-in-production.html</guid> </item> <item> <title>Bias in Data Linking, continued</title> <author>Erica Kane</author> <author>Ross Kennedy</author> <description>&lt;h1&gt;Bias in Data Linking, continued&lt;/h1&gt; &lt;p&gt;This blog is the second in our series dedicated to Bias in Data Linking. Here we wrap up work completed during the six-month &lt;a href=&#34;https://www.turing.ac.uk&#34;&gt;Alan Turing Institute&lt;/a&gt; internship on &#39;&lt;em&gt;Bias in Data Linking&lt;/em&gt;&#39;, and share some final thoughts.&lt;/p&gt;</description> <link>https://moj-analytical-services.github.io/splink/blog/2024/12/02/bias-in-data-linking-continued.html</link> <pubDate>Mon, 02 Dec 2024 00:00:00 +0000</pubDate> <source url="https://moj-analytical-services.github.io/splink/feed_rss_created.xml">Splink</source><guid isPermaLink="true">https://moj-analytical-services.github.io/splink/blog/2024/12/02/bias-in-data-linking-continued.html</guid> </item> <item> <title>Bias in Data Linking</title> <author>Erica Kane</author> <description>&lt;h1&gt;Bias in Data Linking&lt;/h1&gt; &lt;p&gt;In March 2024, the Splink team launched a 6-month &lt;em&gt;&#39;Bias in Data Linking&#39;&lt;/em&gt; internship with the &lt;a href=&#34;https://www.turing.ac.uk&#34;&gt;Alan Turing
Institute&lt;/a&gt;. This installment of the Splink Blog introduces the internship and its goals, and provides an update on what&#39;s happened so far.&lt;/p&gt;</description> <link>https://moj-analytical-services.github.io/splink/blog/2024/08/19/bias-in-data-linking.html</link> <pubDate>Mon, 19 Aug 2024 00:00:00 +0000</pubDate> <source url="https://moj-analytical-services.github.io/splink/feed_rss_created.xml">Splink</source><guid isPermaLink="true">https://moj-analytical-services.github.io/splink/blog/2024/08/19/bias-in-data-linking.html</guid> </item> <item> <title>Splink 4.0.0 released</title> <author>Robin Linacre</author> <description>&lt;h1&gt;Splink 4.0.0 released&lt;/h1&gt; &lt;p&gt;We&#39;re pleased to release Splink 4, which is more scalable and easier to use than Splink 3.&lt;/p&gt; &lt;p&gt;For the uninitiated, &lt;a href=&#34;../../index.md&#34;&gt;Splink&lt;/a&gt; is a free and open source library for record linkage and deduplication at scale, capable of deduplicating 100 million+ records, that is &lt;a href=&#34;../../index.md#use-cases&#34;&gt;widely used&lt;/a&gt; and has been downloaded over 8 million times.&lt;/p&gt; &lt;p&gt;Version 4 is recommended for all new users. For existing users, there has been no change to the statistical methodology.
Versions 3 and 4 will give the same results, so there&#39;s no urgency to upgrade existing pipelines.&lt;/p&gt;</description> <link>https://moj-analytical-services.github.io/splink/blog/2024/07/24/splink-400-released.html</link> <pubDate>Wed, 24 Jul 2024 00:00:00 +0000</pubDate> <source url="https://moj-analytical-services.github.io/splink/feed_rss_created.xml">Splink</source><guid isPermaLink="true">https://moj-analytical-services.github.io/splink/blog/2024/07/24/splink-400-released.html</guid> </item> <item> <title>Splink 3 updates, and Splink 4 development announcement - April 2024</title> <author>Robin Linacre</author> <description>&lt;h1&gt;Splink 3 updates, and Splink 4 development announcement - April 2024&lt;/h1&gt; &lt;p&gt;This post describes significant updates to Splink since our previous &lt;a href=&#34;https://moj-analytical-services.github.io/splink/blog/2023/12/06/splink-updates---december-2023.html&#34;&gt;post&lt;/a&gt; and details of development work taking place on the forthcoming release of Splink 4.&lt;/p&gt;</description> <link>https://moj-analytical-services.github.io/splink/blog/2024/04/02/splink-3-updates-and-splink-4-development-announcement---april-2024.html</link> <pubDate>Tue, 02 Apr 2024 00:00:00 +0000</pubDate> <source url="https://moj-analytical-services.github.io/splink/feed_rss_created.xml">Splink</source><guid isPermaLink="true">https://moj-analytical-services.github.io/splink/blog/2024/04/02/splink-3-updates-and-splink-4-development-announcement---april-2024.html</guid> </item> <item> <title>Ethics in Data Linking</title> <author>Zoë Slade</author> <author>Alice O'Leary</author> <description>&lt;h1&gt;Ethics in Data Linking&lt;/h1&gt; &lt;p&gt;Welcome to the next installment of the Splink Blog where we’re talking about Data Ethics!&lt;/p&gt; &lt;h2&gt;:question: Why should we care about ethics?&lt;/h2&gt; &lt;p&gt;Splink was developed in-house at the UK Government’s Ministry of Justice.
As data scientists in government, we are accountable to the public and have a duty to maintain public trust. This includes upholding high standards of data ethics in our work.&lt;/p&gt;</description> <link>https://moj-analytical-services.github.io/splink/blog/2024/01/23/ethics-in-data-linking.html</link> <pubDate>Tue, 23 Jan 2024 00:00:00 +0000</pubDate> <source url="https://moj-analytical-services.github.io/splink/feed_rss_created.xml">Splink</source><guid isPermaLink="true">https://moj-analytical-services.github.io/splink/blog/2024/01/23/ethics-in-data-linking.html</guid> </item> <item> <title>Splink Updates - December 2023</title> <author>Ross Kennedy</author> <description>&lt;h1&gt;Splink Updates - December 2023&lt;/h1&gt; &lt;p&gt;Welcome to the second installment of the Splink Blog!&lt;/p&gt; &lt;p&gt;Here are some of the highlights from the second half of 2023, and a taste of what is in store for 2024!&lt;/p&gt;</description> <link>https://moj-analytical-services.github.io/splink/blog/2023/12/06/splink-updates---december-2023.html</link> <pubDate>Wed, 06 Dec 2023 00:00:00 +0000</pubDate> <source url="https://moj-analytical-services.github.io/splink/feed_rss_created.xml">Splink</source><guid isPermaLink="true">https://moj-analytical-services.github.io/splink/blog/2023/12/06/splink-updates---december-2023.html</guid> </item> <item> <title>Splink Updates - July 2023</title> <author>Ross Kennedy</author> <author>Robin Linacre</author> <description>&lt;h1&gt;Splink Updates - July 2023&lt;/h1&gt; &lt;h2&gt;:new: Welcome to the Splink Blog! 
:new:&lt;/h2&gt; &lt;p&gt;It&#39;s hard to keep up to date with all of the new features being added to Splink, so we have launched this blog to share a round-up of the latest developments every few months.&lt;/p&gt; &lt;p&gt;So, without further ado, here are some of the highlights from the first half of 2023!&lt;/p&gt;</description> <link>https://moj-analytical-services.github.io/splink/blog/2023/07/27/splink-updates---july-2023.html</link> <pubDate>Thu, 27 Jul 2023 00:00:00 +0000</pubDate> <source url="https://moj-analytical-services.github.io/splink/feed_rss_created.xml">Splink</source><guid isPermaLink="true">https://moj-analytical-services.github.io/splink/blog/2023/07/27/splink-updates---july-2023.html</guid> </item> </channel> </rss>