fastLink: Fast Probabilistic Record Linkage

Abstract

This open-source software package implements a Fellegi-Sunter probabilistic record linkage model that allows for missing data and the inclusion of auxiliary information. It provides functions to merge two datasets under the Fellegi-Sunter model using the Expectation-Maximization algorithm, along with tools for preparing, adjusting, and summarizing data merges. The package implements the methods described in Enamorado, Fifield, and Imai (2017), “Using a Probabilistic Model to Assist Merging of Large-scale Administrative Records.”
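To make the modeling idea concrete, the following is a minimal sketch of a two-class Fellegi-Sunter mixture fitted by Expectation-Maximization. It is not the package's implementation or API. It is written in Python for illustration only, and every name in it (`simulate_patterns`, `posterior_match`, `fs_em`) is hypothetical. It assumes binary agree/disagree comparisons, conditional independence across fields given match status, and missing comparisons that can simply be dropped from the likelihood.

```python
import random


def simulate_patterns(n_pairs, lam, m, u, p_missing, seed=0):
    """Simulate agreement patterns (1 = agree, 0 = disagree, None = missing)."""
    rng = random.Random(seed)
    patterns = []
    for _ in range(n_pairs):
        is_match = rng.random() < lam
        row = []
        for k in range(len(m)):
            if rng.random() < p_missing:
                row.append(None)  # comparison unavailable for this field
            else:
                p_agree = m[k] if is_match else u[k]
                row.append(1 if rng.random() < p_agree else 0)
        patterns.append(row)
    return patterns


def posterior_match(g, lam, m, u):
    """Posterior probability that a pair is a match, skipping missing fields."""
    pm, pu = lam, 1.0 - lam
    for k, a in enumerate(g):
        if a is None:
            continue  # assumption: missing fields contribute nothing
        pm *= m[k] if a else 1.0 - m[k]
        pu *= u[k] if a else 1.0 - u[k]
    return pm / (pm + pu)


def fs_em(patterns, n_iter=100, lam=0.1, m_init=0.9, u_init=0.1, eps=1e-6):
    """Estimate lam (match rate), m and u (agreement rates among matches
    and non-matches) by EM. Starting values are illustrative assumptions."""
    n_fields = len(patterns[0])
    m = [m_init] * n_fields
    u = [u_init] * n_fields
    for _ in range(n_iter):
        # E-step: posterior match probability for every pair
        post = [posterior_match(g, lam, m, u) for g in patterns]
        # M-step: weighted proportions, using observed comparisons only
        lam = sum(post) / len(post)
        for k in range(n_fields):
            obs = [(p, g[k]) for p, g in zip(post, patterns) if g[k] is not None]
            m_den = sum(p for p, _ in obs)
            u_den = sum(1.0 - p for p, _ in obs)
            if m_den > 0:
                m_k = sum(p * a for p, a in obs) / m_den
                m[k] = min(max(m_k, eps), 1.0 - eps)
            if u_den > 0:
                u_k = sum((1.0 - p) * a for p, a in obs) / u_den
                u[k] = min(max(u_k, eps), 1.0 - eps)
    post = [posterior_match(g, lam, m, u) for g in patterns]
    return lam, m, u, post


# Simulated example: 4 comparison fields, 10% matches, 10% missing comparisons.
patterns = simulate_patterns(3000, 0.1, [0.95] * 4, [0.05] * 4, 0.1)
lam_hat, m_hat, u_hat, post = fs_em(patterns)
print(round(lam_hat, 3), [round(x, 3) for x in m_hat], [round(x, 3) for x in u_hat])
```

Pairs whose posterior match probability exceeds a chosen threshold would then be declared matches. The sketch omits auxiliary information, string-distance comparisons, and the computational techniques needed for large files.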

Source Code

The source code and documentation are available for download from the GitHub repository and from the Comprehensive R Archive Network (CRAN).