Privacy Preserving Blocking and Meta-Blocking

Record linkage is the task of integrating data from heterogeneous sources to identify records that refer to the same entity, and it provides the basis for sophisticated data mining. When privacy restrictions apply, the data sources may have access only to the merged records produced by the linkage process; this setting defines the problem of privacy preserving record linkage. Because data are often dirty and there are no common unique identifiers, the linkage process requires approximate matching, which makes it a very resource-demanding task, especially for large volumes of data. To speed up the linkage process, privacy preserving blocking and meta-blocking techniques are deployed. Such techniques derive groups of records that are more likely to match with each other, so that only records within the same group need to be compared. In this nectar paper, we summarize our contributions to privacy preserving blocking and meta-blocking.
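To make the idea of blocking concrete, the following is a minimal sketch of plain (non-private) key-based blocking between two sources. It is an illustration only and not the method of this paper: the record fields, the `blocking_key` function, and the sample data are all hypothetical, and no privacy protection is applied.

```python
from collections import defaultdict
from itertools import product


def blocking_key(record):
    # Hypothetical key: first letter of the surname plus year of birth.
    return (record["surname"][:1].upper(), record["birth_year"])


def build_blocks(records):
    # Group record ids by their blocking key.
    blocks = defaultdict(list)
    for rec in records:
        blocks[blocking_key(rec)].append(rec["id"])
    return blocks


def candidate_pairs(records_a, records_b):
    # Only records that share a blocking key become candidate pairs.
    blocks_a = build_blocks(records_a)
    blocks_b = build_blocks(records_b)
    pairs = []
    for key in blocks_a.keys() & blocks_b.keys():
        pairs.extend(product(blocks_a[key], blocks_b[key]))
    return sorted(pairs)


# Made-up sample data for illustration.
source_a = [
    {"id": "a1", "surname": "Smith", "birth_year": 1980},
    {"id": "a2", "surname": "Jones", "birth_year": 1975},
    {"id": "a3", "surname": "Smyth", "birth_year": 1980},
]
source_b = [
    {"id": "b1", "surname": "Smith", "birth_year": 1980},
    {"id": "b2", "surname": "Brown", "birth_year": 1990},
]

pairs = candidate_pairs(source_a, source_b)
print(pairs)
```

In this toy input, blocking leaves two candidate pairs for approximate matching instead of the six pairs of a full cross-comparison. A privacy preserving variant would additionally have to keep the keys and records hidden from the other party, which this sketch does not attempt.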
