A Highly Efficient and Secure Multidimensional Blocking Approach for Private Record Linkage

Privacy Preserving Record Linkage is the process of securely integrating information without compromising the privacy of the individuals the data describe. Although this is appealing for both academic and business applications, it is complicated and computationally intensive. In this paper we address this problem by presenting a highly secure multidimensional Privacy Preserving Blocking approach that is fully distributed and runs independently on each data holder, making it invulnerable to third-party attacks. The approach uses publicly available corpora of data, known as reference sets, to create k-anonymous clusters. We prove analytically that our method is secure and provide experimental results that evaluate the performance gains of our method in terms of matching accuracy and execution time.
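To make the idea of reference-set-based blocking concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a Dice coefficient over character bigrams as the similarity measure and builds clusters by chunking the sorted reference values into groups of at least k; both choices, and all function names (`make_clusters`, `block_key`, `multi_key`), are hypothetical and chosen only for illustration. Each data holder would run this locally against the same public reference set and exchange only cluster identifiers.

```python
def bigrams(s):
    """Return the set of character bigrams of a lowercased string."""
    s = s.lower()
    return {s[i:i + 2] for i in range(len(s) - 1)}


def dice(a, b):
    """Dice coefficient on bigram sets (assumed similarity measure)."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        return 1.0
    return 2 * len(x & y) / (len(x) + len(y))


def make_clusters(reference_values, k):
    """Chunk sorted public reference values into clusters of at least k values.

    Illustrative assumption: a real scheme would likely cluster by similarity.
    """
    ordered = sorted(reference_values)
    clusters = [ordered[i:i + k] for i in range(0, len(ordered), k)]
    # Merge a short trailing chunk into its predecessor so every cluster has >= k values.
    if len(clusters) > 1 and len(clusters[-1]) < k:
        tail = clusters.pop()
        clusters[-1].extend(tail)
    return clusters


def block_key(value, clusters):
    """Return the index of the cluster holding the reference value most similar to `value`."""
    best_id, best_sim = 0, -1.0
    for cid, cluster in enumerate(clusters):
        for ref in cluster:
            sim = dice(value, ref)
            if sim > best_sim:
                best_id, best_sim = cid, sim
    return best_id


def multi_key(fields, clusters_per_field):
    """Multidimensional key: one cluster index per attribute."""
    return tuple(block_key(v, c) for v, c in zip(fields, clusters_per_field))


# Hypothetical public reference set of surnames, shared by all data holders.
reference = ["adams", "baker", "clark", "davis", "evans", "frank", "green"]
clusters = make_clusters(reference, k=3)

# Two holders compute keys independently; only the keys would be compared.
key_a = block_key("adam", clusters)
key_b = block_key("adams", clusters)
print(key_a == key_b)
```

Because every cluster contains at least k reference values, a key reveals only that a record resembles one of at least k public values. In this sketch, records from different holders become candidate pairs for the later matching step only when their keys agree.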
