Privacy-Preserving Data Set Union

This paper describes a cryptographic protocol for merging two or more data sets without divulging the identifying fields of their records; technically, the protocol computes a blind set-theoretic union. Applications arise, for example, in biomedical data analysis, where identifying fields (e.g., patient names) are protected by governmental privacy regulations or by institutional research board policies.
