Efficient Cryptographic Primitives for Private Data Mining

Data mining is frequently obstructed by privacy concerns. In many cases the data is distributed, and privacy laws (e.g., HIPAA) or policies prevent it from being brought together in one place for analysis. Privacy-preserving data mining techniques address this issue by providing mechanisms to mine the data while giving certain privacy guarantees. However, techniques built on cryptographic primitives, although they provide strong privacy, are often too inefficient to be used in practical settings. We address this efficiency problem by investigating trade-offs that can be made in the trust model. By making a reasonable concession in the trust model, namely adding a non-collaborative third party, we can achieve large gains in efficiency. We demonstrate this with a novel protocol for privately computing the dot product, a foundational primitive for many private data mining activities. We also investigate how to extend our protocol to the case where the third party cannot be completely trusted by both participating parties, thus reducing the amount of trust that must be placed in the third party. Finally, we show experimentally the efficiency gains that can be realized in computing the private dot product under this model.
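
To make the role of a non-collaborative third party concrete, the following is a minimal sketch of a generic commodity-server style private dot product. It is not the protocol proposed in this work, whose details are not given here. The modulus `P`, the function names, and the single-process simulation of the three parties are all illustrative assumptions. The third party only distributes correlated randomness and never sees either input; the two parties end up with additive shares `u` and `v` such that `u + v` equals the dot product modulo `P`.

```python
import secrets

# Hypothetical public modulus (a Mersenne prime); all arithmetic is mod P.
P = 2**61 - 1


def rand_vec(n):
    """Return n uniformly random elements of Z_P."""
    return [secrets.randbelow(P) for _ in range(n)]


def dot(a, b):
    """Dot product of two equal-length vectors, reduced mod P."""
    return sum(x * y for x, y in zip(a, b)) % P


def third_party_setup(n):
    """Third party: hand out correlated randomness with ra + rb = Ra . Rb (mod P).

    The third party never sees the inputs x or y.
    """
    Ra, Rb = rand_vec(n), rand_vec(n)
    ra = secrets.randbelow(P)
    rb = (dot(Ra, Rb) - ra) % P
    return (Ra, ra), (Rb, rb)


def private_dot(x, y):
    """Simulate all three roles in one process (illustration only).

    Alice holds x, Bob holds y. Returns additive shares (u, v) with
    u + v = x . y (mod P).
    """
    (Ra, ra), (Rb, rb) = third_party_setup(len(x))

    # Alice -> Bob: her input masked by Ra.
    x_masked = [(xi + r) % P for xi, r in zip(x, Ra)]
    # Bob -> Alice: his input masked by Rb.
    y_masked = [(yi + r) % P for yi, r in zip(y, Rb)]

    # Bob picks his output share v and sends t to Alice.
    v = secrets.randbelow(P)
    t = (dot(x_masked, y) + rb - v) % P

    # Alice removes the masks; the remaining randomness cancels because
    # ra + rb = Ra . Rb, leaving u = x . y - v (mod P).
    u = (t - dot(Ra, y_masked) + ra) % P
    return u, v


if __name__ == "__main__":
    u, v = private_dot([1, 2, 3], [4, 5, 6])
    print((u + v) % P)
```

In this sketch the only online cost is a few vector additions and plaintext dot products, which illustrates why shifting trust to a third party that does not collude with either participant can be much cheaper than protocols built on public-key operations.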
