Secret Sharing vs . Encryption-based Techniques For Privacy Preserving Data Mining 1

Privacy preserving querying and data publishing has been studied in the context of statistical databases and statistical disclosure control. Recently, large-scale data collection and integration efforts increased privacy concerns which motivated data mining researchers to investigate privacy implications of data mining and how data mining can be performed without violating privacy. In this paper, we first provide an overview of privacy preserving data mining focusing on distributed data sources, then we compare two technologies used in privacy preserving data mining. The first technology is encryptionbased, and it is used in earlier approaches. The second technology is secret-sharing which is recently being considered as a more efficient approach.

[1]  Adi Shamir,et al.  How to share a secret , 1979, CACM.

[2]  G. R. BLAKLEY Safeguarding cryptographic keys , 1979, 1979 International Workshop on Managing Requirements Knowledge (MARK).

[3]  Andrew Chi-Chih Yao,et al.  Protocols for Secure Computations (Extended Abstract) , 1982, FOCS.

[4]  John Bloom,et al.  A modular approach to key safeguarding , 1983, IEEE Trans. Inf. Theory.

[5]  Silvio Micali,et al.  Probabilistic Encryption , 1984, J. Comput. Syst. Sci..

[6]  A. Yao,et al.  Fair exchange with a semi-trusted third party (extended abstract) , 1997, CCS '97.

[7]  Avi Wigderson,et al.  Completeness theorems for non-cryptographic fault-tolerant distributed computation , 1988, STOC '88.

[8]  Silvio Micali,et al.  The round complexity of secure protocols , 1990, STOC '90.

[9]  Pascal Paillier,et al.  Public-Key Cryptosystems Based on Composite Degree Residuosity Classes , 1999, EUROCRYPT.

[10]  Rathindra Sarathy,et al.  Security of random data perturbation methods , 1999, TODS.

[11]  Ramakrishnan Srikant,et al.  Privacy-preserving data mining , 2000, SIGMOD '00.

[12]  Ivan Damgård,et al.  Secure Distributed Linear Algebra in a Constant Number of Rounds , 2001, CRYPTO.

[13]  Oded Goldreich,et al.  The Foundations of Cryptography - Volume 2: Basic Applications , 2001 .

[14]  Chris Clifton,et al.  Privacy-preserving k-means clustering over vertically partitioned data , 2003, KDD '03.

[15]  Chris Clifton,et al.  Privacy-preserving distributed mining of association rules on horizontally partitioned data , 2004, IEEE Transactions on Knowledge and Data Engineering.

[16]  Philip S. Yu,et al.  A Condensation Approach to Privacy Preserving Data Mining , 2004, EDBT.

[17]  Bart Goethals,et al.  On Private Scalar Product Computation for Privacy-Preserving Data Mining , 2004, ICISC.

[18]  Kun Liu,et al.  An Attacker's View of Distance Preserving Maps for Privacy Preserving Data Mining , 2006, PKDD.

[19]  Kun Liu,et al.  Random projection-based multiplicative data perturbation for privacy preserving distributed data mining , 2006, IEEE Transactions on Knowledge and Data Engineering.

[20]  Charu C. Aggarwal,et al.  On Randomization, Public Information and the Curse of Dimensionality , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[21]  Yücel Saygin,et al.  Efficient Privacy Preserving Distributed Clustering Based on Secret Sharing , 2007, PAKDD Workshops.