Privacy-Preserving Data Mining in Presence of Covert Adversaries

In many distributed data mining settings, disclosing the original data sets is unacceptable because of privacy concerns. To address such concerns, privacy-preserving data mining has been an active research area in recent years. All recent work on privacy-preserving data mining has considered either the semi-honest or the malicious adversarial model, in which an adversary is assumed to follow the protocol or to deviate from it arbitrarily, respectively. The semi-honest model provides weak security at a small computational cost, while the malicious model provides strong security but requires expensive computations such as non-interactive zero-knowledge proofs. We therefore envisage the need for a 'covert' adversarial model that lies between the semi-honest and malicious models, both in its security guarantee and in its computational cost. In this paper, for the first time in the data mining area, we build efficient and secure dot-product and set-intersection protocols in the covert adversarial model. We use the homomorphic property of the Paillier encryption scheme and the two-party computation of Aumann et al. to construct our protocols. Furthermore, our protocols are secure in the Universal Composability framework.
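
To make the role of the homomorphic property concrete, the following is a minimal sketch of how additively homomorphic Paillier encryption lets one party compute an encrypted dot product without seeing the other party's vector. It is an illustration under stated assumptions, not the protocol of this paper: it uses textbook Paillier with generator g = n + 1, toy-sized primes, and only the semi-honest computation step, with none of the checks needed for security against covert adversaries. The function names (`keygen`, `encrypt`, `decrypt`, `encrypted_dot`) are hypothetical.

```python
import math
import secrets


def keygen(p, q):
    """Textbook Paillier key generation with g = n + 1 (assumed variant)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse; requires Python 3.8+
    return n, (lam, mu)


def encrypt(n, m):
    """Enc(m) = (1 + m*n) * r^n mod n^2 for a random unit r."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (1 + m * n) % n2 * pow(r, n, n2) % n2


def decrypt(n, sk, c):
    """Dec(c) = L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) // n."""
    lam, mu = sk
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n


def encrypted_dot(n, enc_x, y):
    """Compute Enc(sum x_i * y_i) from Enc(x_i) and plaintext y_i.

    Uses Enc(a) * Enc(b) = Enc(a + b) and Enc(a)^k = Enc(k * a).
    """
    n2 = n * n
    acc = encrypt(n, 0)  # fresh encryption of 0 re-randomizes the result
    for c, yi in zip(enc_x, y):
        acc = acc * pow(c, yi, n2) % n2
    return acc


# Toy parameters only; real deployments need primes of 1024 bits or more.
n, sk = keygen(1009, 1013)
x = [3, 1, 4, 1, 5]   # held by the key owner
y = [2, 7, 1, 8, 2]   # held by the other party
enc_x = [encrypt(n, xi) for xi in x]       # key owner sends ciphertexts
enc_result = encrypted_dot(n, enc_x, y)    # other party computes blindly
result = decrypt(n, sk, enc_result)        # key owner recovers x . y mod n
```

The result is correct only while the true dot product is smaller than n. In this sketch nothing stops either party from deviating, for example by sending malformed ciphertexts; guarding against such deviations is what the covert and malicious models address.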
