Privacy-preserving data mining on data grids in the presence of malicious participants

Privacy concerns are a major obstacle to the widespread deployment of data grids in domains such as health care and finance. We propose a novel technique for obtaining knowledge, in the form of a data mining model, from a data grid while ensuring that data privacy is cryptographically protected. To the best of our knowledge, all previous approaches to this problem fail in the presence of malicious participants. In this paper we present an algorithm which, in addition to being secure against malicious participants, is asynchronous, involves no global communication patterns, and dynamically adjusts to new data or newly added resources. As far as we know, this is the first privacy-preserving data mining algorithm to possess these features in the presence of malicious participants. Simulations of thousands of resources show that our algorithm quickly converges to the correct result. The simulations also show that the convergence time grows logarithmically with the privacy parameter.
