Mining frequent itemsets in the presence of malicious participants

Privacy-preserving data mining (PPDM) algorithms attempt to reduce the harm to privacy caused by malicious parties during the rule mining process. Usually, these algorithms are designed for the semi-honest model, where participants do not deviate from the protocol. However, in the real world, malicious parties may attempt to obtain the secret values of other parties through probing attacks or collusion. This study examines how to preserve the privacy of participants in a collusion-free model of the frequent itemset mining process, where the protocol protects against probing attacks and collusion. The mining of frequent itemsets is the main step of association rule mining algorithms. The authors propose two privacy-preserving frequent itemset mining algorithms, one for the two-party setting and one for the multi-party setting, in a collusion-free model for vertically partitioned (heterogeneous) data. In addition, they propose a privacy measuring technique that quantifies privacy based on the amount of sensitive information disclosed.
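To make the vertically partitioned setting concrete, the following is a minimal sketch of the quantity such protocols need to compute, written without any privacy protection. It is not the authors' algorithm. It only illustrates that, when each party holds different attributes of the same transactions, the support count of an itemset spanning both parties equals the scalar product of the parties' binary indicator vectors. The party names, item names, data, and threshold are all hypothetical.

```python
from typing import Dict, List


def indicator(columns: Dict[str, List[int]], items: List[str], n_rows: int) -> List[int]:
    """Return 1 for each row that contains every listed item held by this party."""
    return [int(all(columns[item][row] for item in items)) for row in range(n_rows)]


def support_count(vec_a: List[int], vec_b: List[int]) -> int:
    """Scalar product of two binary vectors: rows where both parties' items occur."""
    return sum(a * b for a, b in zip(vec_a, vec_b))


# Hypothetical data: the same 5 transactions, split by attribute between two parties.
party_a = {"bread": [1, 1, 0, 1, 1], "milk": [1, 0, 1, 1, 1]}
party_b = {"butter": [1, 1, 0, 1, 0]}
n_rows = 5

vec_a = indicator(party_a, ["bread", "milk"], n_rows)
vec_b = indicator(party_b, ["butter"], n_rows)

# In plain form this step reveals both vectors; a privacy-preserving protocol
# would replace it with a secure computation of the same scalar product.
count = support_count(vec_a, vec_b)

min_support = 2  # assumed absolute support threshold
is_frequent = count >= min_support
```

A privacy-preserving protocol aims to let the parties learn only whether the itemset is frequent, without either party seeing the other's indicator vector.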
