Classification of Privacy-preserving Distributed Data Mining protocols

Privacy-preserving Distributed Data Mining (PPDDM) has recently emerged as a new research area. It addresses the following problem: several participants want to jointly conduct a data mining task over the private data sets that each of them holds, while keeping those data sets private. This problem setting has attracted the attention of researchers, practitioners and developers from both the data mining and information security communities, who have made great progress in designing and developing solutions for it. However, the large number of techniques developed so far has become a source of confusion, and researchers and practitioners now face the challenge of devising a standard for synthesizing and evaluating PPDDM protocols. In this paper, we put forward a framework that synthesizes and characterizes existing PPDDM protocols, so as to provide a standard and systematic approach to understanding PPDDM-related problems, analyzing PPDDM requirements, and designing effective and efficient PPDDM protocols.
