Distributed Privacy Preserving Clustering via Homomorphic Secret Sharing and Its Application to (Vertically) Partitioned Spatio-Temporal Data

Recent concerns about privacy have motivated data mining researchers to develop methods that mine data while preserving the privacy of individuals. One approach to developing privacy-preserving data mining algorithms is secure multiparty computation, which yields algorithms that do not trade accuracy for privacy. However, earlier methods suffer from very high communication and computational costs, making them infeasible in real-world scenarios. Moreover, these algorithms make strict assumptions about the involved parties, namely that the parties will not collude with each other. In this paper, the authors propose a new secure multiparty computation-based k-means clustering algorithm that is both secure and efficient enough to be used in a real-world scenario. Experiments based on realistic scenarios reveal that this protocol has lower communication costs and significantly lower computational costs.
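
The title refers to homomorphic secret sharing. As a rough illustration of that general idea only, and not the authors' protocol, the sketch below shows additive secret sharing: each party splits its private value into random shares, parties add the shares they receive locally, and only the aggregate is reconstructed. The function names (`share`, `reconstruct`, `secure_sum`), the modulus, and the single-process simulation of all parties are assumptions made for this example.

```python
import secrets

# Public modulus; assumed to be larger than any sum the parties compute.
PRIME = 2**61 - 1

def share(value, n_parties, modulus=PRIME):
    """Split `value` into n additive shares that sum to `value` mod `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % modulus
    return shares + [last]

def reconstruct(shares, modulus=PRIME):
    """Recombine additive shares into the shared value."""
    return sum(shares) % modulus

def secure_sum(private_values, modulus=PRIME):
    """Simulate all parties in one process (hypothetical helper)."""
    n = len(private_values)
    # Row i holds the shares that party i sends out, one per party.
    distributed = [share(v, n, modulus) for v in private_values]
    # Party j adds the j-th share from every party: local work only.
    partial_sums = [sum(row[j] for row in distributed) % modulus
                    for j in range(n)]
    # Only the partial sums are published; together they give the total.
    return reconstruct(partial_sums, modulus)
```

In a k-means setting, a sum of this kind could in principle be used to aggregate per-party distance or centroid contributions without revealing the individual inputs; whether and how the paper does so is not stated in the abstract.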
