Efficient Two-Party Privacy Preserving Collaborative k-means Clustering Protocol Supporting both Storage and Computation Outsourcing

Privacy-preserving collaborative data mining aims to extract useful knowledge from distributed databases owned by multiple parties while preserving the privacy of both the data and the mining results. More and more companies now rely on the cloud to store and process their data. In this context, a privacy-preserving collaborative k-means clustering framework was proposed to support both storage and computation outsourcing for two parties. However, its computation cost and communication overhead are too high to be practical. In this paper, we propose to encrypt each party’s data once and then store it in the cloud. The privacy-preserving collaborative k-means clustering protocol is executed mainly on the cloud side, with a total of \(O(k(m+n))\) rounds of interaction among the two parties and the cloud, where m and n denote the total numbers of records held by the two parties, respectively. The protocol is secure in the semi-honest model and, during the recomputation of the k centroids, is also secure in the malicious model when only one party is corrupted. We also implement the protocol in a real cloud environment, using e-health data as test data.
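
As a point of reference, the following is a minimal plaintext sketch of the k-means computation that such a protocol would carry out over encrypted data. It is not the protocol itself and contains no cryptography. It assumes the two parties hold m and n records with the same attributes (a horizontal partition, which the abstract suggests but does not state), and every name in it (`squared_distance`, `assign`, `recompute`, `kmeans`) is hypothetical. Each assignment pass evaluates k distances for each of the m + n records, which gives some intuition for why the k(m+n) term appears in the interaction count, although the abstract does not spell out that accounting.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(records: Sequence[Point], centroids: Sequence[Point]) -> List[int]:
    """Label each record with the index of its nearest centroid.

    This performs k distance evaluations per record, i.e. k * (m + n) in total.
    """
    return [
        min(range(len(centroids)), key=lambda j: squared_distance(r, centroids[j]))
        for r in records
    ]


def recompute(records: Sequence[Point], labels: Sequence[int],
              old_centroids: Sequence[Point]) -> List[Point]:
    """Recompute each centroid as the mean of the records assigned to it."""
    k, dim = len(old_centroids), len(old_centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for record, label in zip(records, labels):
        counts[label] += 1
        for d in range(dim):
            sums[label][d] += record[d]
    # An empty cluster keeps its previous centroid (an illustrative choice).
    return [
        tuple(s / counts[j] for s in sums[j]) if counts[j] else tuple(old_centroids[j])
        for j in range(k)
    ]


def kmeans(party_a: Sequence[Point], party_b: Sequence[Point],
           centroids: Sequence[Point], max_iters: int = 100):
    """Plain k-means over the union of both parties' records (no privacy)."""
    records = list(party_a) + list(party_b)  # m + n records in total
    centroids = [tuple(c) for c in centroids]
    labels = assign(records, centroids)
    for _ in range(max_iters):
        updated = recompute(records, labels, centroids)
        if updated == centroids:  # converged: centroids no longer move
            break
        centroids = updated
        labels = assign(records, centroids)
    return centroids, labels
```

For example, `kmeans([(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)], [(0.0, 0.0), (10.0, 0.0)])` places one centroid at the mean of each party's records. In a privacy-preserving setting, the distance comparisons and the centroid means are the steps that would have to be evaluated without revealing either party's records.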
