Mutual Privacy Preserving $k$-Means Clustering in Social Participatory Sensing

In this paper, we consider the problem of mutual privacy protection in social participatory sensing, in which individuals contribute their private information to build a (virtual) community. In particular, we propose a mutual privacy preserving $k$-means clustering scheme that neither discloses an individual's private information nor leaks the community's characteristic data (clusters). Our scheme contains two privacy-preserving algorithms that are called at each iteration of the $k$-means clustering. The first is employed by each participant to find the nearest cluster while the cluster centers are kept secret from the participants. The second computes the cluster centers without leaking any cluster center information to the participants, while preventing each participant from learning which other participants belong to the same cluster. An extensive performance analysis shows that our approach is effective for $k$-means clustering, can resist collusion attacks, and can provide mutual privacy protection even when the data analyst colludes with all except one participant.
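
For orientation, the two steps that are repeated at each $k$-means iteration can be sketched in plaintext as follows. This is only the standard, non-private baseline and not the proposed protocol: it offers none of the privacy guarantees described above, since the centers and the cluster memberships are all visible in one place. The function names (`nearest_cluster`, `update_centers`, `kmeans_iteration`) and the use of squared Euclidean distance are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def nearest_cluster(point: Point, centers: Sequence[Point]) -> int:
    """Step 1 (plaintext): index of the center closest to `point`.

    Assumption: squared Euclidean distance. In the scheme described above,
    this step is run per participant without revealing the centers.
    """
    best_index = 0
    best_dist = float("inf")
    for index, center in enumerate(centers):
        dist = sum((p - c) ** 2 for p, c in zip(point, center))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def update_centers(
    points: Sequence[Point], labels: Sequence[int], centers: Sequence[Point]
) -> List[List[float]]:
    """Step 2 (plaintext): each new center is the mean of its assigned points.

    An empty cluster keeps its previous center (an illustrative choice).
    """
    dim = len(centers[0])
    sums = [[0.0] * dim for _ in centers]
    counts = [0] * len(centers)
    for point, label in zip(points, labels):
        counts[label] += 1
        for d in range(dim):
            sums[label][d] += point[d]
    new_centers = []
    for k, center in enumerate(centers):
        if counts[k] == 0:
            new_centers.append([float(c) for c in center])
        else:
            new_centers.append([s / counts[k] for s in sums[k]])
    return new_centers


def kmeans_iteration(
    points: Sequence[Point], centers: Sequence[Point]
) -> Tuple[List[int], List[List[float]]]:
    """One iteration: assign every point, then recompute the centers."""
    labels = [nearest_cluster(p, centers) for p in points]
    return labels, update_centers(points, labels, centers)
```

Calling `kmeans_iteration` repeatedly until the labels stop changing gives ordinary $k$-means. The scheme in this paper targets the same two steps, but with the first performed without exposing the centers to participants and the second without exposing the centers or cluster membership.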
