Anonymization of Centralized and Distributed Social Networks by Sequential Clustering

We study the problem of privacy preservation in social networks. We consider the distributed setting, in which the network data is split among several data holders. The goal is to arrive at an anonymized view of the unified network without revealing to any data holder information about links between nodes that are controlled by other data holders. To that end, we start with the centralized setting and offer two variants of an anonymization algorithm based on sequential clustering (Sq). Our algorithms significantly outperform the SaNGreeA algorithm of Campan and Truta, the leading algorithm for achieving anonymity in networks by means of clustering. We then devise secure distributed versions of our algorithms. To the best of our knowledge, this is the first study of privacy preservation in distributed social networks. We conclude by outlining directions for future research.
