A Hybrid Algorithm for Privacy Preserving Social Network Publication

With the rapid growth of social networks, publishing social network data to third parties raises privacy concerns. Simply removing identifying attributes before publication is ill-advised, because the structural characteristics of the network may still reveal users' private information. We discuss current techniques for publishing social network data and define a privacy-preserving social network data publishing model with confidence p. We then devise a hybrid privacy-preserving algorithm that satisfies the defined model. Combining the features of k-anonymity and randomization, the algorithm uses the k-anonymity concept to hide sensitive information in the natural groups of the social network data and applies a randomization approach to the residual data. We evaluate the algorithm on several real-world datasets; the experimental results show that it is practical and efficient. Compared with related k-anonymity and randomization methods, our algorithm is stable and modifies the original data less.
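
The hybrid idea can be illustrated with a minimal sketch. It is not the algorithm of the paper: it assumes that "natural groups" means sets of nodes sharing the same degree, that a group with at least k members needs no modification, and that the residual nodes are handled by rewiring one random edge each. The function names `split_by_degree` and `perturb_residual`, the toy graph, and the rewiring rule are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


def split_by_degree(adj, k):
    """Partition nodes into (hidden, residual).

    Assumption: a node is 'hidden' when at least k nodes share its degree.
    """
    groups = defaultdict(list)
    for node, neighbours in adj.items():
        groups[len(neighbours)].append(node)
    hidden, residual = set(), set()
    for members in groups.values():
        (hidden if len(members) >= k else residual).update(members)
    return hidden, residual


def perturb_residual(adj, residual, rng):
    """Rewire one random edge per residual node (illustrative randomization only)."""
    new_adj = {u: set(vs) for u, vs in adj.items()}
    nodes = sorted(new_adj)
    for u in sorted(residual):
        if not new_adj[u]:
            continue  # isolated node: nothing to rewire
        candidates = [v for v in nodes if v != u and v not in new_adj[u]]
        if not candidates:
            continue  # u is already linked to every other node
        old = rng.choice(sorted(new_adj[u]))
        new = rng.choice(candidates)
        # Delete edge (u, old) and add edge (u, new); the edge count is unchanged.
        new_adj[u].discard(old)
        new_adj[old].discard(u)
        new_adj[u].add(new)
        new_adj[new].add(u)
    return new_adj


# Toy undirected graph as an adjacency map (hypothetical data).
graph = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1, 5}, 5: {4}}
hidden, residual = split_by_degree(graph, k=2)
published = perturb_residual(graph, residual, random.Random(0))
```

In this toy graph, nodes 2, 3 and 4 share degree 2 and are left untouched for k = 2, while nodes 1 and 5 have unique degrees and are the only ones whose edges are randomized. Touching only the residual nodes is what keeps the number of modifications small in this sketch.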
