Privacy and Anonymization in Social Networks

As the Internet continues to grow, the proliferation of online social networks (OSNs) raises many privacy concerns. Users of these OSNs divulge extensive details about their lives online. Attackers can use this personal information to perpetrate significant privacy breaches and carry out attacks such as identity theft and credit card fraud. The privacy concerns arise not just from users posting their personal information online, but also from OSNs publishing this information for analysis. Driven by Web 2.0 applications, more and more social network data has been made publicly available, and preserving the privacy of individuals in this published data is an important concern. Although privacy preservation in data publishing has been studied extensively, and several important models such as k-anonymity and l-diversity as well as many efficient algorithms have been proposed, most existing studies deal with relational data only. Those methods cannot be applied to social network data straightforwardly. Anonymizing social network data is a much more challenging task than anonymizing relational data. First, in relational databases, attacks identify individuals through quasi-identifiers, whereas in social networks structural information such as neighbourhood graphs can also be used to identify individuals. Second, tuples in relational data can be anonymized without affecting other tuples, whereas in social networks adding edges or vertices also affects the neighbourhoods of other vertices in the graph. In this chapter, we give a brief overview of the privacy concerns in online social networks and provide a detailed description of our algorithm, GASNA, a greedy algorithm for social network anonymization. This algorithm provides structural anonymity and sensitive attribute protection by achieving k-anonymity and l-diversity in social network data.
We also discuss the challenges faced by existing algorithms and models for social network data privacy and suggest techniques to counter them. The issues discussed are the high cost of achieving k-anonymity when the value of k is fixed and the need for an anonymity model better suited to the current state of social networks. We also propose a new model, called partial anonymity, which can help reduce the number of edges added for anonymization when the value d of the d-neighbourhood is greater than 1.
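
To make the two notions above concrete, the following is a minimal Python sketch and not the GASNA algorithm itself. It assumes, purely for illustration, that a vertex's structural signature is just its degree (the chapter works with d-neighbourhoods, which are richer), and the helper names `degree_groups`, `is_k_anonymous`, `is_l_diverse` and `add_edge` are hypothetical. It checks k-anonymity and l-diversity over groups of structurally identical vertices, and shows how adding one edge changes the signatures of both endpoints.

```python
from collections import defaultdict

def degree_groups(adj):
    """Group vertices by degree (an assumed, simplified structural signature)."""
    groups = defaultdict(set)
    for vertex, neighbours in adj.items():
        groups[len(neighbours)].add(vertex)
    return dict(groups)

def is_k_anonymous(adj, k):
    """True if every group of same-signature vertices has at least k members."""
    return all(len(group) >= k for group in degree_groups(adj).values())

def is_l_diverse(adj, sensitive, l):
    """True if every group contains at least l distinct sensitive values."""
    return all(
        len({sensitive[v] for v in group}) >= l
        for group in degree_groups(adj).values()
    )

def add_edge(adj, u, v):
    """Add an undirected edge; this alters the neighbourhoods of both u and v."""
    adj[u].add(v)
    adj[v].add(u)

if __name__ == "__main__":
    # Path graph a - b - c - d with made-up sensitive labels.
    adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
    sensitive = {"a": "flu", "b": "flu", "c": "cold", "d": "cold"}

    print(is_k_anonymous(adj, 2), is_l_diverse(adj, sensitive, 2))

    # One added edge changes the degree of two vertices at once and
    # breaks the anonymity groups that existed before.
    add_edge(adj, "a", "c")
    print(degree_groups(adj), is_k_anonymous(adj, 2))

    # A second edge is needed to restore 2-anonymity.
    add_edge(adj, "b", "d")
    print(degree_groups(adj), is_k_anonymous(adj, 2))
```

Even in this toy setting, a single edit cannot be made in isolation: repairing one vertex's group disturbs another, which is why the number of edges added is a natural cost measure for graph anonymization.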
