True Friends Let You Down: Benchmarking Social Graph Anonymization Schemes

Growing demand for social graph data among researchers and analysts has led to more such datasets being published, and concerns about privacy breaches have risen with it. To mitigate privacy risks, a myriad of social graph anonymization schemes have been proposed. Anonymizing high-dimensional data is a very hard problem, and publishing graph data is conventionally considered unwise even when identifiers have been removed. The proposed schemes often provide no proof of efficacy and are designed to defeat only a narrow set of attacks. To facilitate benchmarking of perturbation-based social graph anonymization schemes, we propose a machine learning framework that provides a quick, automated platform for evaluating and comparing them. We present a mechanism to train the framework without ground truth. We also present graph-structure-based node features that can be easily tuned to model weak or strong adversaries and to accommodate node and edge attributes. The framework provides a granular, graph-structure-based metric that captures the likelihood of a node being re-identified. Using publicly available real-world social graphs, we conduct a thorough analysis of the effect of graph perturbation on the anonymity achieved and the utility preserved. To this end we analyze six popular graph perturbation schemes, including those promising k-anonymity. Our techniques automate the weeding out of poor anonymization schemes. Experiments show that it is hard to provide anonymity while preserving utility, and that some schemes destroy utility without providing much anonymity. All useful anonymization schemes leave a fraction of true edges intact, and these true friends lead to the re-identification of nodes.
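The closing observation about surviving true edges can be illustrated with a minimal sketch. It assumes a simple random edge perturbation (delete a fraction of edges, then add the same number of random non-edges); this is a generic stand-in, not necessarily one of the six schemes evaluated, and the names `perturb_edges` and `true_edge_fraction` are hypothetical.

```python
import random
from itertools import combinations


def perturb_edges(edges, nodes, fraction, seed=0):
    """Illustrative perturbation (an assumption, not the paper's scheme):
    delete `fraction` of the edges and add as many random non-edges."""
    rng = random.Random(seed)
    original = {frozenset(e) for e in edges}
    k = int(round(fraction * len(original)))
    # Sort before sampling so the result is reproducible for a given seed.
    ordered = sorted(original, key=lambda e: sorted(e))
    removed = set(rng.sample(ordered, k))
    kept = original - removed
    # Candidate fake edges: node pairs that were not connected originally.
    non_edges = [
        frozenset(pair)
        for pair in combinations(sorted(nodes), 2)
        if frozenset(pair) not in original
    ]
    added = set(rng.sample(non_edges, min(k, len(non_edges))))
    return kept | added


def true_edge_fraction(original_edges, perturbed_edges):
    """Fraction of original ("true") edges still present after perturbation."""
    original = {frozenset(e) for e in original_edges}
    return len(original & perturbed_edges) / len(original)


# Toy graph: a ring of 10 nodes (hypothetical data, for illustration only).
nodes = list(range(10))
ring = [(i, (i + 1) % 10) for i in nodes]
perturbed = perturb_edges(ring, nodes, fraction=0.3)
survival = true_edge_fraction(ring, perturbed)
```

Under this assumed scheme, perturbing 30% of the edges leaves 70% of the true edges in the published graph, and it is that surviving structure an adversary can exploit for re-identification.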
