Resisting structural re-identification in anonymized social networks

We identify privacy risks associated with releasing network datasets and provide an algorithm that mitigates those risks. A network dataset is a graph in which nodes represent entities and edges represent relations such as friendship, communication, or shared activity. Maintaining privacy when publishing a network dataset is uniquely challenging because an individual’s network context can be used to identify them even after other identifying information is removed. In this paper, we introduce a parameterized model of the structural knowledge available to an adversary and quantify the success of attacks on individuals in anonymized networks. We show that the risk of these attacks varies with network structure and size, and we provide theoretical results that explain the anonymity risk in random networks. We then propose a novel approach to anonymizing network data that models aggregate network structure and allows analysis to be performed by sampling from the model. The approach guarantees anonymity for entities in the network while allowing accurate estimates of a variety of network measures with relatively little bias.
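To make the idea of parameterized structural knowledge concrete, the following Python sketch shows one possible instantiation; it is an illustration under stated assumptions, not necessarily the model defined in the paper. It assumes the adversary's knowledge is a refinement depth: depth 0 means no structural knowledge, depth 1 reveals the target's degree, and depth 2 reveals the degrees of the target's neighbors. The function names, the toy graph, and the depth parameter are all hypothetical. A node's candidate set is the set of nodes sharing its signature, so a smaller set means a higher re-identification risk.

```python
from collections import Counter

def signatures(adj, depth):
    """Return a structural signature per node after `depth` refinement rounds.

    Assumption for illustration: each round replaces a node's signature with
    the sorted tuple of its neighbors' previous signatures.
    """
    sig = {v: () for v in adj}  # depth 0: every node looks the same
    for _ in range(depth):
        sig = {v: tuple(sorted(sig[u] for u in adj[v])) for v in adj}
    return sig

def candidate_set_sizes(adj, depth):
    """Map each node to the number of nodes sharing its signature."""
    sig = signatures(adj, depth)
    counts = Counter(sig.values())
    return {v: counts[sig[v]] for v in adj}

# Hypothetical toy graph as an undirected adjacency list: a-b, b-c, c-d, c-e.
toy_graph = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d", "e"],
    "d": ["c"],
    "e": ["c"],
}

if __name__ == "__main__":
    for depth in range(3):
        print(depth, candidate_set_sizes(toy_graph, depth))
```

In this toy graph, node `a` hides among all five nodes at depth 0 and among the three degree-1 nodes at depth 1, but becomes unique at depth 2 because it is the only degree-1 node attached to a degree-2 neighbor. Nodes `d` and `e` remain indistinguishable from each other at every depth.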
