Private analysis of graph structure

We present efficient algorithms for releasing useful statistics about graph data while providing rigorous privacy guarantees. Our algorithms work on datasets that consist of relationships between individuals, such as social ties or email communication. The algorithms satisfy edge differential privacy, which essentially requires that the presence or absence of any particular relationship be hidden. Our algorithms output approximate answers to subgraph counting queries. Given a query graph H, for example, a triangle, k-star, or k-triangle, the goal is to return the number of edge-induced isomorphic copies of H in the input graph. The special case of triangles was considered by Nissim et al. [2007] and a more general investigation of arbitrary query graphs was initiated by Rastogi et al. [2009]. We extend the approach of Nissim et al. to a new class of statistics, namely k-star queries. We also give algorithms for k-triangle queries using a different approach based on the higher-order local sensitivity. For the specific graph statistics we consider (i.e., k-stars and k-triangles), we significantly improve on the work of Rastogi et al.: our algorithms satisfy a stronger notion of privacy that does not rely on the adversary having a particular prior distribution on the data, and add less noise to the answers before releasing them. We evaluate the accuracy of our algorithms both theoretically and empirically, using a variety of real and synthetic datasets. We give explicit, simple conditions under which these algorithms add a small amount of noise. We also provide the average-case analysis in the Erdős-Rényi-Gilbert G(n,p) random graph model. Finally, we give hardness results indicating that the approach Nissim et al. used for triangles cannot easily be extended to k-triangles (hence justifying our development of a new algorithmic approach).

[1]  David R. Hunter,et al.  Curved exponential family models for social networks , 2007, Soc. Networks.

[2]  Cynthia Dwork,et al.  Calibrating Noise to Sensitivity in Private Data Analysis , 2006, TCC.

[3]  Anastasios Kementsietsidis,et al.  Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2013, New York, NY, USA, June 22-27, 2013 , 2013, SIGMOD Conference.

[4]  Edoardo M. Airoldi,et al.  A Survey of Statistical Network Models , 2009, Found. Trends Mach. Learn..

[5]  Cynthia Dwork,et al.  The Differential Privacy Frontier (Extended Abstract) , 2009, TCC.

[6]  Bing-Rong Lin,et al.  Information preservation in statistical privacy and bayesian estimation of unattributed histograms , 2013, SIGMOD '13.

[7]  Ilya Mironov,et al.  Differentially private recommender systems: building privacy into the net , 2009, KDD.

[8]  Aaron Roth,et al.  Iterative Constructions and Private Data Release , 2011, TCC.

[9]  Adam D. Smith,et al.  The price of privately releasing contingency tables and the spectra of random matrices with correlated rows , 2010, STOC '10.

[10]  Avrim Blum,et al.  Differentially private data analysis of social networks via restricted sensitivity , 2012, ITCS '13.

[11]  Aleksandra B. Slavkovic,et al.  Differentially Private Graphical Degree Sequences and Synthetic Graphs , 2012, Privacy in Statistical Databases.

[12]  Eric D. Kolaczyk,et al.  Statistical Analysis of Network Data , 2009 .

[13]  Moni Naor,et al.  Our Data, Ourselves: Privacy Via Distributed Noise Generation , 2006, EUROCRYPT.

[14]  Cynthia Dwork The Differential Privacy Frontier , 2009 .

[15]  Aaron Roth,et al.  The Algorithmic Foundations of Differential Privacy , 2014, Found. Trends Theor. Comput. Sci..

[16]  KarwaVishesh,et al.  Private Analysis of Graph Structure , 2014 .

[17]  Johannes Gehrke,et al.  Towards Privacy for Social Networks: A Zero-Knowledge Based Definition of Privacy , 2011, TCC.

[18]  Ashwin Machanavajjhala,et al.  No free lunch in data privacy , 2011, SIGMOD '11.

[19]  Cynthia Dwork,et al.  Differential Privacy for Statistics: What we Know and What we Want to Learn , 2010, J. Priv. Confidentiality.

[20]  Sofya Raskhodnikova,et al.  Analyzing Graphs with Node Differential Privacy , 2013, TCC.

[21]  Rebecca N. Wright,et al.  A differentially private estimator for the stochastic Kronecker graph model , 2012, EDBT-ICDT '12.

[22]  Jian Pei,et al.  A brief survey on anonymization techniques for privacy preserving publishing of social network data , 2008, SKDD.

[23]  David D. Jensen,et al.  Accurate Estimation of the Degree Distribution of Private Networks , 2009, 2009 Ninth IEEE International Conference on Data Mining.

[24]  Aaron Roth,et al.  Privacy and mechanism design , 2013, SECO.

[25]  Vitaly Shmatikov,et al.  De-anonymizing Social Networks , 2009, 2009 30th IEEE Symposium on Security and Privacy.

[26]  Cynthia Dwork,et al.  Wherefore art thou r3579x?: anonymized social networks, hidden patterns, and structural steganography , 2007, WWW '07.

[27]  Eric D. Kolaczyk,et al.  Statistical Analysis of Network Data: Methods and Models , 2009 .

[28]  Cynthia Dwork,et al.  Differential privacy and robust statistics , 2009, STOC '09.

[29]  Cynthia Dwork,et al.  Differential Privacy , 2006, ICALP.

[30]  Van H. Vu,et al.  Concentration of non‐Lipschitz functions and applications , 2002, Random Struct. Algorithms.

[31]  Noga Alon,et al.  The Probabilistic Method, Second Edition , 2004 .

[32]  Shuigeng Zhou,et al.  Recursive mechanism: towards node differential privacy and unrestricted joins , 2013, SIGMOD '13.

[33]  S. Wasserman,et al.  Models and Methods in Social Network Analysis: An Introduction to Random Graphs, Dependence Graphs, and p * , 2005 .

[34]  Sofya Raskhodnikova,et al.  Smooth sensitivity and sampling in private data analysis , 2007, STOC '07.

[35]  Dan Suciu,et al.  Relationship privacy: output perturbation for queries with joins , 2009, PODS.

[36]  P. Pattison,et al.  New Specifications for Exponential Random Graph Models , 2006 .

[37]  Garry Robins,et al.  An introduction to exponential random graph (p*) models for social networks , 2007, Soc. Networks.

[38]  Gerhard Weikum,et al.  ACM Transactions on Database Systems , 2005 .