Preserving Differential Privacy in Degree-Correlation based Graph Generation

Enabling accurate analysis of social network data while preserving differential privacy has been challenging since graph features such as cluster coefficient often have high sensitivity, which is different from traditional aggregate functions (e.g., count and sum) on tabular data. In this paper, we study the problem of enforcing edge differential privacy in graph generation. The idea is to enforce differential privacy on graph model parameters learned from the original network and then generate the graphs for releasing using the graph model with the private parameters. In particular, we develop a differential privacy preserving graph generator based on the dK-graph generation model. We first derive from the original graph various parameters (i.e., degree correlations) used in the dK-graph model, then enforce edge differential privacy on the learned parameters, and finally use the dK-graph model with the perturbed parameters to generate graphs. For the 2K-graph model, we enforce the edge differential privacy by calibrating noise based on the smooth sensitivity, rather than the global sensitivity. By doing this, we achieve the strict differential privacy guarantee with smaller magnitude noise. We conduct experiments on four real networks and compare the performance of our private dK-graph models with the stochastic Kronecker graph generation model in terms of utility and privacy tradeoff. Empirical evaluations show the developed private dK-graph generation models significantly outperform the approach based on the stochastic Kronecker generation model.

[1]  David F. Gleich,et al.  Moment-Based Estimation of Stochastic Kronecker Graph Parameters , 2011, Internet Math..

[2]  Jian Pei,et al.  Preserving Privacy in Social Networks Against Neighborhood Attacks , 2008, 2008 IEEE 24th International Conference on Data Engineering.

[3]  Bruce A. Reed,et al.  A Critical Point for Random Graphs with a Given Degree Sequence , 1995, Random Struct. Algorithms.

[4]  Jon M. Kleinberg,et al.  Wherefore art thou R3579X? , 2011, Commun. ACM.

[5]  Xiaowei Ying,et al.  On link privacy in randomizing social networks , 2010, Knowledge and Information Systems.

[6]  Alan M. Frieze,et al.  Random graphs , 2006, SODA '06.

[7]  Yang Xiang,et al.  On learning cluster coefficient of private networks , 2012, 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining.

[8]  Christos Faloutsos,et al.  Kronecker Graphs: An Approach to Modeling Networks , 2008, J. Mach. Learn. Res..

[9]  Christos Faloutsos,et al.  Scalable modeling of real graphs using Kronecker multiplication , 2007, ICML '07.

[10]  U. Feige,et al.  Spectral Graph Theory , 2015 .

[11]  K. Liu,et al.  Towards identity anonymization on graphs , 2008, SIGMOD Conference.

[12]  Xiaowei Ying,et al.  Graph Generation with Prescribed Feature Constraints , 2009, SDM.

[13]  Lei Zou,et al.  K-Automorphism: A General Framework For Privacy Preserving Network Publication , 2009, Proc. VLDB Endow..

[14]  Cynthia Dwork,et al.  Wherefore art thou r3579x?: anonymized social networks, hidden patterns, and structural steganography , 2007, WWW '07.

[15]  Fan Chung Graham,et al.  A random graph model for massive graphs , 2000, STOC '00.

[16]  Ben Y. Zhao,et al.  Sharing graphs using differentially private graph models , 2011, IMC '11.

[17]  David D. Jensen,et al.  Accurate Estimation of the Degree Distribution of Private Networks , 2009, 2009 Ninth IEEE International Conference on Data Mining.

[18]  Cynthia Dwork,et al.  Differential privacy and robust statistics , 2009, STOC '09.

[19]  Carolyn J. Anderson,et al.  A p* primer: logit models for social networks , 1999, Soc. Networks.

[20]  Chris Clifton,et al.  Differential identifiability , 2012, KDD.

[21]  Sofya Raskhodnikova,et al.  Smooth sensitivity and sampling in private data analysis , 2007, STOC '07.

[22]  Donald F. Towsley,et al.  Resisting structural re-identification in anonymized social networks , 2010, The VLDB Journal.

[23]  Dan Suciu,et al.  Relationship privacy: output perturbation for queries with joins , 2009, PODS.

[24]  F. Chung,et al.  Connected Components in Random Graphs with Given Expected Degree Sequences , 2002 .

[25]  Priya Mahadevan,et al.  Systematic topology analysis and generation using degree correlations , 2006, SIGCOMM.

[26]  Walter Willinger,et al.  A first-principles approach to understanding the internet's router-level topology , 2004, SIGCOMM '04.

[27]  R. Pastor-Satorras,et al.  Class of correlated random networks with hidden variables. , 2003, Physical review. E, Statistical, nonlinear, and soft matter physics.

[28]  Cynthia Dwork,et al.  Calibrating Noise to Sensitivity in Private Data Analysis , 2006, TCC.

[29]  Rebecca N. Wright,et al.  A Differentially Private Graph Estimator , 2009, 2009 IEEE International Conference on Data Mining Workshops.

[30]  Walter Willinger,et al.  Network topology generators: degree-based vs. structural , 2002, SIGCOMM '02.

[31]  L. da F. Costa,et al.  Characterization of complex networks: A survey of measurements , 2005, cond-mat/0505185.

[32]  Leting Wu,et al.  Differential Privacy Preserving Spectral Graph Analysis , 2013, PAKDD.