De-anonymization of Social Networks with Communities: When Quantifications Meet Algorithms

A crucial privacy issue today is the re-identification of anonymized social networks by mapping them to correlated cross-domain auxiliary networks. Prior works typically model social networks as random graphs representing users and their relations, and then quantify the quality of mappings through cost functions that are proposed without sufficient rationale. It also remains unknown how to meet the demand of such quantifications algorithmically, i.e., how to find the minimizer of the cost functions. We address these concerns under a more realistic social network model parameterized by community structures, which can be leveraged as side information for de-anonymization. Using Maximum A Posteriori (MAP) estimation, our first contribution is a set of new and well-justified cost functions whose minimizers are superior to those of previous cost functions in that they find the correct mapping with the highest probability. We then characterize, for the first time, the algorithmic feasibility of minimizing these cost functions. While proving that the problem is in general inapproximable within any multiplicative factor, we propose two algorithms that, respectively, achieve an \epsilon-additive approximation and a conditional optimality in carrying out successful user re-identification. Our theoretical findings are validated empirically, notably on a dataset extracted from rare true cross-domain networks that reproduces genuine social network de-anonymization. Both the theoretical and the empirical observations also demonstrate the importance of community information in enhancing privacy inference.
