How to Hide One’s Relationships from Link Prediction Algorithms

Our private connections can be exposed by link prediction algorithms. To date, this threat has been addressed only from the perspective of a central authority, neglecting the possibility that members of the social network can mitigate it themselves. We fill this gap by studying how an individual can rewire her own network neighborhood to hide her sensitive relationships. We prove that the optimization problem faced by such an individual is NP-complete, meaning that no efficient algorithm for finding an optimal way to hide one’s relationships is known, and none exists unless P = NP. We therefore shift our attention towards developing effective, albeit not optimal, heuristics that are readily applicable by users of existing social media platforms to conceal any connections they deem sensitive. Our empirical evaluation reveals that it is more beneficial to focus on “unfriending” carefully chosen individuals than on befriending new ones. In fact, by avoiding communication with just 5 individuals, one can hide some of her relationships in a massive, real-life telecommunication network consisting of 829,725 phone calls between 248,763 individuals. Our analysis also shows that link prediction algorithms are more susceptible to manipulation in smaller and denser networks. Evaluating the error vs. attack tolerance of link prediction algorithms reveals that rewiring connections randomly may end up exposing one’s sensitive relationships, which highlights the importance of choosing rewirings strategically. In an age where personal relationships continue to leave digital traces, our results empower the general public to proactively protect their private relationships.
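To make the "unfriending" idea concrete, the following is a minimal Python sketch, not the heuristics developed in the paper. It assumes the predictor scores each unconnected pair by its number of common neighbours, one standard similarity index, and that the hidden relationship is absent from the observed graph. The function names (`make_graph`, `cn_score`, `rank_of_pair`, `unfriend_common_neighbors`) and the toy graph are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def make_graph(edges):
    """Build an undirected graph as a dict mapping node -> set of neighbours."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def cn_score(adj, u, v):
    """Common-neighbours similarity: number of nodes adjacent to both u and v."""
    return len(adj[u] & adj[v])


def rank_of_pair(adj, u, v):
    """1-based rank of (u, v) among all non-edges, by common-neighbours score.

    Rank 1 means the predictor considers (u, v) the most likely hidden link.
    Ties are resolved optimistically: only strictly higher scores count.
    """
    target = cn_score(adj, u, v)
    higher = sum(
        1
        for a, b in combinations(sorted(adj), 2)
        if b not in adj[a] and cn_score(adj, a, b) > target
    )
    return 1 + higher


def unfriend_common_neighbors(adj, u, v, budget):
    """Illustrative assumption, not the paper's method: u drops up to `budget`
    edges to neighbours it shares with v, lowering the score of (u, v)."""
    adj = {node: set(nbrs) for node, nbrs in adj.items()}  # work on a copy
    for _ in range(budget):
        shared = sorted(adj[u] & adj[v])
        if not shared:
            break
        w = shared[0]
        adj[u].discard(w)
        adj[w].discard(u)
    return adj


# Toy observed graph; the sensitive relationship (a, b) is not recorded in it.
observed = make_graph(
    [("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "e"), ("d", "e")]
)
after = unfriend_common_neighbors(observed, "a", "b", budget=1)

print(cn_score(observed, "a", "b"), rank_of_pair(observed, "a", "b"))
print(cn_score(after, "a", "b"), rank_of_pair(after, "a", "b"))
```

In this toy graph a single removal lowers the score of the pair (a, b) and pushes it further down the predictor's ranking. The sketch always removes the alphabetically first shared neighbour; choosing which edge to remove, and accounting for the effect on every other pair's score, is where the strategic element discussed in the abstract comes in.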
