Fundamental Privacy Limits in Bipartite Networks Under Active Attacks

This work considers active deanonymization of bipartite networks. The scenario arises naturally when evaluating privacy in applications such as social networks, mobility networks, and medical databases. For instance, in active deanonymization of social networks, an anonymous victim is targeted by an attacker (e.g., the victim visits the attacker’s website), and the attacker queries her group memberships (e.g., by querying the browser history) to deanonymize her. In this work, the fundamental limits of privacy, measured by the minimum number of queries necessary for deanonymization, are investigated. A stochastic model is considered, where i) the bipartite network of group memberships is generated randomly, ii) the attacker has partial prior knowledge of the group memberships, and iii) the attacker receives noisy responses to its real-time queries. The bipartite network is generated according to linear and sublinear preferential attachment models and the stochastic block model. The victim’s identity is chosen randomly according to a distribution modeling each user’s risk of being the victim (e.g., the probability of visiting the website). An attack algorithm is proposed that builds upon techniques from communication with feedback, and its performance, in terms of the expected number of queries, is analyzed. Simulation results are provided to verify the theoretical derivations.
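
To make the query model concrete, the following is a minimal illustrative sketch and not the attack algorithm proposed in this work. It assumes a known membership matrix, a prior over users, and responses flipped independently with probability `flip_prob` (a binary symmetric channel). The greedy rule of querying the group whose members hold posterior mass closest to 1/2, the stopping threshold, and all names (`simulate_attack`, `memberships`, `flip_prob`, `threshold`) are assumptions made for illustration only.

```python
import random


def simulate_attack(memberships, prior, victim, flip_prob=0.1,
                    threshold=0.99, max_queries=1000, rng=None):
    """Toy sequential attack: returns (guessed user index, queries used).

    memberships[u][g] is truthy if user u belongs to group g (assumed known).
    prior[u] is the assumed probability that user u is the victim.
    """
    rng = rng or random.Random(0)
    n_users = len(memberships)
    n_groups = len(memberships[0])
    posterior = list(prior)
    queries = 0

    while max(posterior) < threshold and queries < max_queries:
        # Posterior mass of the users that belong to group g.
        def mass(g):
            return sum(p for u, p in enumerate(posterior) if memberships[u][g])

        # Heuristic (assumption): query the group that splits the mass most evenly.
        g = min(range(n_groups), key=lambda grp: abs(mass(grp) - 0.5))

        # Noisy response: the true membership bit is flipped with prob. flip_prob.
        truth = bool(memberships[victim][g])
        flipped = rng.random() < flip_prob
        answer = truth != flipped

        # Bayes update of the posterior over candidate identities.
        for u in range(n_users):
            consistent = bool(memberships[u][g]) == answer
            posterior[u] *= (1 - flip_prob) if consistent else flip_prob
        total = sum(posterior)
        posterior = [p / total for p in posterior]
        queries += 1

    guess = max(range(n_users), key=lambda u: posterior[u])
    return guess, queries
```

With noiseless responses (`flip_prob=0`) and four users whose rows over two groups are all distinct, the sketch identifies the victim after two queries. With noise, the same group may be queried repeatedly until the posterior concentrates, and `max_queries` guards against users with identical membership rows.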
