A Literature Survey and Classifications on Data Deanonymisation

The disclosure of private, anonymised data has become an increasingly serious problem, particularly given the possibility of carrying out deanonymisation attacks on published data. The related work available in the literature is inadequate in terms of the number of techniques analysed, and is limited to certain contexts such as Online Social Networks. We survey a large number of state-of-the-art deanonymisation techniques that use various methods and target different types of data. Our aim is to build a comprehensive understanding of the problem. For this survey, we propose a framework to guide a thorough analysis and classification. We classify deanonymisation approaches by the type and source of auxiliary information and by the structure of the target datasets. Moreover, we identify potential attacks, threats and some suggested assistive techniques. This survey can inform research by improving understanding of the deanonymisation problem, and can assist in the advancement of privacy protection.
