Website Fingerprinting at Internet Scale

The website fingerprinting attack aims to identify the content (i.e., a webpage accessed by a client) of encrypted and anonymized connections by observing patterns of data flows such as packet size and direction. This attack can be performed by a local passive eavesdropper – one of the weakest adversaries in the attacker model of anonymization networks such as Tor. In this paper, we present a novel website fingerprinting attack. Based on a simple and comprehensible idea, our approach outperforms all state-of-the-art methods in terms of classification accuracy while being computationally dramatically more efficient. In order to evaluate the severity of the website fingerprinting attack in reality, we collected the most representative dataset that has ever been built, where we avoid simplified assumptions made in the related work regarding selection and type of webpages and the size of the universe. Using this data, we explore the practical limits of website fingerprinting at Internet scale. Although our novel approach is by orders of magnitude computationally more efficient and superior in terms of detection accuracy, for the first time we show that no existing method – including our own – scales when applied in realistic settings. With our analysis, we explore neglected aspects of the attack and investigate the realistic probability of success for different strategies a real-world adversary may follow.

[1]  Bruce Schneier,et al.  Analysis of the SSL 3.0 protocol , 1996 .

[2]  H. Cheng,et al.  Traffic Analysis of SSL Encrypted Web Browsing , 1998 .

[3]  Hannes Federrath,et al.  Web MIXes: A System for Anonymous and Unobservable Internet Access , 2000, Workshop on Design Issues in Anonymity and Unobservability.

[4]  Andrew Hintz,et al.  Fingerprinting Websites Using Traffic Analysis , 2002, Privacy Enhancing Technologies.

[5]  Lili Qiu,et al.  Statistical identification of encrypted Web browsing traffic , 2002, Proceedings 2002 IEEE Symposium on Security and Privacy.

[6]  David D. Jensen,et al.  Privacy Vulnerabilities in Encrypted HTTP Streams , 2005, Privacy Enhancing Technologies.

[7]  Brian Neil Levine,et al.  Inferring the source of encrypted HTTP connections , 2006, CCS '06.

[8]  Michael I. Baron,et al.  Probability and Statistics for Computer Scientists , 2013 .

[9]  Charles V. Wright,et al.  Language Identification of Encrypted VoIP Traffic: Alejandra y Roberto or Alice and Bob? , 2007, USENIX Security Symposium.

[10]  Chih-Jen Lin,et al.  A Practical Guide to Support Vector Classication , 2008 .

[11]  Charles V. Wright,et al.  Spot Me if You Can: Uncovering Spoken Phrases in Encrypted VoIP Conversations , 2008, 2008 IEEE Symposium on Security and Privacy (sp 2008).

[12]  Charles V. Wright,et al.  Traffic Morphing: An Efficient Defense Against Statistical Traffic Analysis , 2009, NDSS.

[13]  Hannes Federrath,et al.  Website fingerprinting: attacking popular privacy enhancing technologies with the multinomial naïve-bayes classifier , 2009, CCSW '09.

[14]  Mun Choon Chan,et al.  Website Fingerprinting and Identification Using Ordered Feature Sequences , 2010, ESORICS.

[15]  Roger Dingledine,et al.  A Case Study on Measuring Statistical Data in the Tor Anonymity Network , 2010, Financial Cryptography Workshops.

[16]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.

[17]  Thomas Engel,et al.  Website fingerprinting in onion routing based anonymization networks , 2011, WPES.

[18]  Xiapu Luo,et al.  HTTPOS: Sealing Information Leaks with Browser-side Obfuscation of Encrypted Flows , 2011, NDSS.

[19]  Brijesh Joshi,et al.  Touching from a distance: website fingerprinting attacks and defenses , 2012, CCS.

[20]  Thomas Ristenpart,et al.  Peek-a-Boo, I Still See You: Why Efficient Traffic Analysis Countermeasures Fail , 2012, 2012 IEEE Symposium on Security and Privacy.

[21]  Nikita Borisov,et al.  Website Detection Using Remote Traffic Analysis , 2011, Privacy Enhancing Technologies.

[22]  Tao Wang,et al.  Improved website fingerprinting on Tor , 2013, WPES.

[23]  Tao Wang,et al.  A Systematic Approach to Developing and Evaluating Website Fingerprinting Defenses , 2014, CCS.

[24]  Tao Wang,et al.  Effective Attacks and Provable Defenses for Website Fingerprinting , 2014, USENIX Security Symposium.

[25]  Rachel Greenstadt,et al.  A Critical Evaluation of Website Fingerprinting Attacks , 2014, CCS.

[26]  Xiang Cai,et al.  Glove: A Bespoke Website Fingerprinting Defense , 2014, WPES.

[27]  Ian Goldberg,et al.  Walkie-Talkie: An Effective and Efficient Defense against Website Fingerprinting , 2015 .

[28]  Marc Dacier,et al.  Circuit Fingerprinting Attacks: Passive Deanonymization of Tor Hidden Services , 2015, USENIX Security Symposium.

[29]  Tao Wang,et al.  On Realistically Attacking Tor with Website Fingerprinting , 2016, Proc. Priv. Enhancing Technol..