Website fingerprinting in onion routing based anonymization networks

Low-latency anonymization networks such as Tor and JAP claim to hide the recipient and the content of communications from a local observer, i.e., an entity that can eavesdrop the traffic between the user and the first anonymization node. Especially users in totalitarian regimes strongly depend on such networks to freely communicate. For these people, anonymity is particularly important and an analysis of the anonymization methods against various attacks is necessary to ensure adequate protection. In this paper we show that anonymity in Tor and JAP is not as strong as expected so far and cannot resist website fingerprinting attacks under certain circumstances. We first define features for website fingerprinting solely based on volume, time, and direction of the traffic. As a result, the subsequent classification becomes much easier. We apply support vector machines with the introduced features. We are able to improve recognition results of existing works on a given state-of-the-art dataset in Tor from 3% to 55% and in JAP from 20% to 80%. The datasets assume a closed-world with 775 websites only. In a next step, we transfer our findings to a more complex and realistic open-world scenario, i.e., recognition of several websites in a set of thousands of random unknown websites. To the best of our knowledge, this work is the first successful attack in the open-world scenario. We achieve a surprisingly high true positive rate of up to 73% for a false positive rate of 0.05%. Finally, we show preliminary results of a proof-of-concept implementation that applies camouflage as a countermeasure to hamper the fingerprinting attack. For JAP, the detection rate decreases from 80% to 4% and for Tor it drops from 55% to about 3%.

[1]  Steven J. Murdoch,et al.  Sampled Traffic Analysis by Internet-Exchange-Level Adversaries , 2007, Privacy Enhancing Technologies.

[2]  Charles V. Wright,et al.  Language Identification of Encrypted VoIP Traffic: Alejandra y Roberto or Alice and Bob? , 2007, USENIX Security Symposium.

[3]  Bruce Schneier,et al.  Analysis of the SSL 3.0 protocol , 1996 .

[4]  George Danezis,et al.  Low-cost traffic analysis of Tor , 2005, 2005 IEEE Symposium on Security and Privacy (S&P'05).

[5]  Andrew Hintz,et al.  Fingerprinting Websites Using Traffic Analysis , 2002, Privacy Enhancing Technologies.

[6]  Hannes Federrath,et al.  Website fingerprinting: attacking popular privacy enhancing technologies with the multinomial naïve-bayes classifier , 2009, CCSW '09.

[7]  Eric C. Price,et al.  Browser-Based Attacks on Tor , 2007, Privacy Enhancing Technologies.

[8]  David D. Jensen,et al.  Privacy Vulnerabilities in Encrypted HTTP Streams , 2005, Privacy Enhancing Technologies.

[9]  Charles V. Wright,et al.  Traffic Morphing: An Efficient Defense Against Statistical Traffic Analysis , 2009, NDSS.

[10]  Paul F. Syverson,et al.  Locating hidden servers , 2006, 2006 IEEE Symposium on Security and Privacy (S&P'06).

[11]  Hannes Federrath,et al.  Web MIXes: A System for Anonymous and Unobservable Internet Access , 2000, Workshop on Design Issues in Anonymity and Unobservability.

[12]  George Danezis,et al.  The Traffic Analysis of Continuous-Time Mixes , 2004, Privacy Enhancing Technologies.

[13]  Lili Qiu,et al.  Statistical identification of encrypted Web browsing traffic , 2002, Proceedings 2002 IEEE Symposium on Security and Privacy.

[14]  Roy T. Fielding,et al.  Hypertext Transfer Protocol - HTTP/1.0 , 1996, RFC.

[15]  Brian Neil Levine,et al.  Inferring the source of encrypted HTTP connections , 2006, CCS '06.

[16]  Dirk Grunwald,et al.  Shining Light in Dark Places: Understanding the Tor Network , 2008, Privacy Enhancing Technologies.

[17]  Nick Mathewson,et al.  Tor: The Second-Generation Onion Router , 2004, USENIX Security Symposium.

[18]  Xun Gong,et al.  Fingerprinting websites using remote traffic analysis , 2010, CCS '10.

[19]  Charles V. Wright,et al.  Spot Me if You Can: Uncovering Spoken Phrases in Encrypted VoIP Conversations , 2008, 2008 IEEE Symposium on Security and Privacy (sp 2008).

[20]  Pat Langley,et al.  Estimating Continuous Distributions in Bayesian Classifiers , 1995, UAI.

[21]  Nello Cristianini,et al.  An Introduction to Support Vector Machines and Other Kernel-based Learning Methods , 2000 .

[22]  H. Cheng,et al.  Traffic Analysis of SSL Encrypted Web Browsing , 1998 .