Enhancing Tor's performance using real-time traffic classification

Tor is a low-latency, anonymity-preserving network that enables its users to protect their privacy online. It consists of volunteer-operated routers around the world that serve hundreds of thousands of users every day. Because of congestion and a low relay-to-client ratio, Tor suffers from performance problems that can discourage wider adoption and thereby weaken anonymity for all users. We seek to improve Tor's performance by defining different classes of service for its traffic. Although the majority of Tor traffic is interactive web browsing, a relatively small amount of bulk downloading consumes a disproportionate share of Tor's scarce bandwidth. Furthermore, these traffic classes have different time and bandwidth constraints, so they should not receive the same Quality of Service (QoS), as they do in Tor today. We propose and evaluate DiffTor, a machine-learning-based approach that classifies Tor's encrypted circuits by application in real time and then assigns a distinct class of service to each application. Our experiments confirm that we can classify circuits we generated on the live Tor network with an accuracy exceeding 95%. We show that real-time classification combined with QoS can considerably improve the experience of Tor clients: our simple techniques yield a 75% improvement in responsiveness and an 86% reduction in download times at the median for interactive users.
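The classify-then-prioritize idea described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the two per-circuit features (`cells_per_sec`, `outbound_fraction`), the training values, the choice of a Gaussian naive Bayes classifier, and the `QOS_PRIORITY` mapping are all hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical per-circuit features (assumed, not from the paper):
#   cells_per_sec      mean rate of cells observed on the circuit
#   outbound_fraction  share of cells travelling toward the client
# The labels, sample values, and priority mapping are illustrative only.
QOS_PRIORITY = {"interactive": 0, "bulk": 1}  # lower value = served first

TRAINING = [
    ((5.0, 0.60), "interactive"),
    ((8.0, 0.70), "interactive"),
    ((6.0, 0.65), "interactive"),
    ((120.0, 0.95), "bulk"),
    ((150.0, 0.97), "bulk"),
    ((135.0, 0.96), "bulk"),
]


def train(samples):
    """Fit Gaussian naive Bayes: a prior, means, and variances per class."""
    by_class = defaultdict(list)
    for features, label in samples:
        by_class[label].append(features)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps the variance strictly positive.
        variances = [
            sum((x - m) ** 2 for x in col) / n + 1e-6
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(samples), means, variances)
    return model


def classify(model, features):
    """Return the label with the highest log-posterior for one circuit."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        for x, m, v in zip(features, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        return score
    return max(model, key=log_posterior)


def priority_for(model, features):
    """Map a circuit's predicted application class to a scheduling priority."""
    return QOS_PRIORITY[classify(model, features)]
```

In this sketch a relay would call `priority_for(model, features)` each time it refreshes a circuit's feature vector, then serve circuits with lower priority values first. A real deployment would need features that can be measured on encrypted circuits and a classifier validated on live traffic.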
