Inferring Application Type Information from Tor Encrypted Traffic

Tor is a widely used anonymous communication system for preserving users' online privacy. It supports TCP applications and packs application data into encrypted, equal-sized cells to hide private user information, such as the type of application being run (Web, P2P, FTP, or others). Knowledge of the application type is harmful because it can be used to reduce the anonymity set and to facilitate other attacks. Unfortunately, the current Tor design cannot conceal certain application behaviors. For example, P2P applications usually upload and download files simultaneously, and this behavioral feature is preserved in Tor traffic. Motivated by this observation, we investigate a new attack against Tor, the traffic classification attack, which can recognize application types from Tor traffic. An attacker first carefully selects flow features, e.g., burst volumes and directions, to represent application behaviors and uses an efficient machine learning algorithm to model the different types of applications. The established models can then be used to classify a target's Tor traffic and infer its application type. We have implemented the traffic classification attack on Tor, and our experiments validate its feasibility and effectiveness.
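
The pipeline outlined in the abstract (extract burst features, model each application type, classify a target flow) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the flow representation (a list of cell directions, +1 for upstream and -1 for downstream), the three-element feature vector, the toy training flows, and the nearest-centroid classifier, which is only a simple stand-in for the unspecified machine learning algorithm and is not the method used in the paper.

```python
from itertools import groupby
from math import dist

def bursts(directions):
    """Collapse cell directions (+1 upstream, -1 downstream) into signed
    burst volumes: each run of same-direction cells becomes one number."""
    return [d * sum(1 for _ in run) for d, run in groupby(directions)]

def features(directions):
    """Hypothetical feature vector: (upstream share of cells,
    mean upstream burst volume, mean downstream burst volume)."""
    b = bursts(directions)
    up = [v for v in b if v > 0]
    down = [-v for v in b if v < 0]
    total = sum(up) + sum(down)
    return (
        sum(up) / total if total else 0.0,
        sum(up) / len(up) if up else 0.0,
        sum(down) / len(down) if down else 0.0,
    )

def train(labeled_flows):
    """Build one centroid per application type from (label, directions) pairs."""
    grouped = {}
    for label, directions in labeled_flows:
        grouped.setdefault(label, []).append(features(directions))
    return {
        label: tuple(sum(col) / len(col) for col in zip(*vectors))
        for label, vectors in grouped.items()
    }

def classify(model, directions):
    """Return the label whose centroid is closest to the flow's features."""
    f = features(directions)
    return min(model, key=lambda label: dist(model[label], f))

# Synthetic toy flows, not measured data: Web-like flows send a short request
# and receive a long response; P2P-like flows move data in both directions.
TOY_TRAINING = [
    ("Web", [1] * 2 + [-1] * 20),
    ("Web", [1] * 3 + [-1] * 30),
    ("P2P", ([1] * 10 + [-1] * 10) * 2),
    ("P2P", ([1] * 8 + [-1] * 12) * 2),
]

model = train(TOY_TRAINING)
print(classify(model, [1] * 2 + [-1] * 25))        # mostly downstream flow
print(classify(model, ([1] * 9 + [-1] * 9) * 3))   # balanced two-way flow
```

The sketch relies on the observation made in the abstract: because cells are equal-sized, payload lengths reveal nothing, but the volume and direction of bursts still differ between application types, so even a crude distance-based model separates the two toy classes.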
