A novel application classification attack against Tor

Tor is a well-known anonymous communication system that protects users' online privacy. It supports TCP applications and packs upper-layer application data into encrypted, equal-sized cells that are relayed through onion routing to hide users' private information. However, the current Tor design cannot conceal certain application behaviors. For example, P2P applications usually upload and download files simultaneously, and this behavioral feature is preserved in Tor traffic. Motivated by this observation, we investigate a new attack against Tor, the application classification attack, which recognizes application types from Tor traffic. An attacker first selects flow features, such as burst volumes and directions, to represent application behaviors and uses an efficient machine-learning algorithm (e.g., the Profile Hidden Markov Model) to model different types of applications. The attacker can then use these models to classify a target's Tor traffic and infer its application type. We have implemented the application classification attack on Tor using parallel computing, and our experiments validate the feasibility and effectiveness of the attack. We argue that the disclosure of application type information is a serious threat to Tor users' anonymity because it can be used to reduce the anonymity set and facilitate other attacks. We also present guidelines for defending against the application classification attack. Copyright © 2015 John Wiley & Sons, Ltd.
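
The burst features mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes each observed Tor cell has already been reduced to a direction label and groups consecutive same-direction cells into bursts. The names `OUT`, `IN`, `extract_bursts`, and `upload_share`, as well as the two toy traces, are hypothetical and chosen only for illustration; the paper's actual feature encoding and its Profile Hidden Markov Model training are not reproduced here.

```python
from itertools import groupby

# Hypothetical direction labels: cells sent by the client vs. received by it.
OUT, IN = 1, -1


def extract_bursts(directions):
    """Group consecutive same-direction cells into (direction, cell_count) bursts."""
    return [(d, len(list(run))) for d, run in groupby(directions)]


def upload_share(bursts):
    """Fraction of all cells that travel in the outgoing direction."""
    total = sum(count for _, count in bursts)
    outgoing = sum(count for d, count in bursts if d == OUT)
    return outgoing / total if total else 0.0


# Toy traces, made up for illustration (not measured data).
web_like = [OUT] + [IN] * 9                                   # one request, long download
p2p_like = [OUT, OUT, IN, IN, OUT, OUT, OUT, IN, IN, IN]      # interleaved upload/download

web_bursts = extract_bursts(web_like)
p2p_bursts = extract_bursts(p2p_like)
```

In this toy setting the download-dominated trace yields few, lopsided bursts, while the P2P-like trace alternates between sizeable bursts in both directions. A sequence of such bursts is the kind of observation that a sequence model could then be trained on, one model per application type.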
