Breaking and Improving Protocol Obfuscation

Traffic classification techniques are used in a variety of application areas. In this technical report, we take a closer look at how statistical analysis can be used to identify network protocols. We show that even obfuscated application-layer protocols, such as BitTorrent's MSE protocol and Skype, can be identified by fingerprinting statistically measurable properties of TCP and UDP sessions. We also examine the properties that our protocol identification algorithm exploits to identify these obfuscated protocols, which are designed to be undetectable and are therefore considered very hard to classify. Many of the analyzed protocols are shown to have statistically measurable properties in their payload data, their flow behavior, or both. Based on this insight, we propose techniques that can improve future versions of obfuscated protocols by inhibiting identification through this type of statistical analysis. These techniques include better obfuscation of payload data and flow features, as well as hiding inside tunnels of well-known protocols. This report is intended to provide feedback and suggestions for improvement to the creators of obfuscated network protocols, and should thus help to sustain network neutrality on the Internet.
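As a rough illustration of what a "statistically measurable property in payload data" can look like, the following Python sketch computes the byte-value distribution of a payload and measures how far it lies from a uniform distribution using the Kullback-Leibler divergence. It is not the identification algorithm described in this report. The function names, the add-one smoothing, the uniform reference distribution, and the sample payloads are all assumptions made for the example.

```python
import math
import random
from collections import Counter


def byte_distribution(payload: bytes, smoothing: float = 1.0) -> list:
    """Return a smoothed probability for each of the 256 byte values.

    Add-one style smoothing (an assumption of this sketch) keeps every
    probability non-zero so the divergence below is always defined.
    """
    counts = Counter(payload)
    total = len(payload) + 256 * smoothing
    return [(counts.get(value, 0) + smoothing) / total for value in range(256)]


def kl_divergence(p: list, q: list) -> float:
    """Kullback-Leibler divergence D(p || q) in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q))


# Hypothetical sample payloads: a plain-text request and random bytes
# standing in for an encrypted or obfuscated stream.
plain_payload = b"GET /index.html HTTP/1.1\r\nHost: example.org\r\n\r\n" * 20
rng = random.Random(0)  # fixed seed so the example is repeatable
random_payload = bytes(rng.randrange(256) for _ in range(len(plain_payload)))

uniform = [1.0 / 256] * 256
d_plain = kl_divergence(byte_distribution(plain_payload), uniform)
d_random = kl_divergence(byte_distribution(random_payload), uniform)

print(f"plain-text payload vs. uniform: {d_plain:.3f} bits")
print(f"random payload vs. uniform:     {d_random:.3f} bits")
```

The plain-text payload diverges far more from the uniform distribution than the random one. A payload that looks uniformly random in its byte values gives this particular measurement little to work with, which is why statistics on flow behavior, such as packet sizes and directions, can still matter for obfuscated protocols.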
