KISS: Stochastic Packet Inspection Classifier for UDP Traffic

This paper proposes KISS, a novel Internet classification engine. Motivated by the expected raise of UDP traffic, which stems from the momentum of Peer-to-Peer (P2P) streaming applications, we propose a novel classification framework that leverages on statistical characterization of payload. Statistical signatures are derived by the means of a Chi-Square (χ2)-like test, which extracts the protocol “format,” but ignores the protocol “semantic” and “synchronization” rules. The signatures feed a decision process based either on the geometric distance among samples, or on Support Vector Machines. KISS is very accurate, and its signatures are intrinsically robust to packet sampling, reordering, and flow asymmetry, so that it can be used on almost any network. KISS is tested in different scenarios, considering traditional client-server protocols, VoIP, and both traditional and new P2P Internet applications. Results are astonishing. The average True Positive percentage is 99.6%, with the worst case equal to 98.1,% while results are almost perfect when dealing with new P2P streaming applications.

[1]  Grenville J. Armitage,et al.  A survey of techniques for internet traffic classification using machine learning , 2008, IEEE Communications Surveys & Tutorials.

[2]  Oliver Spatscheck,et al.  Accurate, scalable in-network identification of p2p traffic using application signatures , 2004, WWW '04.

[3]  Michalis Faloutsos,et al.  Internet traffic classification demystified: myths, caveats, and the best practices , 2008, CoNEXT '08.

[4]  Dario Rossi,et al.  Understanding VoIP from Backbone Measurements , 2007, IEEE INFOCOM 2007 - 26th IEEE International Conference on Computer Communications.

[5]  Danny Bickson,et al.  The eMule Protocol Specification , 2005 .

[6]  Dario Rossi,et al.  KISS: Stochastic Packet Inspection , 2009, TMA.

[7]  Zhi-Li Zhang,et al.  Profiling internet backbone traffic: behavior models and applications , 2005, SIGCOMM '05.

[8]  Vern Paxson,et al.  Empirically derived analytic models of wide-area TCP connections , 1994, TNET.

[9]  Andrew W. Moore,et al.  Internet traffic classification using bayesian analysis techniques , 2005, SIGMETRICS '05.

[10]  Dario Rossi,et al.  Revealing skype traffic: when randomness plays with you , 2007, SIGCOMM '07.

[11]  Tong Zhang,et al.  An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods , 2001, AI Mag..

[12]  Patrick Haffner,et al.  ACAS: automated construction of application signatures , 2005, MineNet '05.

[13]  Matthew Roughan,et al.  Class-of-service mapping for QoS: a statistical signature-based approach to IP traffic classification , 2004, IMC '04.

[14]  Michalis Faloutsos,et al.  Is P2P dying or just hiding? [P2P traffic measurement] , 2004, IEEE Global Telecommunications Conference, 2004. GLOBECOM '04..

[15]  Renata Teixeira,et al.  Early application identification , 2006, CoNEXT '06.

[16]  Michalis Faloutsos,et al.  BLINC: multilevel traffic classification in the dark , 2005, SIGCOMM '05.

[17]  Nello Cristianini,et al.  An Introduction to Support Vector Machines and Other Kernel-based Learning Methods , 2000 .

[18]  Stefan Savage,et al.  Unexpected means of protocol inference , 2006, IMC '06.

[19]  Maurizio Dusi,et al.  Traffic classification through simple statistical fingerprinting , 2007, CCRV.

[20]  Michalis Faloutsos,et al.  Transport layer identification of P2P traffic , 2004, IMC '04.

[21]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.

[22]  Anthony McGregor,et al.  Flow Clustering Using Machine Learning Techniques , 2004, PAM.

[23]  Marco Mellia,et al.  Measuring IP and TCP behavior on edge nodes with Tstat , 2005, Comput. Networks.

[24]  Dario Rossi,et al.  Building a cooperative P2P-TV application over a wise network: the approach of the European FP-7 strep NAPA-WINE , 2008, IEEE Communications Magazine.

[25]  Konstantina Papagiannaki,et al.  Toward the Accurate Identification of Network Applications , 2005, PAM.

[26]  Anja Feldmann,et al.  An analysis of Internet chat systems , 2003, IMC '03.

[27]  Carey L. Williamson,et al.  Offline/realtime traffic classification using semi-supervised learning , 2007, Perform. Evaluation.