A statistical approach to IP-level classification of network traffic

Correct classification of traffic flows according to the application-layer protocols that generated them is essential for most network management, resource allocation, and intrusion detection systems in TCP/IP networks. With the ever-increasing number of network protocols and services running on non-standard TCP ports, classification methods based on analysis of the transport-layer header are rapidly becoming ineffective. On the other hand, mechanisms based on full payload analysis are too computationally demanding to run on most high-bandwidth links. Here we present a novel classification technique based on statistical analysis of network traffic performed at the IP level. The key idea behind our approach is to build a set of protocol fingerprints that we believe summarize, in a compact and efficient way, the main IP-level statistical properties of application-layer protocols. By means of a simple, lightweight algorithm based on the notion of anomaly scores, also presented in this paper, an unknown flow can be compared against the known protocol fingerprints to identify the application that generated it. Our methodology relies entirely on IP-level analysis: neither payload analysis nor port analysis is required to classify an unknown flow. Besides introducing our approach, we describe preliminary experimental results indicating that the technique is effective at correctly classifying network traffic in a real network environment.
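The fingerprint-and-anomaly-score idea can be illustrated with a minimal sketch. The abstract does not specify which IP-level features the fingerprints use or how the anomaly score is computed, so everything below is an assumption made for illustration: the fingerprint is taken to be a per-position histogram of packet sizes, the anomaly score is the mean negative log-probability of a flow under that histogram, and the names `build_fingerprint`, `anomaly_score`, `classify`, `SIZE_BIN`, and `EPS`, as well as the toy flows, are hypothetical.

```python
import math
from collections import Counter

SIZE_BIN = 100   # assumed histogram bin width in bytes (hypothetical)
EPS = 1e-6       # assumed floor probability for values never seen in training


def build_fingerprint(flows):
    """Build one protocol fingerprint from training flows.

    Each flow is a list of IP packet sizes in bytes, in arrival order.
    The result maps packet position -> {size bin: empirical probability}.
    """
    counts = {}
    for flow in flows:
        for position, size in enumerate(flow):
            counts.setdefault(position, Counter())[size // SIZE_BIN] += 1
    fingerprint = {}
    for position, counter in counts.items():
        total = sum(counter.values())
        fingerprint[position] = {b: n / total for b, n in counter.items()}
    return fingerprint


def anomaly_score(flow, fingerprint):
    """Mean negative log-probability of the flow; lower means a closer match."""
    if not flow:
        raise ValueError("flow must contain at least one packet")
    total = 0.0
    for position, size in enumerate(flow):
        p = fingerprint.get(position, {}).get(size // SIZE_BIN, 0.0)
        total += -math.log(max(p, EPS))
    return total / len(flow)


def classify(flow, fingerprints):
    """Return the protocol with the lowest anomaly score, plus all scores."""
    scores = {name: anomaly_score(flow, fp) for name, fp in fingerprints.items()}
    return min(scores, key=scores.get), scores


# Toy training data (invented for illustration, not measured traffic).
training = {
    "http": [[300, 1400, 1400, 1400], [350, 1450, 1400, 1380]],
    "ssh": [[60, 80, 60, 90], [70, 60, 80, 60]],
}
fingerprints = {name: build_fingerprint(flows) for name, flows in training.items()}

label, scores = classify([320, 1420, 1390, 1410], fingerprints)
print(label, scores)
```

Note that the sketch never reads ports or payload bytes; only per-packet sizes observable at the IP level enter the score. A practical classifier would presumably also need a threshold on the score so that flows matching no known fingerprint can be rejected as unknown, which this sketch omits.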
