Session level flow classification by packet size distribution and session grouping

Classifying traffic into specific network applications is essential for application-aware network management and it becomes more challenging because modern applications complicate their network behaviors. While port number-based classifiers work only for some well-known applications and signature-based classifiers are not applicable to encrypted packet payloads, researchers tend to classify network traffic based on behaviors observed in network applications. In this paper, a session level flow classification (SLFC) approach is proposed to classify network flows as a session, which comprises of flows in the same conversation. SLFC first classifies flows into the corresponding applications by packet size distribution (PSD) and then groups flows as sessions by port locality. With PSD, each flow is transformed into a set of points in a two-dimension space and the distances between each flow and the representatives of pre-selected applications are computed. The flow is recognized as the application having a minimum distance. Meanwhile, port locality is used to group flows as sessions because an application often uses consecutive port numbers within a session. If flows of a session are classified into different applications, an arbitration algorithm is invoked to make the correction. The evaluation shows that SLFC achieves high accuracy rates on both flow and session classifications, say 99.9% and 99.98%, respectively. When SLFC is applied to online classification, it is able to make decisions quickly by checking at most 300 packets for long-lasting flows. Based on our test data, an average of 72% of packets in long-lasting flows can be skipped without reducing the classification accuracy rates.

[1]  Renata Teixeira,et al.  Traffic classification on the fly , 2006, CCRV.

[2]  Sebastian Zander,et al.  Automated traffic classification and application identification using machine learning , 2005, The IEEE Conference on Local Computer Networks 30th Anniversary (LCN'05)l.

[3]  Martin Roesch,et al.  Snort - Lightweight Intrusion Detection for Networks , 1999 .

[4]  Andrew W. Moore,et al.  Internet traffic classification using bayesian analysis techniques , 2005, SIGMETRICS '05.

[5]  Yuan-Cheng Lai,et al.  Application classification using packet size distribution and port association , 2009, J. Netw. Comput. Appl..

[6]  Matthew Roughan,et al.  Class-of-service mapping for QoS: a statistical signature-based approach to IP traffic classification , 2004, IMC '04.

[7]  Timothy A. Gonsalves,et al.  Traffic Modeling and Classification Using Packet Train Length and Packet Train Size , 2006, IPOM.

[8]  Michalis Faloutsos,et al.  Transport layer identification of P2P traffic , 2004, IMC '04.

[9]  Chase Cotton,et al.  Packet-level traffic measurements from the Sprint IP backbone , 2003, IEEE Netw..

[10]  Vern Paxson,et al.  Bro: a system for detecting network intruders in real-time , 1998, Comput. Networks.

[11]  Michalis Faloutsos,et al.  Is P2P dying or just hiding? [P2P traffic measurement] , 2004, IEEE Global Telecommunications Conference, 2004. GLOBECOM '04..

[12]  Sally Floyd,et al.  Wide area traffic: the failure of Poisson modeling , 1995, TNET.

[13]  Vern Paxson,et al.  Semi-automated discovery of application session structure , 2006, IMC '06.

[14]  Vern Paxson,et al.  Empirically derived analytic models of wide-area TCP connections , 1994, TNET.

[15]  Anthony McGregor,et al.  Flow Clustering Using Machine Learning Techniques , 2004, PAM.

[16]  Michalis Faloutsos,et al.  BLINC: multilevel traffic classification in the dark , 2005, SIGCOMM '05.

[17]  Oliver Spatscheck,et al.  Accurate, scalable in-network identification of p2p traffic using application signatures , 2004, WWW '04.

[18]  Sebastian Zander,et al.  A preliminary performance comparison of five machine learning algorithms for practical IP traffic flow classification , 2006, CCRV.

[19]  Anja Feldmann,et al.  An analysis of Internet chat systems , 2003, IMC '03.