A Machine Learning Approach for Efficient Traffic Classification

Traffic classification is fundamental to tracking the evolution of network applications and modelling their behaviour. Classified traffic is also needed to understand how the Internet is being used and to control effectively the service that traffic receives. In this paper we present a machine-learning approach that accurately classifies live traffic using a C4.5 decision tree. By collecting 12 features at the start of each flow, without inspecting the packet payload, our method identifies live traffic from different types of applications with 99.8% total accuracy. Accuracy is not our only concern: we also treat classification latency and throughput as highly important.
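To make the decision-tree step concrete, the following is a minimal sketch of the gain-ratio criterion that C4.5 is generally described as using to choose a split on a numeric feature. It is a simplified illustration, not the classifier used in the paper: the feature (first-packet size in bytes), the sample values, and the function names are all assumptions, and a full C4.5 implementation adds pruning and other refinements.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no information
    n = len(labels)
    p_left, p_right = len(left) / n, len(right) / n
    info_gain = entropy(labels) - p_left * entropy(left) - p_right * entropy(right)
    split_info = -(p_left * math.log2(p_left) + p_right * math.log2(p_right))
    return info_gain / split_info


def best_threshold(values, labels):
    """Candidate threshold with the highest gain ratio (None if no split exists)."""
    candidates = sorted(set(values))[:-1]  # the largest value cannot split the data
    return max(candidates, key=lambda t: gain_ratio(values, labels, t), default=None)


# Hypothetical per-flow feature: size of the first packet, in bytes.
first_packet_size = [60, 64, 1400, 1500]
application = ["dns", "dns", "http", "http"]
print(best_threshold(first_packet_size, application))
```

A tree learner would repeat this search over every candidate feature at each node and recurse on the resulting partitions. Because classifying a flow then needs only a handful of threshold comparisons, this kind of model fits the latency and throughput concerns raised above.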
