Empirical Analysis of Application-Level Traffic Classification Using Supervised Machine Learning

Accurate classification and identification of application traffic are important for network monitoring and analysis. The accuracy of traditional Internet application traffic classification approaches is rapidly decreasing due to the diversity of today's Internet application traffic, which features ephemeral port allocation, proprietary protocols, and traffic encryption. This paper presents an empirical evaluation of application-level traffic classification using supervised machine learning techniques. Our results indicate that a simple feature set cannot achieve high accuracy in application-level classification. Although a simple feature set performs well in application category-level classification, more sophisticated feature selection methods and other techniques are necessary to improve performance.
