Machine Learning-Based Classification of Encrypted Internet Traffic

Peer-to-peer (P2P) networking has introduced a major shift in the application and traffic mix of the Internet and established itself as the main driver of increasing traffic volume. The high resource requirements of some P2P applications cause operational problems: these applications consume vast amounts of network resources and can prevent mission-critical applications from accessing the network. The ability to identify them correctly can therefore be crucial for many network management and measurement tasks. In this paper, we investigate flow-based statistical features of Internet traffic for detecting P2P traffic. We propose a system that uses support vector machines to identify BitTorrent (BT) traffic, one of the most popular and problematic P2P applications. The system achieved an accuracy of 94.5% in recognizing encrypted traffic, which is a very promising result.
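
To make the idea of flow-based statistical features concrete, the following is a minimal sketch of how per-flow statistics might be computed from packet timestamps and sizes. The abstract does not list the features actually used, so the feature set, the function name `flow_features`, and the input layout are illustrative assumptions rather than the authors' method. Such features never inspect payload contents, which is why this kind of approach can still apply to encrypted traffic; the resulting vectors would then be passed to a classifier such as a support vector machine.

```python
from statistics import mean, pstdev


def flow_features(packets):
    """Compute simple statistics for one flow.

    `packets` is a list of (timestamp_seconds, size_bytes) tuples in
    arrival order. The chosen features are hypothetical examples, not
    the feature set used in the paper.
    """
    if not packets:
        raise ValueError("flow has no packets")
    times = [t for t, _ in packets]
    sizes = [s for _, s in packets]
    # Inter-arrival gaps between consecutive packets.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "packet_count": len(packets),
        "total_bytes": sum(sizes),
        "mean_size": mean(sizes),
        "std_size": pstdev(sizes),
        "duration": times[-1] - times[0],
        "mean_gap": mean(gaps) if gaps else 0.0,
    }


# A made-up three-packet flow, for illustration only.
example_flow = [(0.0, 100), (0.5, 300), (1.5, 200)]
features = flow_features(example_flow)
```

Only packet sizes and timing are used here, so the same function works whether or not the payload is encrypted.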
