Author ' s personal copy A novel self-learning architecture for p 2 p traffic classification in high speed networks

The popularity of a new generation of smart peer-to-peer applications has resulted in several new challenges for accurately classifying network traffic. In this paper, we propose a novel two-stage p2p traffic classifier, called Self-Learning Traffic Classifier (SLTC), that can accurately identify p2p traffic in high speed networks. The first stage classifies p2p traffic from the rest of the network traffic, and the second stage automatically extracts application payload signatures to accurately identify the p2p application that generated the p2p flow. For the first stage, we propose a fast, light-weight algorithm called Time Correlation Metric (TCM), that exploits the temporal correlation of flows to clearly separate peer-to-peer (p2p) traffic from the rest of the traffic. Using real network traces from tier-1 ISPs that are located in different continents, we show that the detection rate of TCM is consistently above 95% while always keeping the false positives at 0%. For the second stage, we use the LASER signature extraction algorithm [20] to accurately identify signatures of several known and unknown p2p protocols with very small false positive rate (<1%). Using our prototype on tier-1 ISP traces, we demonstrate that SLTC automatically learns signatures for more than 95% of both known and unknown traffic within 3 min. 2009 Elsevier B.V. All rights reserved.

[1]  Sebastian Zander,et al.  Self-Learning IP Traffic Classification Based on Statistical Flow Characteristics , 2005, PAM.

[2]  Sebastian Zander,et al.  Automated traffic classification and application identification using machine learning , 2005, The IEEE Conference on Local Computer Networks 30th Anniversary (LCN'05)l.

[3]  Oliver Spatscheck,et al.  Accurate, scalable in-network identification of p2p traffic using application signatures , 2004, WWW '04.

[4]  Michalis Faloutsos,et al.  BLINC: multilevel traffic classification in the dark , 2005, SIGCOMM '05.

[5]  T. Mexia,et al.  Author ' s personal copy , 2009 .

[6]  Konstantina Papagiannaki,et al.  Toward the Accurate Identification of Network Applications , 2005, PAM.

[7]  Chen-Nee Chuah,et al.  Self-Learning Peer-to-Peer Traffic Classifier , 2009, 2009 Proceedings of 18th International Conference on Computer Communications and Networks.

[8]  Michalis Faloutsos,et al.  Transport layer identification of P2P traffic , 2004, IMC '04.

[9]  Panayiotis Mavrommatis,et al.  Identifying Known and Unknown Peer-to-Peer Traffic , 2006, Fifth IEEE International Symposium on Network Computing and Applications (NCA'06).

[10]  Patrick Haffner,et al.  ACAS: automated construction of application signatures , 2005, MineNet '05.

[11]  Henning Schulzrinne,et al.  An Analysis of the Skype Peer-to-Peer Internet Telephony Protocol , 2004, Proceedings IEEE INFOCOM 2006. 25TH IEEE International Conference on Computer Communications.

[12]  Daniel Stutzbach,et al.  Understanding churn in peer-to-peer networks , 2006, IMC '06.

[13]  Anirban Mahanti,et al.  Traffic classification using clustering algorithms , 2006, MineNet '06.

[14]  James Won-Ki Hong,et al.  Towards automated application signature generation for traffic identification , 2008, NOMS 2008 - 2008 IEEE Network Operations and Management Symposium.

[15]  Sebastian Zander,et al.  A preliminary performance comparison of five machine learning algorithms for practical IP traffic flow classification , 2006, CCRV.

[16]  Anthony McGregor,et al.  Flow Clustering Using Machine Learning Techniques , 2004, PAM.