Self-Learning Peer-to-Peer Traffic Classifier

The popularity of a new generation of smart peer-to-peer (p2p) applications has created several new challenges for accurately classifying network traffic. In this paper, we propose a novel two-stage p2p traffic classifier, called Self Learning Traffic Classifier (SLTC), that accurately identifies p2p traffic in high-speed networks. The first stage separates p2p traffic from the rest of the network traffic, and the second stage automatically extracts application payload signatures to identify the p2p application that generated each p2p flow. For the first stage, we propose a fast, lightweight algorithm called Time Correlation Metric (TCM) that exploits the temporal correlation of flows to clearly separate p2p traffic from the rest of the traffic. Using real network traces from tier-1 ISPs located on different continents, we show that the detection rate of TCM is consistently above 95% while always keeping the false-positive rate at 0%. For the second stage, we use the LASER signature extraction algorithm [17] to accurately identify signatures of several known and unknown p2p protocols with a very small false-positive rate (< 1%). Using our prototype on tier-1 ISP traces, we demonstrate that SLTC automatically learns signatures for more than 95% of both known and unknown traffic within 3 minutes.

[1] Daniel Stutzbach, et al. Understanding churn in peer-to-peer networks, 2006, IMC '06.

[2] Sebastian Zander, et al. A preliminary performance comparison of five machine learning algorithms for practical IP traffic flow classification, 2006, CCRV.

[3] Patrick Haffner, et al. ACAS: automated construction of application signatures, 2005, MineNet '05.

[4] Oliver Spatscheck, et al. Accurate, scalable in-network identification of p2p traffic using application signatures, 2004, WWW '04.

[5] Sebastian Zander, et al. Automated traffic classification and application identification using machine learning, 2005, The IEEE Conference on Local Computer Networks 30th Anniversary (LCN'05).

[6] Panayiotis Mavrommatis, et al. Identifying Known and Unknown Peer-to-Peer Traffic, 2006, Fifth IEEE International Symposium on Network Computing and Applications (NCA'06).

[7] Sebastian Zander, et al. Self-Learning IP Traffic Classification Based on Statistical Flow Characteristics, 2005, PAM.

[8] Anthony McGregor, et al. Flow Clustering Using Machine Learning Techniques, 2004, PAM.

[9] Konstantina Papagiannaki, et al. Toward the Accurate Identification of Network Applications, 2005, PAM.

[10] Michalis Faloutsos, et al. BLINC: multilevel traffic classification in the dark, 2005, SIGCOMM '05.

[11] Anirban Mahanti, et al. Traffic classification using clustering algorithms, 2006, MineNet '06.

[12] James Won-Ki Hong, et al. Towards automated application signature generation for traffic identification, 2008, NOMS 2008 - 2008 IEEE Network Operations and Management Symposium.

[13] Henning Schulzrinne, et al. An Analysis of the Skype Peer-to-Peer Internet Telephony Protocol, 2004, Proceedings IEEE INFOCOM 2006. 25TH IEEE International Conference on Computer Communications.

[14] Michalis Faloutsos, et al. Transport layer identification of P2P traffic, 2004, IMC '04.