A Near Real-time IP Traffic Classification Using Machine Learning

With the drastic increase in internet traffic over the last few years, driven by the growing number of internet users, IP traffic classification has gained significant importance for the research community, for internet service providers seeking to optimize their network performance, and for governmental intelligence organizations. Traditional IP traffic classification techniques, such as port-number-based and payload-based direct packet inspection, are now rarely used: applications use dynamic port numbers instead of well-known port numbers in packet headers, and various cryptographic techniques inhibit inspection of the packet payload. The current trend is to use machine learning (ML) techniques for IP traffic classification. In this research paper, a real-time internet traffic dataset has been developed using a packet capturing tool with a packet capture duration of 2 seconds, and further datasets have been derived from it by reducing the number of features with Correlation-based and Consistency-based Feature Selection (FS) algorithms. Five ML algorithms, MLP, RBF, C4.5, Bayes Net and Naïve Bayes, are then employed for IP traffic classification with these datasets. The experimental analysis shows that Bayes Net is an effective ML technique for near real-time and online IP traffic classification when the packet capture duration is reduced and the number of features characterizing each application sample is reduced with the Correlation-based FS algorithm.
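As a rough illustration of the correlation-based feature selection step mentioned above, the following sketch scores feature subsets with the CFS merit heuristic, merit = k * r_cf / sqrt(k + k(k-1) * r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation. It is not the paper's implementation: it assumes Pearson correlation and a greedy forward search as simplifications, and the feature names and toy values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def cfs_merit(features, labels, subset):
    """CFS merit of a subset: high class correlation, low redundancy between features."""
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(abs(pearson(features[f], labels)) for f in subset) / k
    if k == 1:
        r_ff = 0.0
    else:
        pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
        r_ff = sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def forward_select(features, labels):
    """Greedy forward search (an assumption of this sketch): add the feature
    that most improves the merit, stop when no feature improves it."""
    selected, best = [], 0.0
    remaining = set(features)
    while remaining:
        cand, cand_merit = None, best
        for f in sorted(remaining):
            m = cfs_merit(features, labels, selected + [f])
            if m > cand_merit + 1e-12:
                cand, cand_merit = f, m
        if cand is None:
            break
        selected.append(cand)
        remaining.remove(cand)
        best = cand_merit
    return selected, best


# Hypothetical toy data: six flow samples, two application classes (0 and 1).
labels = [0, 0, 0, 1, 1, 1]
features = {
    "mean_len": [1, 2, 1, 5, 6, 5],       # informative
    "mean_len_copy": [1, 2, 1, 5, 6, 5],  # redundant duplicate
    "noise": [3, 1, 2, 3, 1, 2],          # uncorrelated with the class
}

selected, merit = forward_select(features, labels)
print(selected, round(merit, 4))
```

On this toy data the duplicate and the uncorrelated feature do not raise the merit, so the search keeps only one informative feature. This mirrors the idea of shrinking the number of features per application sample before training a classifier.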
