Better network traffic identification through the independent combination of techniques

Traffic identification is an important challenge for network management and dimensioning. In recent years, new algorithms and novel uses of known techniques have been proposed, yet the results so far are limited in scope and frequently disappointing. Furthermore, existing results cannot be compared directly, since networks and traffic profiles differ significantly among the collected traces. When analyzed across different networks, data granularities, and baselines, most algorithms perform well in only one or two scenarios, and no algorithm has proven better than the others in the majority of scenarios. Summarizing four years of research in traffic identification, this work shows that the identification ability of each algorithm varies with the situation, and it proposes a new methodology that identifies traffic by combining any set of identification algorithms. Four different combination mechanisms (and many variations) are validated against four different network scenarios commonly used in the literature. Combination shows promising results, mainly because it proved robust against the bias towards particular scenarios that affects previous identification algorithms.
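As a rough illustration of what combining identification algorithms can look like, the sketch below applies plain weighted voting to the labels that several classifiers assign to one flow. The abstract does not describe the four mechanisms it evaluates, so this is a generic technique chosen for illustration and not necessarily one of them; the function name, the algorithm names ("port", "payload", "ml"), and the weights are all hypothetical.

```python
from collections import Counter


def combine_by_vote(predictions, weights=None):
    """Combine per-algorithm labels for one flow by weighted voting.

    predictions: dict mapping a (hypothetical) algorithm name to the label
                 it assigned, or None when the algorithm could not decide.
    weights:     optional dict mapping algorithm name to a vote weight;
                 algorithms missing from it count with weight 1.0.
    Returns the winning label, or None if no algorithm produced a label.
    """
    weights = weights or {}
    tally = Counter()
    for name, label in predictions.items():
        if label is None:
            continue  # an undecided algorithm casts no vote
        tally[label] += weights.get(name, 1.0)
    if not tally:
        return None
    # Highest total weight wins; ties are broken alphabetically so the
    # result is deterministic.
    return min(tally.items(), key=lambda kv: (-kv[1], kv[0]))[0]


if __name__ == "__main__":
    flow = {"port": "http", "payload": "p2p", "ml": "p2p"}
    print(combine_by_vote(flow))
    print(combine_by_vote(flow, weights={"port": 3.0}))
```

With equal weights the two algorithms that agree outvote the third; raising the weight of one algorithm lets it override the others, which is one simple way a per-scenario bias could be tuned or dampened.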
