Identification and Selection of Flow Features for Accurate Traffic Classification in SDN

Software-Defined Networking (SDN) aims to alleviate the limitations of traditional IP networks by decoupling the network tasks performed on each device into distinct planes. This approach offers several benefits, such as standard communication protocols, centralized network functions, and dedicated network elements such as controllers. Despite these benefits, SDN still lacks adequate support for traffic classification, because (i) some traffic profiles are very similar, which makes them difficult to tell apart (e.g., both HTTP and DNS flows are characterized by packet bursts), (ii) OpenFlow, the key SDN implementation today, offers only native flow features, such as packet and byte counts, that do not describe intrinsic traffic profiles, and (iii) there is no support for determining the optimal set of flow features to characterize different types of traffic profiles. In this paper, we introduce an architecture to collect, extend, and select flow features for traffic classification in OpenFlow-based networks. The main goal of our solution is to offer an extensive set of flow features that can be analyzed and refined, and to find the optimal subset of features for classifying different types of traffic flows. The experimental evaluation of our proposal shows that some features consistently emerge as meaningful, occupying the top positions in the classification of distinct flows across different experimental scenarios.
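The collect, extend, and select steps can be illustrated with a minimal Python sketch. It is not the architecture proposed in this paper: the field names (`packet_count`, `byte_count`, `duration_s`), the derived features, and the Fisher-score filter used for ranking are all assumptions chosen for illustration, and the sample flows are synthetic.

```python
from statistics import mean, pvariance


def extend_features(flow):
    """Derive extra features from native per-flow counters (hypothetical field names)."""
    duration = max(flow["duration_s"], 1e-9)  # guard against zero-length flows
    packets = max(flow["packet_count"], 1)    # guard against empty flows
    return {
        "packet_count": flow["packet_count"],
        "byte_count": flow["byte_count"],
        "mean_packet_size": flow["byte_count"] / packets,
        "packet_rate": flow["packet_count"] / duration,
        "byte_rate": flow["byte_count"] / duration,
    }


def fisher_score(values, labels):
    """Between-class scatter divided by within-class scatter for one feature."""
    overall = mean(values)
    between = 0.0
    within = 0.0
    for cls in sorted(set(labels)):
        group = [v for v, lab in zip(values, labels) if lab == cls]
        between += len(group) * (mean(group) - overall) ** 2
        within += len(group) * pvariance(group)
    # A feature with no spread inside any class separates the classes perfectly.
    return between / within if within > 0 else float("inf")


def rank_features(flows, labels):
    """Return feature names ordered from most to least discriminative."""
    rows = [extend_features(f) for f in flows]
    scores = {
        name: fisher_score([row[name] for row in rows], labels)
        for name in rows[0]
    }
    return sorted(scores, key=scores.get, reverse=True)


# Synthetic flows, for illustration only (not measurements from the paper).
flows = [
    {"packet_count": 4, "byte_count": 320, "duration_s": 0.2},
    {"packet_count": 6, "byte_count": 510, "duration_s": 0.3},
    {"packet_count": 40, "byte_count": 52000, "duration_s": 1.5},
    {"packet_count": 55, "byte_count": 70000, "duration_s": 2.0},
]
labels = ["dns", "dns", "http", "http"]

print(rank_features(flows, labels))
```

A filter-style score is used here only because it is simple and independent of any classifier; wrapper or embedded selection methods would fit the same collect, extend, and select pipeline.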
