Traffic Classification Based on Zero-Length Packets

Network traffic classification is fundamental to network management and performance. However, traditional traffic-classification approaches, which were designed to run on dedicated hardware at very high line rates, may not function well in a virtualized, software-based environment. In this paper, we devise a novel fingerprinting technique that can be used as a software-based solution and enables machine-learning-based classification of ongoing flows. The suggested scheme is very simple to implement and requires minimal resources, yet attains very high accuracy. Specifically, for TCP flows, we suggest a fingerprint based on zero-length packets, which enables a highly efficient sampling strategy that can be implemented with a single content-addressable memory rule. The suggested fingerprinting scheme is robust to network conditions such as congestion, fragmentation, delay, retransmissions, duplications, and losses, as well as to varying processing capabilities. Hence, its performance is essentially independent of placement and migration issues, which makes it an attractive solution for virtualized software-based environments. We suggest an analogous fingerprinting scheme for User Datagram Protocol (UDP) traffic, which offers the same advantages as the TCP scheme and also attains very high accuracy. Results show that our scheme correctly classified about 97% of the flows in the dataset tested, even on encrypted data.
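
To make the notion of a zero-length TCP packet concrete, the following minimal sketch shows one way to recognize such a packet from raw header bytes. It is only an illustration under stated assumptions, not the scheme from the paper: it assumes an unfragmented IPv4 packet carrying TCP, and the names `tcp_payload_length`, `is_zero_length`, and `build_ipv4_tcp` are hypothetical helpers introduced here. The abstract does not say which features the fingerprint derives from these packets, so none are computed.

```python
import struct


def tcp_payload_length(ip_packet: bytes) -> int:
    """Return the TCP payload length of a raw, unfragmented IPv4 packet."""
    version_ihl = ip_packet[0]
    if version_ihl >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    # IHL is the low nibble, counted in 32-bit words.
    ip_header_len = (version_ihl & 0x0F) * 4
    # Total Length covers the IP header plus everything after it.
    total_length = struct.unpack("!H", ip_packet[2:4])[0]
    if ip_packet[9] != 6:  # IP protocol number 6 is TCP
        raise ValueError("not a TCP segment")
    # TCP Data Offset is the high nibble of byte 12 of the TCP header.
    tcp_header_len = (ip_packet[ip_header_len + 12] >> 4) * 4
    return total_length - ip_header_len - tcp_header_len


def is_zero_length(ip_packet: bytes) -> bool:
    """True when the TCP segment carries no payload (e.g. a pure ACK)."""
    return tcp_payload_length(ip_packet) == 0


def build_ipv4_tcp(payload: bytes = b"") -> bytes:
    """Hypothetical helper: build a minimal IPv4/TCP packet for the demo.

    Checksums and addresses are left as zeros; they are not inspected above.
    """
    total = 20 + 20 + len(payload)
    ip_header = struct.pack(
        "!BBHHHBBH4s4s", 0x45, 0, total, 0, 0, 64, 6, 0, bytes(4), bytes(4)
    )
    # Data offset 5 (20-byte header), flags 0x10 (ACK only).
    tcp_header = struct.pack("!HHIIBBHHH", 1234, 80, 0, 0, 5 << 4, 0x10, 65535, 0, 0)
    return ip_header + tcp_header + payload


if __name__ == "__main__":
    print(is_zero_length(build_ipv4_tcp()))          # pure ACK, no payload
    print(is_zero_length(build_ipv4_tcp(b"hello")))  # segment with data
```

The check relies only on header fields (IP total length, IP header length, TCP data offset), so it needs no access to the payload itself.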
