FlowCop: Detecting "Stranger" in Network Traffic Classification

As a cornerstone of future network research, network traffic classification plays an important role in network management, cyberspace security, and quality of service. Recently, many studies have applied machine learning techniques to traffic classification. Most of them focus only on classifying samples into predefined classes and ignore the "strangers". In this paper, we use the term stranger to denote traffic that does not belong to any predefined application, and we propose a novel scheme named FlowCop to detect strangers in network traffic classification. By constructing multiple one-class classifiers, FlowCop divides testing traffic into N predefined classes and one stranger class. Since samples of the stranger class are not required during the training stage, FlowCop detects strangers without any prior experience of them, much like cops searching a crowd for strangers. In addition, to achieve accurate classification at low overhead, a feature subspace algorithm is proposed to select a well-suited subset of features for each one-class classifier. To verify the effectiveness of FlowCop, we conduct comparative experiments on two real-world datasets. The results show that FlowCop not only identifies the predefined traffic flows but also detects the strangers. It outperforms four state-of-the-art approaches in both precision and recall.
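
The idea of combining N one-class classifiers with a stranger class can be illustrated with a minimal sketch. It is not FlowCop's actual classifier or its feature subspace algorithm, neither of which is specified here. It assumes a toy one-class model (a centroid with an acceptance radius), a hypothetical `slack` parameter, and made-up two-feature flows, and it only shows the decision rule: a flow accepted by no per-application model is labeled a stranger.

```python
import math
from statistics import mean

STRANGER = "stranger"


class CentroidOneClass:
    """Toy one-class model (an assumption, not the paper's classifier):
    accept a sample if it lies within a radius of the class centroid."""

    def fit(self, samples, slack=1.5):
        dims = range(len(samples[0]))
        self.centroid = [mean(s[d] for s in samples) for d in dims]
        # Radius: largest training distance, widened by a hypothetical slack factor.
        # The small floor avoids a zero radius when all samples coincide.
        self.radius = max(slack * max(self._dist(s) for s in samples), 1e-9)
        return self

    def _dist(self, x):
        return math.dist(x, self.centroid)

    def score(self, x):
        # Normalized distance: values <= 1.0 mean the sample is accepted.
        return self._dist(x) / self.radius


def train(flows_by_app):
    # One model per predefined application; no stranger samples are needed.
    return {app: CentroidOneClass().fit(s) for app, s in flows_by_app.items()}


def classify(models, x):
    scores = {app: m.score(x) for app, m in models.items()}
    best = min(scores, key=scores.get)
    # If even the closest model rejects the flow, it is a stranger.
    return best if scores[best] <= 1.0 else STRANGER


# Made-up flows with two illustrative features each.
training = {
    "web": [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1]],
    "voip": [[5.0, 5.0], [5.1, 4.8], [4.9, 5.2]],
}
models = train(training)
labels = [classify(models, x) for x in ([1.1, 1.0], [5.0, 4.9], [9.0, 0.5])]
print(labels)
```

Because each model is trained only on its own application's flows, adding a new predefined application means fitting one more model, and the stranger class requires no training data at all.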
