Zero-Day Traffic Identification

Recent research on Internet traffic classification has achieved certain success in the application of machine learning techniques into flow statistics based method. However, existing methods fail to deal with zero-day traffic which are generated by previously unknown applications in a traffic classification system. To tackle this critical problem, we propose a novel traffic classification scheme which has the capability of identifying zero-day traffic as well as accurately classifying the traffic generated by pre-defined application classes. In addition, the proposed scheme provides a new mechanism to achieve fine-grained classification of zero-day traffic through manually labeling very few traffic flows. The preliminary empirical study on a big traffic data show that the proposed scheme can address the problem of zero-day traffic effectively. When zero-day traffic present, the classification performance of the proposed scheme is significantly better than three state-of-the-art methods, random forest classifier, classification with flow correlation, and semi-supervised traffic classification.

[1]  Philippe Owezarski,et al.  MINETRAC: Mining flows for unsupervised analysis & semi-supervised classification , 2011, 2011 23rd International Teletraffic Congress (ITC).

[2]  Isabelle Guyon,et al.  An Introduction to Variable and Feature Selection , 2003, J. Mach. Learn. Res..

[3]  Maurizio Dusi,et al.  Traffic classification through simple statistical fingerprinting , 2007, CCRV.

[4]  Nasser M. Nasrabadi,et al.  Pattern Recognition and Machine Learning , 2006, Technometrics.

[5]  Zhi-Li Zhang,et al.  A Modular Machine Learning System for Flow-Level Traffic Classification in Large Networks , 2012, TKDD.

[6]  Grenville J. Armitage,et al.  A survey of techniques for internet traffic classification using machine learning , 2008, IEEE Communications Surveys & Tutorials.

[7]  Anirban Mahanti,et al.  Traffic classification using clustering algorithms , 2006, MineNet '06.

[8]  Sebastian Zander,et al.  Timely and Continuous Machine-Learning-Based Classification for Interactive IP Traffic , 2012, IEEE/ACM Transactions on Networking.

[9]  Jun Zhang,et al.  A novel semi-supervised approach for network traffic clustering , 2011, 2011 5th International Conference on Network and System Security.

[10]  Sebastian Zander,et al.  Automated traffic classification and application identification using machine learning , 2005, The IEEE Conference on Local Computer Networks 30th Anniversary (LCN'05)l.

[11]  David J. C. MacKay,et al.  Information Theory, Inference, and Learning Algorithms , 2004, IEEE Transactions on Information Theory.

[12]  Marco Mellia,et al.  Revealing skype traffic: when randomness plays with you , 2007, SIGCOMM 2007.

[13]  Michalis Faloutsos,et al.  Internet traffic classification demystified: myths, caveats, and the best practices , 2008, CoNEXT '08.

[14]  Jun Zhang,et al.  Network Traffic Classification Using Correlation Information , 2013, IEEE Transactions on Parallel and Distributed Systems.

[15]  Renata Teixeira,et al.  Traffic classification on the fly , 2006, CCRV.

[16]  Xenofontas A. Dimitropoulos,et al.  Classifying internet one-way traffic , 2012, Internet Measurement Conference.

[17]  Andrew W. Moore,et al.  Bayesian Neural Networks for Internet Traffic Classification , 2007, IEEE Transactions on Neural Networks.

[18]  Carey L. Williamson,et al.  Offline/realtime traffic classification using semi-supervised learning , 2007, Perform. Evaluation.

[19]  Luca Salgarelli,et al.  Support Vector Machines for TCP traffic classification , 2009, Comput. Networks.

[20]  Renata Teixeira,et al.  Early Recognition of Encrypted Applications , 2007, PAM.

[21]  Stefan Savage,et al.  Unexpected means of protocol inference , 2006, IMC '06.