Network traffic signature generation mechanism using principal component analysis

Deep Packet Inspection (DPI) is a popular method for accurately identifying traffic flows and the applications that generate them, and it is widely used in network management systems. Its major limitation is that the signature library is mainly built by hand, which makes it hard to obtain signatures for new applications efficiently. In this paper, we propose a mechanism based on Principal Component Analysis (PCA) that extracts signatures automatically. In the proposed method, signatures are expressed as continuous, consistent sequences constructed from principal components, rather than as the separated substrings that traditional methods extract from the original data. Extensive experiments on numerous data sets were carried out to evaluate the proposed scheme, and the results show that it achieves good performance in terms of accuracy and efficiency.
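As a rough illustration of the PCA step only, the sketch below treats the first bytes of each payload as a numeric vector and computes the principal directions of a small sample set. The fixed-width byte encoding, the helper names `payloads_to_matrix` and `principal_components`, and the sample payloads are all assumptions made for this example; the abstract does not specify how payloads are encoded or how signatures are assembled from the components.

```python
import numpy as np

def payloads_to_matrix(payloads, width):
    """Encode each payload as a fixed-width row of byte values (zero-padded).

    Assumed encoding for illustration; the paper's encoding is not given here.
    """
    rows = np.zeros((len(payloads), width), dtype=float)
    for i, payload in enumerate(payloads):
        head = payload[:width]
        rows[i, :len(head)] = list(head)
    return rows

def principal_components(matrix, k):
    """Return the top-k principal directions and their explained-variance ratios."""
    centered = matrix - matrix.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal directions.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variance = s ** 2
    total = variance.sum()
    ratios = variance / total if total > 0 else np.zeros_like(variance)
    return vt[:k], ratios[:k]

# Hypothetical sample payloads from one application.
payloads = [
    b"GET /index.html HTTP/1.1",
    b"GET /a.png HTTP/1.1",
    b"GET /video/1 HTTP/1.1",
    b"POST /login HTTP/1.1",
]
X = payloads_to_matrix(payloads, width=16)
components, ratios = principal_components(X, k=2)
```

Each row of `components` weights the byte positions by how much they vary together across the samples. Turning such directions into matchable signatures is the part specific to the proposed method and is not attempted here.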
