Anomaly detection based on feature extraction of unknown protocol payload format

Intrusion detection technology falls into two main categories: misuse detection and anomaly detection. Misuse detection cannot detect unknown attacks, so its false negative rate is high. Anomaly detection has a high false alarm rate and has seen few practical applications. Current mainstream intrusion detection systems (IDS) target common network environments and prioritize cost-effectiveness. To make up for these deficiencies, this paper proposes an attack detection method for special networks in which a large number of user-defined unknown protocols exist. Features are extracted from the unknown protocols to obtain the characteristics of their formats, and anomaly detection is performed based on these characteristics. The paper first presents the feature extraction technique for unknown protocols and explains mathematically how to obtain the format information of each layer of an unknown protocol. It then conducts simulation experiments and compares the detection results with those of ordinary misuse detection and anomaly detection.
