Optimizing Pattern Matching for Intrusion Detection

This paper presents an optimized version of the Aho-Corasick [1] algorithm. The design is a significant enhancement of the author’s original implementation, released in 2002 as part of an update to the Snort Intrusion Detection System. The enhanced design uses an optimized vector implementation of the Aho-Corasick state table that significantly improves performance. A memory-efficient variant uses sparse matrix storage to reduce memory requirements and to further improve performance on large pattern groups. Intrusion Detection Systems are highly specialized applications that require real-time pattern matching at very high network speeds and in hostile environments. Several major issues that must be considered in pattern matching for Intrusion Detection are discussed to establish a framework for the use of the Aho-Corasick algorithm as implemented in Snort. Performance results are presented comparing the original, optimized, and sparse storage versions of the author’s Aho-Corasick implementation. Tests were conducted using several dictionary tests and a Snort-based Intrusion Detection performance test. The impact of pattern group size and compiler selection on performance is also demonstrated using several popular compilers.

Index Terms: pattern matching, Aho-Corasick, Intrusion Detection, IDS, Snort
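
As an illustration of the two storage schemes named in the abstract, the following is a minimal Python sketch, not the Snort implementation. It builds a full Aho-Corasick state table with one 256-entry row per state, so that each input byte costs a single table lookup, and then derives a sparse form that keeps only the transitions that do not return to the start state. The function names (`build_dense_table`, `to_sparse_rows`, `search`, `search_sparse`) and the dictionary-based sparse layout are assumptions made for this example; the paper's actual storage formats may differ.

```python
from collections import deque

ALPHABET = 256  # one column per byte value


def build_dense_table(patterns):
    """Build a full Aho-Corasick DFA: one 256-entry row per state.

    Assumes every pattern is a non-empty bytes object.
    """
    goto = [[-1] * ALPHABET]  # trie transitions; -1 means "no edge yet"
    out = [[]]                # pattern indices that end at each state
    for idx, pat in enumerate(patterns):
        state = 0
        for byte in pat:
            if goto[state][byte] == -1:
                goto.append([-1] * ALPHABET)
                out.append([])
                goto[state][byte] = len(goto) - 1
            state = goto[state][byte]
        out[state].append(idx)

    # Breadth-first pass: fold the failure links into the table so that
    # every (state, byte) pair has a defined next state.
    fail = [0] * len(goto)
    queue = deque()
    for byte in range(ALPHABET):
        nxt = goto[0][byte]
        if nxt == -1:
            goto[0][byte] = 0
        else:
            queue.append(nxt)
    while queue:
        state = queue.popleft()
        out[state] = out[state] + out[fail[state]]
        for byte in range(ALPHABET):
            nxt = goto[state][byte]
            if nxt == -1:
                goto[state][byte] = goto[fail[state]][byte]
            else:
                fail[nxt] = goto[fail[state]][byte]
                queue.append(nxt)
    return goto, out


def to_sparse_rows(table):
    """Hypothetical sparse form: keep only entries that leave state 0."""
    return [{b: n for b, n in enumerate(row) if n != 0} for row in table]


def search(table, out, data):
    """Scan data with the dense table; return (end_offset, pattern_index)."""
    state, hits = 0, []
    for pos, byte in enumerate(data):
        state = table[state][byte]
        hits.extend((pos, idx) for idx in out[state])
    return hits


def search_sparse(rows, out, data):
    """Same scan using sparse rows; a missing entry means 'go to state 0'."""
    state, hits = 0, []
    for pos, byte in enumerate(data):
        state = rows[state].get(byte, 0)
        hits.extend((pos, idx) for idx in out[state])
    return hits


patterns = [b"he", b"she", b"his", b"hers"]
table, out = build_dense_table(patterns)
rows = to_sparse_rows(table)
dense_hits = search(table, out, b"ushers")
sparse_hits = search_sparse(rows, out, b"ushers")
```

The dense table trades memory for a constant-time lookup per byte, while the sparse rows store far fewer entries at the cost of a slightly more expensive lookup. Which form is faster in practice depends on pattern group size and cache behavior, which is the trade-off the abstract describes.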

[1] Alfred V. Aho, et al. Efficient string matching, 1975, Commun. ACM.

[2] Donald E. Knuth, et al. Fast Pattern Matching in Strings, 1977, SIAM J. Comput.

[3] Murray Hill, et al. Yacc: Yet Another Compiler-Compiler, 1978.

[4] Beate Commentz-Walter, et al. A String Matching Algorithm Fast on the Average, 1979, ICALP.

[5] Robert E. Tarjan, et al. Storing a sparse table, 1979, CACM.

[6] R. Nigel Horspool, et al. Practical fast searching in strings, 1980, Softw. Pract. Exp.

[7] Alfred V. Aho, et al. Compilers: Principles, Techniques, and Tools, 1986, Addison-Wesley series in computer science / World student series edition.

[8] I. Duff, et al. Direct Methods for Sparse Matrices, 1987.

[9] Andrew Binstock, et al. Practical algorithms for programmers, 1995.

[10] Vineet Bafna, et al. Pattern Matching Algorithms, 1997.

[11] Dan Gusfield, et al. Algorithms on Strings, Trees, and Sequences - Computer Science and Computational Biology, 1997.

[12] Udi Manber, et al. A Fast Algorithm for Multi-Pattern Searching, 1999.

[13] George Varghese, et al. Fast Content-Based Packet Handling for Intrusion Detection, 2001.

[14] C.J. Coit, et al. Towards faster string matching for intrusion detection or exceeding the speed of Snort, 2001, Proceedings DARPA Information Survivability Conference and Exposition II. DISCEX'01.

[15] Hideya Iwasaki. Famous books and papers of the 20th century: D. E. Knuth, J. H. Morris, V. R. Pratt: Fast pattern matching in strings, 2004.

[16] George Varghese, et al. Deterministic memory-efficient string matching algorithms for intrusion detection, 2004, IEEE INFOCOM 2004.