A Multi-dimensional Progressive Perfect Hashing for High-Speed String Matching

Aho-Corasick (AC) automaton is widely used for multi-string matching in today's Network Intrusion Detection System (NIDS). With fast-growing rule sets, implementing AC automaton with a small memory without sacrificing its performance has remained challenging in NIDS design. In this paper, we propose a multi-dimensional progressive perfect hashing algorithm named P2-Hashing, which allows transitions of an AC automaton to be placed in a compact hash table without any collision. P2-Hashing is based on the observation that a hash key of each transition consists of two dimensions, namely a source state ID and an input character. When placing a transition in a hash table and causing a collision, we can change the value of a dimension of the hash key to rehash the transition to a new location of the hash table. For a given AC automaton, P2-Hashing first divides all the transitions into many small sets based on the two-dimensional values of the hash keys, and then places the sets of transitions progressively into the hash table until all are placed. Hash collisions that occurred during the insertion of a transition will only affect the transitions in the same set. The proposed P2-Hashing has many unique properties, including fast hash index generation and zero memory overhead, which are very suitable for the AC automaton operation. The feasibility and performance of P2-Hashing are investigated through simulations on the full Snort (6.4k rules) and Clam AV (54k rules) rule sets, each of which is first converted to a single AC automaton. Simulation results show that P2-Hashing can successfully construct the perfect hash table even when the load factor of the hash table is as high as 0.91.

[1]  H. Jonathan Chao,et al.  TriBiCa: Trie Bitmap Content Analyzer for High-Speed Network Intrusion Detection , 2007, IEEE INFOCOM 2007 - 26th IEEE International Conference on Computer Communications.

[2]  Stamatis Vassiliadis,et al.  Scalable Multigigabit Pattern Matching for Packet Inspection , 2008, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.

[3]  Randy H. Katz,et al.  High speed deep packet inspection with hardware support , 2006 .

[4]  T. V. Lakshman,et al.  Variable-Stride Multi-Pattern Matching For Scalable Deep Packet Inspection , 2009, IEEE INFOCOM 2009.

[5]  Wei Zhang,et al.  A Memory Efficient Multiple Pattern Matching Architecture for Network Security , 2008, IEEE INFOCOM 2008 - The 27th Conference on Computer Communications.

[6]  George Varghese,et al.  Deterministic memory-efficient string matching algorithms for intrusion detection , 2004, IEEE INFOCOM 2004.

[7]  L. Spitzner,et al.  Honeypots: Tracking Hackers , 2002 .

[8]  Dan Schnackenberg,et al.  Statistical approaches to DDoS attack detection and response , 2003, Proceedings DARPA Information Survivability Conference and Exposition.

[9]  Viktor K. Prasanna,et al.  High-throughput linked-pattern matching for intrusion detection systems , 2005, 2005 Symposium on Architectures for Networking and Communications Systems (ANCS).

[10]  T. V. Lakshman,et al.  Gigabit rate packet pattern-matching using TCAM , 2004, Proceedings of the 12th IEEE International Conference on Network Protocols, 2004. ICNP 2004..

[11]  George Varghese,et al.  Beyond bloom filters: from approximate membership checks to approximate state machines , 2006, SIGCOMM.

[12]  H. Jonathan Chao,et al.  Boundary Hash for Memory-Efficient Deep Packet Inspection , 2008, 2008 IEEE International Conference on Communications.

[13]  Rasmus Pagh,et al.  Cuckoo Hashing , 2001, Encyclopedia of Algorithms.

[14]  Viktor K. Prasanna,et al.  Memory-Efficient Pipelined Architecture for Large-Scale String Matching , 2009, 2009 17th IEEE Symposium on Field Programmable Custom Computing Machines.

[15]  Patrick Crowley,et al.  Peacock Hashing: Deterministic and Updatable Hashing for High Performance Networking , 2008, IEEE INFOCOM 2008 - The 27th Conference on Computer Communications.

[16]  Alfred V. Aho,et al.  Efficient string matching , 1975, Commun. ACM.

[17]  Udi Manber,et al.  A FAST ALGORITHM FOR MULTI-PATTERN SEARCHING , 1999 .

[18]  John W. Lockwood,et al.  Fast and Scalable Pattern Matching for Network Intrusion Detection Systems , 2006, IEEE Journal on Selected Areas in Communications.

[19]  Yan Luo,et al.  Efficient memory utilization on network processors for deep packet inspection , 2006, 2006 Symposium on Architecture For Networking And Communications Systems.

[20]  Jan van Lunteren,et al.  High-Performance Pattern-Matching for Intrusion Detection , 2006, INFOCOM.

[21]  Patrick Crowley,et al.  HEXA: Compact Data Structures for Faster Packet Processing , 2007, 2007 IEEE International Conference on Network Protocols.

[22]  Timothy Sherwood,et al.  A high throughput string matching architecture for intrusion detection and prevention , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).

[23]  Antonius P. J. Engbersen,et al.  Fast and scalable packet classification , 2003, IEEE J. Sel. Areas Commun..

[24]  Bin Liu,et al.  A Memory-Efficient Parallel String Matching Architecture for High-Speed Intrusion Detection , 2006, IEEE Journal on Selected Areas in Communications.

[25]  Patrick Crowley,et al.  Efficient regular expression evaluation: theory to practice , 2008, ANCS '08.