A hybrid multiple-character transition finite-automaton for string matching engine

A hybrid finite automaton with deterministic and nondeterministic parts is proposed. The hybrid FA can inspect multiple characters in parallel. The space required by the finite automaton remains efficient as it scales up. The number of transitions increases almost linearly with the number of characters inspected in parallel. A configurable multi-stage architecture can implement the hybrid finite automaton.

The throughput of a string-matching engine can be multiplied by inspecting multiple characters in parallel. However, the space required to implement a matching engine that processes multiple characters in every cycle grows dramatically with the number of characters processed in parallel. This paper presents a hybrid finite automaton (FA) that has deterministic and nondeterministic finite automaton (DFA and NFA) parts and is based on the Aho-Corasick algorithm. It inspects multiple characters in parallel while maintaining favorable space utilization. In the presented approach, the number of multi-character transitions increases almost linearly with the number of characters inspected in parallel. This paper also proposes a multi-stage architecture for implementing the hybrid FA. Since this multi-stage architecture has deterministic stages, configurable features can be introduced into it so that various keyword sets can be processed by simply updating the configuration. Experimental results for an FPGA implementation of the multi-stage architecture with 8-character transitions show a throughput of 4.3 Gbps with a 67 MHz clock, and results for an ASIC implementation of the configurable architecture with two-stage pipelines show a throughput of 7.9 Gbps with a 123 MHz clock.
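To make the idea of multi-character inspection concrete, the following is a minimal Python sketch, not the hybrid FA construction of the paper. It builds a classic Aho-Corasick automaton and then consumes `k` characters per "cycle" by composing `k` single-character transitions, which is the behavior a multi-character engine must reproduce in hardware. The function names (`build_aho_corasick`, `step`, `scan_multichar`, `throughput_gbps`) and the sample keyword set are illustrative assumptions. The throughput helper assumes 8 bits per character; applying it to the ASIC figure further assumes 8 characters per cycle, which the abstract states only for the FPGA result but which is consistent with the reported numbers.

```python
from collections import deque


def build_aho_corasick(keywords):
    """Return (goto, fail, out) tables for a classic Aho-Corasick automaton."""
    goto = [{}]    # goto[state][char] -> next state (trie edges)
    out = [set()]  # out[state] -> keywords recognized on reaching this state
    for word in keywords:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(word)

    fail = [0] * len(goto)
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] |= out[fail[nxt]]  # inherit matches ending at the suffix
    return goto, fail, out


def step(goto, fail, state, ch):
    """One single-character transition, following failure links as needed."""
    while state and ch not in goto[state]:
        state = fail[state]
    return goto[state].get(ch, 0)


def scan_multichar(keywords, text, k):
    """Scan text k characters per cycle; return (matches, cycle count).

    Each match is (index of the keyword's last character, keyword).
    Illustrative only: hardware would evaluate the k steps in parallel.
    """
    goto, fail, out = build_aho_corasick(keywords)
    state, matches, cycles = 0, [], 0
    for base in range(0, len(text), k):
        cycles += 1
        for offset, ch in enumerate(text[base:base + k]):
            state = step(goto, fail, state, ch)
            for word in sorted(out[state]):
                matches.append((base + offset, word))
    return matches, cycles


def throughput_gbps(chars_per_cycle, clock_mhz, bits_per_char=8):
    """Peak throughput in Gbps, assuming one multi-character step per clock."""
    return chars_per_cycle * bits_per_char * clock_mhz / 1000.0


if __name__ == "__main__":
    words = ["he", "she", "his", "hers"]  # hypothetical keyword set
    print(scan_multichar(words, "ushers", 1))
    print(scan_multichar(words, "ushers", 4))
    print(throughput_gbps(8, 67), throughput_gbps(8, 123))
```

Scanning with `k = 4` reports the same matches as `k = 1` in fewer cycles, which is the source of the throughput gain. Under the stated assumptions the helper gives 4.288 Gbps at 67 MHz and 7.872 Gbps at 123 MHz, in line with the 4.3 Gbps and 7.9 Gbps quoted in the abstract. The sketch says nothing about the space cost of precomputed multi-character transitions, which is the problem the hybrid FA addresses.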
