DFA-based and SIMD NFA-based regular expression matching on Cell BE for fast network traffic filtering

Regular expression matching is at the heart of many data processing routines, such as string search and network traffic filtering. The traditional approach is to build and execute a deterministic finite automaton (DFA), which provides O(1) processing time per input symbol for any regular expression. However, this technique almost always forces modern SIMD processors to perform the regexp search in scalar mode, so most of their computational power goes unused. This paper presents a traditional, straightforward DFA implementation along with another regexp implementation based on SIMD simulation of a nondeterministic finite automaton (NFA) on the Cell Broadband Engine (Cell BE) processor. The software implementation of the NFA-based SIMD algorithm achieves as much as 10 Gbit/s per Cell BE processor with 512 NFA states, which makes it feasible for preliminary network traffic filtering of suspicious objects, while the DFA-based scalar implementation reaches up to 60 Gbit/s with a 60-state automaton. One Cell BE processor can maintain NFAs totaling 6000 to 7000 states simultaneously, so signatures that need more than 512 states can still be used, at the cost of a linear performance-to-signatures tradeoff.
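The contrast between the two approaches can be illustrated with a minimal sketch. It is not the paper's Cell BE implementation: it assumes a plain table-driven DFA for the scalar case and a bit-parallel simulation of a linear NFA (one character class per state) for the data-parallel case, with a Python integer standing in for a SIMD register. All function names and the sample patterns are hypothetical.

```python
def build_literal_ab_dfa():
    """Hypothetical 3-state DFA that finds the literal "ab" anywhere in the input."""
    table = [[0] * 256 for _ in range(3)]
    table[0][ord("a")] = 1          # saw 'a'
    table[1][ord("a")] = 1          # another 'a' keeps us in state 1
    table[1][ord("b")] = 2          # saw "ab": accepting state
    table[2] = [2] * 256            # accepting state is absorbing
    return table


def run_dfa(table, start, accepting, data):
    """Scalar DFA: exactly one table lookup per input byte (O(1) per symbol)."""
    state = start
    for byte in data:
        state = table[state][byte]
        if state in accepting:
            return True
    return False


def build_masks(pattern):
    """For each byte value, a bitmask of the NFA states that byte may enter.

    `pattern` is a list of byte sets; element i is the character class
    consumed when entering NFA state i.
    """
    masks = [0] * 256
    for i, char_class in enumerate(pattern):
        for byte in char_class:
            masks[byte] |= 1 << i
    return masks


def run_nfa_bitparallel(pattern, data):
    """Bit-parallel NFA: all active states are advanced together per input byte.

    Bit i of `active` is set when state i is active. The integer plays the
    role that a wide SIMD register would play in a vectorized implementation.
    """
    masks = build_masks(pattern)
    accept_bit = 1 << (len(pattern) - 1)
    active = 0
    for byte in data:
        # Shift every active state forward, restart at state 0,
        # then keep only the states that this byte is allowed to enter.
        active = ((active << 1) | 1) & masks[byte]
        if active & accept_bit:
            return True
    return False


if __name__ == "__main__":
    dfa = build_literal_ab_dfa()
    print(run_dfa(dfa, 0, {2}, b"xaab"))
    # Linear NFA for the regexp a[bc]d
    pattern = [set(b"a"), set(b"bc"), set(b"d")]
    print(run_nfa_bitparallel(pattern, b"xxacdxx"))
```

In this sketch the DFA loop carries a dependency from one byte to the next through `state`, which is why it stays scalar, whereas the NFA loop updates every state with a few bitwise operations whose width grows with the number of states.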
