A modular NFA architecture for regular expression matching

We propose a nondeterministic finite automaton (NFA) based architecture for regexp scanners on FPGAs, called CES: the Character Class with Constraint Repetition (CCR) based regExp Scanner. CES is designed to realize a new MIN-MAX counting algorithm, which solves both the character class ambiguity problem and the overlapped matching problem. CES also supports non-regular Perl constructs such as zero-width patterns and back-references. We propose a CCR syntax tree and a parsing scheme that map a Perl or POSIX regexp rule to a CES topology. The interconnection patterns and operational parameters of CCR modules (CCRMs), the building blocks of CES, can be reconfigured by ordinary memory writes when regexp rules change, without re-synthesis of low-level logic. In the implementation, the character classes of CCRs are stored in Block RAMs. The MIN-MAX algorithm uses two counters, MIN and MAX, to resolve the character class ambiguity problem, and two checkpoint counters to detect overlapped matches. CES topologies optimized for different types of rules can run in different Partial Reconfigurable Regions (PRRs) and can be swapped on the fly by a PRR controller. We developed a tool chain that automates the implementation of CES on a Virtex 5 LX110T device. This device can host up to 3000 CCRMs and runs at an estimated throughput of 1.996 Gbps in simulation, and at 863 Mbps between a PC and the Virtex 5 board in real tests. The Snort and SpamAssassin rule sets can be parsed and mapped in milliseconds. Once a base CES architecture is synthesized, the physical reconfiguration of a CES on the Virtex 5 LX110T chip takes less than a second.
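To make the two problems named above concrete, the following is a minimal software sketch of what a chain of CCRs is expected to match. It is not the MIN-MAX algorithm of the paper, whose details are not given here: instead of two counters it keeps an explicit set of (module, count) pairs, which is the straightforward but more expensive reference behaviour. The names `CCR`, `ccr`, `scan`, `lo` and `hi` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CCR:
    """One character class with a constrained repetition, e.g. [a-z]{2,3}."""
    chars: frozenset  # allowed byte values (hypothetical software stand-in for a lookup table)
    lo: int           # minimum number of repetitions
    hi: int           # maximum number of repetitions


def ccr(chars: str, lo: int, hi: int) -> CCR:
    """Convenience constructor: build a CCR from a string of allowed characters."""
    return CCR(frozenset(chars.encode("latin-1")), lo, hi)


def _closure(chain, states):
    """Add states reachable without consuming input.

    A state (i, c) means module i has consumed c characters. Once c >= lo,
    control may pass to module i + 1 with a fresh count of 0.
    """
    seen = set(states)
    stack = list(states)
    while stack:
        i, c = stack.pop()
        if i < len(chain) and c >= chain[i].lo:
            nxt = (i + 1, 0)
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def scan(chain, data: bytes):
    """Return the end offsets (exclusive) of all non-empty, unanchored matches."""
    n = len(chain)
    ends = []
    active = set()
    for pos, byte in enumerate(data):
        active.add((0, 0))  # a new match attempt may start at every offset
        active = _closure(chain, active)
        stepped = set()
        for i, c in active:
            # A module consumes the byte only if the class accepts it
            # and the repetition bound has not been reached.
            if i < n and c < chain[i].hi and byte in chain[i].chars:
                stepped.add((i, c + 1))
        active = _closure(chain, stepped)
        if (n, 0) in active:  # (n, 0) marks "all modules satisfied"
            ends.append(pos + 1)
    return ends


# Overlapped matching: a{2,3}b on "aaaab" has several attempts in flight at once.
print(scan([ccr("a", 2, 3), ccr("b", 1, 1)], b"aaaab"))  # [5]
# Character class ambiguity: in [ab]{1,2}b the byte 'b' belongs to both classes.
print(scan([ccr("ab", 1, 2), ccr("b", 1, 1)], b"abb"))   # [2, 3]
```

The set of active pairs can grow with the repetition bound, which is the cost a counter-based hardware scheme aims to avoid; the sketch is useful mainly as a reference against which a compact counting scheme could be checked.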
