Hardware-accelerated regular expression matching at multiple tens of Gb/s

Hardware acceleration of regular expression matching is key to meeting the throughput requirements of state-of-the-art network intrusion detection systems (NIDSs) dictated by fast growing link speeds. This paper presents extensions to a programmable state machine, called B-FSM, which was initially optimized for string matching. These extensions enable direct support in hardware for essential regular expression features, such as character classes and case insensitivity. Moreover, they also allow the exploitation of regular expression properties that show up at the data structure level as common transitions shared between multiple states, resulting in storage reductions of up to 95% for five NIDS pattern sets analyzed. Additional instruction support based on a flexible integration within the B-FSM data structure increases the processing capabilities and enables the scaling to larger pattern collections. The new IBM Power Edge of NetworkTM processor employs the B-FSM technology to provide scanning capabilities at typical rates of 20-40 Gb/s.

[1]  Patrick Crowley,et al.  Algorithms to accelerate multiple regular expressions matching for deep packet inspection , 2006, SIGCOMM 2006.

[2]  Danny Hendler,et al.  Space-Efficient TCAM-Based Classification Using Gray Coding , 2007, IEEE Transactions on Computers.

[3]  Christoph Hagleitner,et al.  Memory-efficient distribution of regular expressions for fast deep packet inspection , 2009, CODES+ISSS '09.

[4]  Somesh Jha,et al.  Deflating the big bang: fast and scalable deep packet inspection with extended finite automata , 2008, SIGCOMM '08.

[5]  J. MacQueen Some methods for classification and analysis of multivariate observations , 1967 .

[6]  Patrick Crowley,et al.  Efficient regular expression evaluation: theory to practice , 2008, ANCS '08.

[7]  Christoph Hagleitner,et al.  Regular Expression Acceleration at Multiple Tens of Gb / s , 2009 .

[8]  Stefano Giordano,et al.  Differential Encoding of DFAs for Fast Regular Expression Matching , 2011, IEEE/ACM Transactions on Networking.

[9]  Ron K. Cytron,et al.  A Scalable Architecture For High-Throughput Regular-Expression Pattern Matching , 2006, ISCA 2006.

[10]  Jan van Lunteren Searching very large routing tables in wide embedded memory , 2001, GLOBECOM.

[11]  George Varghese,et al.  Curing regular expressions matching algorithms from insomnia, amnesia, and acalculia , 2007, ANCS '07.

[12]  Jonathan S. Turner,et al.  Advanced algorithms for fast and scalable deep packet inspection , 2006, 2006 Symposium on Architecture For Networking And Communications Systems.

[13]  Somesh Jha,et al.  XFA: Faster Signature Matching with Extended Automata , 2008, 2008 IEEE Symposium on Security and Privacy (sp 2008).

[14]  Jan van Lunteren,et al.  High-Performance Pattern-Matching for Intrusion Detection , 2006, INFOCOM.

[15]  Srihari Cadambi,et al.  Memory-Efficient Regular Expression Search Using State Merging , 2007, IEEE INFOCOM 2007 - 26th IEEE International Conference on Computer Communications.

[16]  Jeffrey D. Ullman,et al.  Introduction to Automata Theory, Languages and Computation , 1979 .

[17]  Chen-Yong Cher,et al.  A wire-speed powerTM processor: 2.3GHz 45nm SOI with 16 cores and 64 threads , 2010, 2010 IEEE International Solid-State Circuits Conference - (ISSCC).

[18]  H. Jonathan Chao,et al.  Range hash for regular expression pre-filtering , 2010, 2010 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS).

[19]  Bin Liu,et al.  Deflation DFA: Remembering History is Adequate , 2010, 2010 IEEE International Conference on Communications.

[20]  Patrick Crowley,et al.  An improved algorithm to accelerate regular expression evaluation , 2007, ANCS '07.

[21]  H. Franke,et al.  Introduction to the wire-speed processor and architecture , 2010, IBM J. Res. Dev..

[22]  T. V. Lakshman,et al.  Fast and memory-efficient regular expression matching for deep packet inspection , 2006, 2006 Symposium on Architecture For Networking And Communications Systems.