Chain-Based DFA Deflation for Fast and Scalable Regular Expression Matching Using TCAM

Regular expression matching is the core engine of many network functions, such as intrusion detection and protocol analysis. Despite intensive research, we still lack a method for fast and scalable regular expression matching: one that needs only a single simple memory lookup per input character (like a DFA) and whose storage grows linearly with the size of the regular expression pattern set (like an NFA). Most recently, TCAM-based DFA implementation has been proposed as a promising approach because of TCAM's unique parallel and wildcard matching capabilities. However, the number of TCAM entries needed still exceeds the exponentially growing DFA size, so the approach is not scalable. In this paper, we propose a chain-based DFA deflation method for fast and scalable regular expression matching using TCAM, which takes one simple TCAM lookup to match each input character and effectively deflates the DFA size. Experiments on real-life pattern sets demonstrate that the number of TCAM entries used by our DFA deflation method is up to two orders of magnitude lower than the DFA size and comes quite close to the linearly growing NFA size. This not only means superior scalability but also allows regular expression matching at extremely high speed, up to two orders of magnitude faster than the existing TCAM-based DFA implementation method.
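
To make the idea of "one TCAM lookup per input character" concrete, the following is a minimal software sketch of a generic TCAM-style DFA transition table. It is not the chain-based deflation method of this paper; it only illustrates how wildcard entries with first-match priority can stand in for many explicit DFA transitions. The toy pattern ("ab" anywhere in the input), the 2-bit state encoding, and all names (TCAM, tcam_lookup, count_matches) are assumptions made for illustration.

```python
# Illustrative sketch only: a software model of ternary (0/1/*) matching.
# A real TCAM compares the key against all entries in parallel; here we
# scan in priority order and return the first match.

STATE_BITS = 2  # assumed width, enough for the 3 states of the toy DFA
CHAR_BITS = 8   # one input byte

ANY_STATE = "*" * STATE_BITS
ANY_CHAR = "*" * CHAR_BITS


def state_bits(state: int) -> str:
    return format(state, f"0{STATE_BITS}b")


def char_bits(char: str) -> str:
    return format(ord(char), f"0{CHAR_BITS}b")


def ternary_match(pattern: str, key: str) -> bool:
    """True if every non-wildcard bit of the pattern equals the key bit."""
    return all(p == "*" or p == k for p, k in zip(pattern, key))


# Toy DFA for "ab" anywhere in the input: state 0 = start,
# state 1 = just saw 'a', state 2 = just saw "ab" (accepting).
# Entries are (ternary pattern over state bits + char bits, next state),
# listed in priority order.
TCAM = [
    (ANY_STATE + char_bits("a"), 1),      # any state, 'a'  -> state 1
    (state_bits(1) + char_bits("b"), 2),  # state 1,   'b'  -> state 2
    (ANY_STATE + ANY_CHAR, 0),            # default         -> state 0
]


def tcam_lookup(state: int, byte: int) -> int:
    """Return the next state from the first (highest-priority) match."""
    key = state_bits(state) + format(byte, f"0{CHAR_BITS}b")
    for pattern, next_state in TCAM:
        if ternary_match(pattern, key):
            return next_state
    raise LookupError("no TCAM entry matched")


def count_matches(data: bytes, accepting=frozenset({2})) -> int:
    """Run the DFA with one lookup per input byte; count accepting visits."""
    state, hits = 0, 0
    for byte in data:
        state = tcam_lookup(state, byte)
        if state in accepting:
            hits += 1
    return hits
```

In this toy case, three ternary entries replace the 3 × 256 explicit transitions of the uncompressed DFA table, while matching still costs exactly one lookup per input byte.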
