A Novel Parallel Dual-Character String Matching Algorithm on Graphical Processing Units

The Aho-Corasick algorithm has been widely used in network intrusion detection systems to inspect network packets against thousands of attack patterns. To improve the performance of these systems, many variations of the Aho-Corasick algorithm have been proposed to accelerate multiple string matching on GPUs or dedicated hardware. One such variation increases the number of characters processed per cycle. However, this approach encounters two major problems: the input alignment problem, and a large increase in the memory required to store the state transition table. Together, these problems make the multi-character approach less feasible. In this paper, we propose a novel parallel dual-character string matching algorithm on graphical processing units. The proposed algorithm introduces a new state machine to solve the input alignment problem and compresses the state transition table using perfect hashing to solve the memory explosion problem. Experimental results show that the proposed algorithm is superior to state-of-the-art approaches in terms of both performance and memory requirements.
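To make the two problems concrete, the following is a minimal CPU-side sketch in Python and is not the state machine or the perfect-hashing scheme proposed in this paper. It assumes a textbook Aho-Corasick automaton and derives each dual-character transition by composing two single-character steps. Alignment is handled by also reporting matches that end at the intermediate state, so a pattern may end at either character of a pair. Only the `(state, c1, c2)` triples that actually occur are stored, in an ordinary Python dict standing in for a perfect hash table. For a byte alphabet, a dense dual-character table would instead need 65,536 columns per state rather than 256. The names `build_ac`, `step`, and `match_dual` are illustrative only.

```python
from collections import deque


def build_ac(patterns):
    """Build a textbook Aho-Corasick automaton (goto, failure, output)."""
    goto, fail, out = [{}], [0], [set()]
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].add(p)
    # Breadth-first pass: failure links of depth-1 states stay at the root.
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] |= out[fail[t]]  # inherit matches from the suffix state
            queue.append(t)
    return goto, fail, out


def step(goto, fail, s, ch):
    """Consume one character, following failure links when needed."""
    while s and ch not in goto[s]:
        s = fail[s]
    return goto[s].get(ch, 0)


def match_dual(patterns, text):
    """Scan `text` two characters per step; return sorted (start, pattern)."""
    goto, fail, out = build_ac(patterns)
    table = {}  # sparse (state, c1, c2) -> (mid, next); stand-in for a hash table
    hits, s, i = [], 0, 0
    while i + 1 < len(text):
        key = (s, text[i], text[i + 1])
        if key not in table:
            mid = step(goto, fail, s, text[i])
            table[key] = (mid, step(goto, fail, mid, text[i + 1]))
        mid, s = table[key]
        # Matches ending on the first character of the pair (alignment case).
        hits += [(i + 1 - len(p), p) for p in out[mid]]
        # Matches ending on the second character of the pair.
        hits += [(i + 2 - len(p), p) for p in out[s]]
        i += 2
    if i < len(text):  # odd-length input: one trailing single-character step
        s = step(goto, fail, s, text[i])
        hits += [(i + 1 - len(p), p) for p in out[s]]
    return sorted(hits)
```

Calling `match_dual(["he", "she", "his", "hers"], "ushers")` reports `she` at offset 1 and both `he` and `hers` at offset 2, the same matches a single-character scan would find. A GPU implementation would additionally have to partition the input across threads and place the compressed table in device memory, which this sketch does not attempt.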
