Optimistic regular expression matching on FPGAs for near-data processing

Regular expressions (regex) are the main means of searching for specific patterns in the vast amount of stored textual information. Consequently, various hardware accelerator designs have been proposed that enable memory-bound regex processing. In these designs, the regular expression to be evaluated is translated into a non-deterministic finite automaton (NFA) or a deterministic finite automaton (DFA), which is then mapped onto the hardware. The available hardware resources determine the maximum size (in terms of the number of states and transitions) of the supported automata. However, regular expressions may be arbitrarily complex. As a remedy, we propose optimistic regular expression evaluation, which prunes the DFA of a regex so that it fits the available hardware resources. The resulting DFA has not only matching states but also an uncertain state. Texts marked as uncertain have to be re-evaluated in software. This approach is particularly suited to near-data processing, where the optimistic regex evaluation is performed near the data source, thus reducing the overall amount of data to be transmitted to the requesting host. A prototype is implemented within Google's RE2, so the proposed design covers all regular expressions supported by RE2. Regular expression evaluation at up to 2.66 GByte/s was achieved on an FPGA-based Zynq SoC.
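The idea of a pruned DFA with an uncertain state can be illustrated with a minimal software sketch. It is not the pruning procedure or the hardware mapping of the prototype: the breadth-first pruning heuristic, the toy DFA for the pattern `ab(c|d)e`, and all names (`prune_dfa`, `run_pruned`, `evaluate`, `UNCERTAIN`) are assumptions made for illustration, and Python's `re` module stands in for the software re-evaluation.

```python
import re
from collections import deque

UNCERTAIN = "UNCERTAIN"  # hypothetical extra state introduced by pruning
PATTERN = r"ab(c|d)e"    # toy pattern (assumption), matched against the whole text

# Hand-built DFA for PATTERN: state -> {input character -> next state}.
# Missing transitions lead to an implicit dead (non-matching) state.
DELTA = {0: {"a": 1}, 1: {"b": 2}, 2: {"c": 3, "d": 3}, 3: {"e": 4}}
START, ACCEPTING = 0, {4}

def prune_dfa(start, accepting, delta, max_states):
    """Keep the first max_states states in BFS order (assumed heuristic).

    Transitions into dropped states are redirected to UNCERTAIN.
    """
    kept, queue = [start], deque([start])
    while queue:
        state = queue.popleft()
        for _, target in sorted(delta.get(state, {}).items()):
            if target not in kept and len(kept) < max_states:
                kept.append(target)
                queue.append(target)
    pruned = {
        s: {c: (t if t in kept else UNCERTAIN) for c, t in delta.get(s, {}).items()}
        for s in kept
    }
    return pruned, accepting & set(kept)

def run_pruned(start, accepting, delta, text):
    """Return 'match', 'no_match', or 'uncertain' for the whole text."""
    state = start
    for ch in text:
        state = delta[state].get(ch)
        if state is None:        # implicit dead state: definitely no match
            return "no_match"
        if state == UNCERTAIN:   # left the part of the DFA that was kept
            return "uncertain"
    return "match" if state in accepting else "no_match"

def evaluate(text, max_states=3):
    """Optimistic evaluation; uncertain texts are re-evaluated in software."""
    pruned, acc = prune_dfa(START, ACCEPTING, DELTA, max_states)
    verdict = run_pruned(START, acc, pruned, text)
    if verdict == "uncertain":
        return re.fullmatch(PATTERN, text) is not None
    return verdict == "match"
```

In this sketch, a "match" or "no_match" verdict is final because the text only visited states whose transitions are unchanged from the full DFA. Only texts that reach `UNCERTAIN` need the software pass, which is what would let a near-data filter discard definite non-matches before transmission.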
