CICERO: A Domain-Specific Architecture for Efficient Regular Expression Matching

Regular Expression (RE) matching is a computational kernel used in several applications. Since RE complexity and data volumes are steadily increasing, hardware acceleration is gaining attention for this problem as well. Existing approaches have limited flexibility, as they require a different implementation for each RE. On the other hand, it is complex to map efficient RE representations, such as non-deterministic finite-state automata, onto software-programmable engines or parallel architectures. In this work, we present CICERO, an end-to-end framework composed of a domain-specific architecture and a companion compilation framework for RE matching. Our solution is suitable for many applications, such as genomics/proteomics and natural language processing. CICERO aims at exploiting the intrinsic parallelism of non-deterministic representations of REs. Thanks to its programmable architecture and compilation framework, CICERO can trade off accelerators’ efficiency against processors’ flexibility. We implemented CICERO prototypes on embedded FPGA, achieving up to 28.6× and 20.8× more energy efficiency than embedded and mainstream processors, respectively. Since CICERO is a programmable architecture, it can also be implemented as a custom ASIC that is orders of magnitude more energy-efficient than mainstream processors.
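
To make the "intrinsic parallelism of non-deterministic representations" concrete, the following is a minimal software sketch of set-based NFA simulation. It is not CICERO's architecture, instruction set, or compiler output; the automaton, the state numbering, and the names `TRANSITIONS` and `nfa_match` are assumptions introduced only for illustration. The point is that all active states can be advanced independently for each input character, which is the kind of parallelism a hardware engine could exploit.

```python
from typing import Dict, Set, Tuple

# Hypothetical hand-built NFA for the RE (a|b)*ab, matched against the
# whole input. State 0 is the start state, state 2 is accepting.
# On 'a', state 0 non-deterministically stays in 0 or moves to 1.
Transitions = Dict[Tuple[int, str], Set[int]]

TRANSITIONS: Transitions = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
}
START = 0
ACCEPTING = {2}


def nfa_match(text: str,
              transitions: Transitions = TRANSITIONS,
              start: int = START,
              accepting: Set[int] = ACCEPTING) -> bool:
    """Return True if the NFA accepts the whole of `text`."""
    active = {start}  # every state the NFA could currently be in
    for ch in text:
        next_active: Set[int] = set()
        # Each active state is advanced independently of the others;
        # this loop body is the part that could run in parallel.
        for state in active:
            next_active |= transitions.get((state, ch), set())
        active = next_active
        if not active:  # no surviving path: reject early
            return False
    return bool(active & accepting)


if __name__ == "__main__":
    for sample in ["ab", "abab", "aba", "abc", ""]:
        print(repr(sample), nfa_match(sample))
```

The simulation keeps one set of active states instead of backtracking, so its work per input character is bounded by the number of NFA states rather than by the number of possible matching paths.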
