A Dynamically Reconfigurable Automata Processor Overlay

This paper describes a design for a parameterizable automata processor overlay and a placement algorithm required by its support software. The resulting framework serves both as an open-source alternative to Micron's Automata Processor (AP) and as an experimental testbed for exploring architectural tradeoffs. An automata processor is a processor-in-memory architecture designed to recognize patterns in streaming data. Our framework takes a nondeterministic finite automaton (NFA) described in Micron's ANML language and uses instantiated JTAG sources to configure the on-chip RAM and programmable interconnect of the overlay programmed onto an FPGA. Like the Micron AP, our design is composed of an array of interconnected state transition elements (STEs). While our STE design is equivalent to that of the Micron AP, our overlay uses a simpler, non-switched interconnect based on pairwise gated connections. This interconnect design turns the mapping of logical states to physical STEs into a constraint satisfaction problem. In this paper, we explore the impact of interconnect architecture tradeoffs on a Stratix V GX target device, and we describe and evaluate an STE placement algorithm using the ANMLZoo benchmark suite. To the best of the authors' knowledge, this is the first example of an FPGA-based automata processor overlay.
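To make the placement problem concrete, the following is a minimal sketch of mapping logical NFA states to physical STEs as a constraint satisfaction problem. It is not the algorithm evaluated in the paper. The interconnect model (`can_connect`, with a hypothetical `reach` parameter that limits a gated connection to physical STEs at most `reach` slots apart), the function name `place`, and the plain backtracking search are all assumptions chosen for illustration.

```python
from typing import Dict, List, Optional, Tuple

Edge = Tuple[str, str]  # (source state, destination state) in the logical NFA


def can_connect(src: int, dst: int, reach: int) -> bool:
    """Hypothetical interconnect model: two physical STEs share a gated
    connection only if their slot indices differ by at most `reach`."""
    return abs(src - dst) <= reach


def place(edges: List[Edge], num_slots: int, reach: int) -> Optional[Dict[str, int]]:
    """Assign each logical state to a distinct physical STE slot so that
    every NFA transition lands on an available connection.
    Returns a state -> slot mapping, or None if no placement exists."""
    states = sorted({s for edge in edges for s in edge})
    touching = {s: [e for e in edges if s in e] for s in states}
    # Place the most heavily connected states first to fail early.
    order = sorted(states, key=lambda s: len(touching[s]), reverse=True)
    placement: Dict[str, int] = {}

    def fits(state: str, slot: int) -> bool:
        # Check every edge touching `state` whose other end is already placed.
        trial = {**placement, state: slot}
        for src, dst in touching[state]:
            if src in trial and dst in trial:
                if not can_connect(trial[src], trial[dst], reach):
                    return False
        return True

    def search(i: int) -> bool:
        if i == len(order):
            return True
        state = order[i]
        used = set(placement.values())
        for slot in range(num_slots):
            if slot in used or not fits(state, slot):
                continue
            placement[state] = slot
            if search(i + 1):
                return True
            del placement[state]  # backtrack and try the next slot
        return False

    return dict(placement) if search(0) else None


if __name__ == "__main__":
    chain = [("a", "b"), ("b", "c"), ("c", "d")]
    print(place(chain, num_slots=4, reach=1))
```

Under this toy model, a chain of states fits even with `reach=1`, while a state with four neighbours cannot be placed until `reach` grows to 2. This illustrates how a simpler interconnect trades hardware cost against which automata can be mapped.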
