Finite-State-Machine Overlay Architectures for Fast FPGA Compilation and Application Portability

Despite significant advantages, wider usage of field-programmable gate arrays (FPGAs) has been limited by lengthy compilation times and a lack of portability. Virtual-architecture overlays have partially addressed these problems, but previous work focuses mainly on heavily pipelined applications with minimal control requirements. We extend previous work by enabling more flexible control via overlay architectures for finite-state machines. Although not appropriate for control-intensive circuits, the presented architectures reduced the compilation time for control changes in a convolution case study from 7 hours to less than 1 second, with no performance overhead and an area overhead of 0.2%.
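
To illustrate why an overlay can turn a control change into a near-instant operation, the following is a minimal software sketch of a table-driven finite-state machine. It is not the architecture presented in this work: the class `TableFsmOverlay`, its methods, and the two example machines are hypothetical names chosen for illustration. The sketch assumes a fixed structure (a state register plus a lookup table indexed by state and input) whose contents can be rewritten, so a new controller is loaded as data instead of being recompiled.

```python
from typing import Dict, List, Tuple

# (state, input) -> (next_state, output); hypothetical configuration format
Transitions = Dict[Tuple[int, int], Tuple[int, int]]


class TableFsmOverlay:
    """Hypothetical table-driven Mealy FSM: fixed structure, reloadable contents."""

    def __init__(self, num_states: int, num_inputs: int) -> None:
        self.num_states = num_states
        self.num_inputs = num_inputs
        self.state = 0
        self.table: List[List[Tuple[int, int]]] = []
        self.configure({})

    def configure(self, transitions: Transitions) -> None:
        """Load a new machine by rewriting the table (stands in for reconfiguration)."""
        # Unspecified entries default to "go to state 0, output 0".
        table = [[(0, 0)] * self.num_inputs for _ in range(self.num_states)]
        for (state, inp), (next_state, out) in transitions.items():
            if not (0 <= state < self.num_states and 0 <= next_state < self.num_states):
                raise ValueError("state out of range for this overlay")
            if not 0 <= inp < self.num_inputs:
                raise ValueError("input out of range for this overlay")
            table[state][inp] = (next_state, out)
        self.table = table
        self.state = 0  # reset the state register after loading

    def step(self, inp: int) -> int:
        """Consume one input symbol, update the state register, return the output."""
        self.state, out = self.table[self.state][inp]
        return out

    def run(self, inputs: List[int]) -> List[int]:
        return [self.step(i) for i in inputs]


# Example machine 1: output 1 when the current and previous inputs are both 1.
DETECT_11: Transitions = {
    (0, 0): (0, 0), (0, 1): (1, 0),
    (1, 0): (0, 0), (1, 1): (1, 1),
}

# Example machine 2: output 1 when the input differs from the previous input
# (the previous input is taken to be 0 at reset).
DETECT_CHANGE: Transitions = {
    (0, 0): (0, 0), (0, 1): (1, 1),
    (1, 0): (0, 1), (1, 1): (1, 0),
}
```

Calling `configure(DETECT_11)` and later `configure(DETECT_CHANGE)` on the same `TableFsmOverlay(2, 2)` instance swaps the controller without touching the surrounding structure. In this sketch the trade-off is that the table must be sized in advance for the largest machine it will hold.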
