The input-aware dynamic adaptation of area and performance for reconfigurable accelerator

Attaching a reconfigurable loop accelerator to a processor is a promising way to improve system performance and efficiency. A loop is usually unrolled to increase the parallelism of a loop accelerator, but the higher the unrolling degree, the more reconfigurable area is needed. We observe, however, that the utilization of the loop accelerator depends on the input. Focusing on the balance between area and performance, this paper proposes a dynamically adaptive reconfigurable accelerator framework on a CPU/RA architecture. First, the inputs are classified into predefined types. At run time, the input of the application is monitored and the accelerator is reconfigured accordingly to achieve dynamic area-performance adaptation. An accelerator selection model is also presented that chooses an accelerator at run time according to the predefined input types. A bzip2 case study is presented; the experimental results demonstrate the feasibility of the approach and show that, in the best case, up to 93.6% of the reconfigurable area is saved at a cost of a 1.6% performance loss.
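The run-time flow described above (classify the input, then pick an accelerator for that input type) can be illustrated with a minimal sketch. It is not the paper's selection model: the input types, the classification heuristic, the accelerator names, the area figures, and the normalized run times are all hypothetical values chosen for illustration. The policy shown, picking the smallest accelerator whose estimated run time is within a tolerance of the fastest one, is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accelerator:
    name: str
    unroll_factor: int
    area: int  # reconfigurable area in arbitrary units (hypothetical)


# Hypothetical accelerator variants: a higher unroll factor costs more area.
ACCELERATORS = [
    Accelerator("x1", 1, 100),
    Accelerator("x4", 4, 400),
    Accelerator("x16", 16, 1600),
]

# Hypothetical offline profile: normalized run time per input type.
# These numbers are invented for the sketch, not taken from the paper.
PROFILE = {
    "low_repetition": {"x1": 1.00, "x4": 0.99, "x16": 0.985},
    "high_repetition": {"x1": 1.00, "x4": 0.40, "x16": 0.15},
}


def classify_input(block: bytes) -> str:
    """Map an input block to a predefined type (assumed heuristic:
    fraction of bytes equal to their predecessor)."""
    if len(block) < 2:
        return "low_repetition"
    repeats = sum(1 for i in range(1, len(block)) if block[i] == block[i - 1])
    ratio = repeats / (len(block) - 1)
    return "high_repetition" if ratio >= 0.5 else "low_repetition"


def select_accelerator(input_type: str, tolerance: float = 0.02) -> Accelerator:
    """Return the smallest-area accelerator whose profiled run time is
    within `tolerance` (relative) of the fastest one for this input type."""
    runtimes = PROFILE[input_type]
    best = min(runtimes.values())
    candidates = [a for a in ACCELERATORS
                  if runtimes[a.name] <= best * (1 + tolerance)]
    return min(candidates, key=lambda a: a.area)


if __name__ == "__main__":
    for block in (b"aaaaaaaa", bytes(range(256))):
        kind = classify_input(block)
        print(kind, "->", select_accelerator(kind).name)
```

With these made-up profiles, an input type that barely benefits from unrolling is served by the smallest variant, while an input type that benefits strongly keeps the largest one. That is the area-performance trade-off the abstract describes.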
