The microarchitecture of FPGA-based soft processors

As more embedded systems are built on FPGA platforms, there is an increasing need to support processors in FPGAs. One option is the soft processor, a programmable instruction processor implemented in the reconfigurable logic of the FPGA. Commercial soft processors have been widely deployed, and hence we are motivated to understand their microarchitecture. We must re-evaluate microarchitecture in the soft processor context because an FPGA platform differs significantly from an ASIC platform: for example, the relative speed of memory and logic is quite different in the two platforms, as is the area cost. In this paper we present an infrastructure for rapidly generating RTL models of soft processors, as well as a methodology for measuring their area, performance, and power. Using our automatically generated soft processors, we explore the microarchitecture trade-off space, including: (i) hardware vs. software multiplication support; (ii) shifter implementations; and (iii) pipeline depth, organization, and forwarding. For example, we find that a 3-stage pipeline has better wall-clock-time performance than deeper pipelines, despite its lower clock frequency. We also compare our designs to Altera's NiosII commercial soft processor variations and find that our automatically generated designs span the design space while remaining very competitive.
