Augmenting Modern Superscalar Architectures with Configurable Extended Instructions

The instruction sets of general-purpose microprocessors are designed to offer good performance across a wide range of programs. The size and complexity of the instruction sets, however, are limited by a need for generality and for streamlined implementation. The particular needs of one application are balanced against the needs of the full range of applications considered. For this reason, one can "design" a better instruction set when considering only a single application than when considering a general collection of applications. Configurable hardware gives us the opportunity to explore this option. This paper examines the potential for automatically identifying application-specific extended instructions and implementing them in programmable functional units based on configurable hardware. Adding fine-grained reconfigurable hardware to the datapath of an out-of-order issue superscalar processor allows 4-44% speedups on the MediaBench benchmarks [1]. As a key contribution of our work, we present a selective algorithm for choosing extended instructions to minimize reconfiguration costs within loops. Our selective algorithm constrains instruction choices so that significant speedups are achieved with as few as 4 moderately sized programmable functional units, typically containing less than 150 look-up tables each.

[1]  Miodrag Potkonjak,et al.  MediaBench: a tool for evaluating and synthesizing multimedia and communications systems , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.

[2]  Dzung T. Hoang,et al.  The Splash 2 processor and applications , 1993, Proceedings of 1993 IEEE International Conference on Computer Design ICCD'93.

[3]  Jean Vuillemin,et al.  Introduction to programmable active memories , 1990 .

[4]  Maya Gokhale,et al.  The NAPA adaptive processing architecture , 1998, Proceedings. IEEE Symposium on FPGAs for Custom Computing Machines (Cat. No.98TB100251).

[5]  Doug Burger,et al.  Evaluating Future Microprocessors: the SimpleScalar Tool Set , 1996 .

[6]  Ralph Wittig,et al.  OneChip: an FPGA processor with reconfigurable logic , 1996, 1996 Proceedings IEEE Symposium on FPGAs for Custom Computing Machines.

[7]  S SohiGurindar Instruction Issue Logic for High-Performance, Interruptible, Multiple Functional Unit, Pipelined Computers , 1990 .

[8]  Sergei Sawitzki,et al.  Increasing Microprocessor Performance with Tightly-Coupled Reconfigurable Logic Arrays , 1998, FPL.

[9]  Michael D. Smith,et al.  PRISC software acceleration techniques , 1994, Proceedings 1994 IEEE International Conference on Computer Design: VLSI in Computers and Processors.

[10]  Michael D. Smith,et al.  A high-performance microarchitecture with hardware-programmable functional units , 1994, Proceedings of MICRO-27. The 27th Annual IEEE/ACM International Symposium on Microarchitecture.