On reducing register pressure and energy in multiple-banked register files

The storage for speculative values is one of the main sources of complexity and power dissipation in superscalar processors. We present a novel technique that reduces register requirements, together with the dynamic and static power the registers dissipate, by delaying the dispatch of instructions while minimizing the impact of that delay on performance. The proposed technique outperforms previous schemes in both performance and power savings. With only a 1.77% IPC loss, the mechanism achieves more than 13% dynamic and 15% static extra power savings in the integer rename buffers, and more than 9% dynamic and 10% static extra power savings in the FP rename buffers. Significant power savings are also achieved if the processor uses a physical register file for both committed and noncommitted values instead of rename buffers. Additionally, the register requirements are reduced by more than 18% and 13% for integer and FP programs, respectively.
