System-level power consumption modeling and tradeoff analysis techniques for superscalar processor design

This paper presents systematic techniques for finding low-power, high-performance superscalar processors tailored to specific user applications. The power model is novel in that it separates power into architectural and technology components. The architectural component is found via trace-driven simulation, which also produces performance estimates. An example technology model is presented that estimates the technology component, along with critical delay time and real estate usage. This technology model is based on case studies of actual designs. The model is used to solve an important problem: decreasing power consumption in a superscalar processor without greatly impacting performance. Results are presented from runs that use simulated annealing to reduce power consumption subject to bounds on performance reduction. The major contributions of this paper are the separation of dynamic power into architectural and technology components, the use of trace-driven simulation for architectural power measurement, and the use of a near-optimal search to tailor a processor design to a benchmark.
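As a rough illustration of the approach described above, the following Python sketch combines a two-part power estimate (architectural activity multiplied by technology-dependent energy per use) with a simulated annealing search that rejects designs exceeding a performance-loss bound. It is a minimal sketch under stated assumptions, not the paper's model: the instruction mix, the energy figures, the ILP limit, the unit-count range, and the names `simulate`, `energy_per_use`, `power`, and `anneal` are all hypothetical, and the closed-form `simulate` function only stands in for a real trace-driven simulator.

```python
import math
import random

# Hypothetical benchmark characteristics (assumed values, not from the paper).
MIX = {"alu": 0.6, "fpu": 0.1, "ldst": 0.3}          # fraction of instructions per unit type
BASE_ENERGY = {"alu": 1.0, "fpu": 3.0, "ldst": 2.0}  # energy per use, arbitrary units
ILP_LIMIT = 4.0                                      # assumed parallelism available in the trace


def simulate(config):
    """Toy stand-in for trace-driven simulation.

    Returns (ipc, activity), where activity is uses per cycle of each unit type.
    """
    ipc = min(ILP_LIMIT, min(config[u] / MIX[u] for u in MIX))
    activity = {u: ipc * MIX[u] for u in MIX}
    return ipc, activity


def energy_per_use(config):
    """Toy technology component: energy per use grows with the number of units."""
    return {u: BASE_ENERGY[u] * (1 + 0.1 * config[u]) for u in config}


def power(config):
    """Power = sum over units of (architectural activity) * (technology energy)."""
    ipc, activity = simulate(config)
    tech = energy_per_use(config)
    return sum(activity[u] * tech[u] for u in config), ipc


def anneal(start, max_perf_loss=0.10, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Simulated annealing over unit counts, subject to a performance bound."""
    rng = random.Random(seed)
    _, base_ipc = power(start)
    min_ipc = (1 - max_perf_loss) * base_ipc
    current = dict(start)
    cur_pow, _ = power(current)
    best, best_pow = dict(current), cur_pow
    temp = t0
    for _ in range(steps):
        # Neighbor move: add or remove one unit of a randomly chosen type (1..8).
        cand = dict(current)
        unit = rng.choice(sorted(cand))
        cand[unit] = max(1, min(8, cand[unit] + rng.choice((-1, 1))))
        cand_pow, cand_ipc = power(cand)
        if cand_ipc >= min_ipc:  # discard designs that lose too much performance
            delta = cand_pow - cur_pow
            # Metropolis rule: always accept improvements, sometimes accept worse.
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                current, cur_pow = cand, cand_pow
                if cur_pow < best_pow:
                    best, best_pow = dict(current), cur_pow
        temp *= cooling
    return best, best_pow
```

Calling `anneal({"alu": 4, "fpu": 2, "ldst": 4})` returns a configuration whose estimated power is no higher than the starting design and whose estimated IPC stays within the 10% bound. Because the search only ever accepts feasible candidates, the returned design always satisfies the performance constraint, although simulated annealing gives no guarantee of reaching the global optimum.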
