Power-aware modulo scheduling for high-performance VLIW processors

In high-performance processors, step power and peak power, both closely tied to chip reliability, are important design constraints, often more so than average power. In VLIW processors, where a single instruction may contain a variable number of operations, step power and peak power vary significantly with the parallel schedule generated by the parallelizing compiler. In this paper, we propose a power-aware modulo scheduling algorithm for high-performance VLIW processors. The proposed algorithm reduces both step power and peak power by producing a more balanced parallel schedule without compromising performance. Experimental results show that the proposed scheduling technique significantly improves the power characteristics of high-performance processors over an existing power-unaware modulo scheduling technique.
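To make the two metrics concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes that peak power is the largest per-cycle power in the loop kernel and that step power is the largest change in power between consecutive cycles, with the kernel wrapping around because it repeats. The per-operation costs in `OP_POWER` and the two example schedules are hypothetical values chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical per-operation power costs in arbitrary units (not from the paper).
OP_POWER: Dict[str, float] = {"alu": 1.0, "mul": 3.0, "mem": 2.0}

# A schedule is a list of cycles; each cycle lists the operations issued in it.
Schedule = List[List[str]]


def cycle_power(schedule: Schedule) -> List[float]:
    """Power drawn in each cycle, modeled as the sum of its operation costs."""
    return [sum(OP_POWER[op] for op in slot) for slot in schedule]


def peak_power(schedule: Schedule) -> float:
    """Assumed definition: the maximum per-cycle power in the kernel."""
    return max(cycle_power(schedule))


def step_power(schedule: Schedule) -> float:
    """Assumed definition: the maximum cycle-to-cycle power change.

    The kernel of a modulo schedule repeats, so the last cycle is
    followed by the first one again (index taken modulo the length).
    """
    p = cycle_power(schedule)
    n = len(p)
    return max(abs(p[(i + 1) % n] - p[i]) for i in range(n))


# Same operations and same kernel length (4 cycles), different placement.
unbalanced: Schedule = [["mul", "mul", "mem"], [], ["alu"], []]
balanced: Schedule = [["mul"], ["mul"], ["mem"], ["alu"]]

if __name__ == "__main__":
    for name, s in (("unbalanced", unbalanced), ("balanced", balanced)):
        print(name, "peak:", peak_power(s), "step:", step_power(s))
```

Under this toy model, both schedules issue the same four operations in the same number of cycles, so performance is unchanged, while the balanced placement lowers both metrics. A real scheduler would also have to respect data dependences and resource constraints, which this sketch ignores.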
