Exploiting Schedule Slacks for Rate-Optimal Power-Minimum Software Pipelining

Rising power consumption in high-performance processors and the proliferation of embedded systems demand new compiler techniques geared toward both high performance and low power. Software pipelining, an effective compiler optimization that exploits instruction-level parallelism across loop iterations, has been studied extensively. However, previous software pipelining methods focus on performance only. This paper presents a software pipelining method that reduces power consumption while preserving performance optimality. This is possible because schedule slack exists for non-critical instructions even in performance-optimal schedules, and by exploiting that slack appropriately, it may be possible to reduce the number of functional units used in the schedule. The paper formulates the problem of minimizing power consumption in software-pipelined loops as an integer linear programming (ILP) problem and evaluates the approach in a re-engineered MIPSpro compiler. Executing the programs generated by the re-engineered compiler on the Wattch power simulator, we observe that our approach reduces the dynamic energy consumption of a set of kernels from SPEC benchmarks by up to 15.4% (8.5% on average) compared with the existing scheduler of the MIPSpro compiler, which strives only for high performance. The proposed approach also helps reduce leakage power. With a leakage power reduction mechanism that applies power supply gating whenever possible, schedules generated by our approach consume up to 54.5% (31.8% on average) less leakage power than the schedules generated by the MIPSpro compiler. This work is supported in part by DARPA, contract number 1120-24596, SGI, Delaware Research Partnership (DRP) program, grant number 002672, and Intel Corp., grant number 2002-0814.
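The slack idea in the abstract can be illustrated with a small sketch. It is not the paper's ILP formulation: it is a brute-force stand-in for a toy loop, assuming a single functional-unit type, single-cycle operations, and a fixed initiation interval `ii`. All names (`units_needed`, `min_units_schedule`, the operations `a` to `d`, and their start-time windows) are hypothetical and chosen only for illustration. Each operation has a window of allowed start times; a non-critical operation has a window wider than one cycle, and shifting it within that window can lower the peak number of units busy in any modulo slot without changing `ii`.

```python
from itertools import product

def units_needed(starts, ii):
    """Peak number of same-type units busy in any modulo slot (1-cycle ops)."""
    usage = [0] * ii
    for t in starts.values():
        usage[t % ii] += 1
    return max(usage)

def respects_deps(starts, deps, ii):
    """Check start[dst] >= start[src] + latency - ii * distance for every edge."""
    return all(starts[dst] >= starts[src] + lat - ii * dist
               for src, dst, lat, dist in deps)

def min_units_schedule(windows, deps, ii):
    """Enumerate start times inside each slack window; keep the fewest-units schedule."""
    names = list(windows)
    best = None
    for combo in product(*(range(lo, hi + 1) for lo, hi in windows.values())):
        starts = dict(zip(names, combo))
        if not respects_deps(starts, deps, ii):
            continue
        n = units_needed(starts, ii)
        if best is None or n < best[0]:
            best = (n, starts)
    return best

# Hypothetical toy loop: a -> b -> c is the critical chain; d has one cycle of slack.
II = 2
windows = {"a": (0, 0), "b": (1, 1), "c": (2, 2), "d": (0, 1)}
deps = [("a", "b", 1, 0), ("b", "c", 1, 0)]  # (src, dst, latency, iteration distance)

asap = {name: lo for name, (lo, hi) in windows.items()}  # every op as early as possible
asap_units = units_needed(asap, II)
best_units, best_starts = min_units_schedule(windows, deps, II)
print(asap_units, best_units, best_starts)
```

In this toy case the as-early-as-possible schedule places `a`, `c`, and `d` in the same modulo slot, while moving `d` one cycle later within its slack balances the two slots at the same initiation interval. Exhaustive enumeration only works for tiny examples; the paper instead solves an ILP.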
