Energy-aware compilation and hardware design for VLIW embedded systems

Tomorrow's embedded devices need to run multimedia applications demanding high computational power with low energy consumption constraints. In this context, the register file is a key source of power consumption and its inappropriate design and management severely affects system power. In this paper, we present a new approach to reduce the energy of shared register files in forthcoming embedded VLIW processors running real-life applications up to 60% without performance penalty. This approach relies on limited hardware extensions and a compiler-based energy-aware register assignment algorithm to deactivate at run-time parts of the register file (i.e., sub-banks) in an independent way.

[1]  José L. Ayala Energy-Efficient Register Renaming in High-Performance Processors , 2003 .

[2]  Jośe L. Ayala Improving Register File Banking with a Power-Aware Unroller , 2004 .

[3]  David Blaauw,et al.  Drowsy caches: simple techniques for reducing leakage power , 2002, ISCA.

[4]  Deborah A. Wallach,et al.  Power Evaluation of a Handheld Computer , 2003, IEEE Micro.

[5]  Ricardo E. Gonzalez,et al.  Xtensa: A Configurable and Extensible Processor , 2000, IEEE Micro.

[6]  Nikil D. Dutt,et al.  Access pattern based local memory customization for low power embedded systems , 2001, Proceedings Design, Automation and Test in Europe. Conference and Exhibition 2001.

[7]  Tulika Mitra,et al.  Scalable custom instructions identification for instruction-set extensible processors , 2004, CASES '04.

[8]  Dean M. Tullsen,et al.  The effect of compiler optimizations on Pentium 4 power consumption , 2003, Seventh Workshop on Interaction Between Compilers and Computer Architectures, 2003. INTERACT-7 2003. Proceedings..

[9]  Geoffrey Brown,et al.  Lx: a technology platform for customizable VLIW embedded processing , 2000, ISCA '00.

[10]  Mahmut T. Kandemir,et al.  Leakage Current: Moore's Law Meets Static Power , 2003, Computer.

[11]  Margarida F. Jacome,et al.  FDRA: a software-pipelining algorithm for embedded VLIW processors , 2000, ISSS '00.

[12]  Paul Chow,et al.  Exploiting dual data-memory banks in digital signal processors , 1996, ASPLOS VII.

[13]  Rudy Lauwereins,et al.  CRISP: A Template for Reconfigurable Instruction Set Processors , 2001, FPL.

[14]  Luca Benini,et al.  System-level power optimization: techniques and tools , 1999, Proceedings. 1999 International Symposium on Low Power Electronics and Design (Cat. No.99TH8477).

[15]  Wayne H. Wolf,et al.  The future of multiprocessor systems-on-chips , 2004, Proceedings. 41st Design Automation Conference, 2004..

[16]  L. Benini,et al.  A Power Modeling and Estimation Framework for VLIW-based Embedded Systems , 2001 .

[17]  Victor V. Zyuban,et al.  Inherently Lower-Power High-Performance Superscalar Architectures , 2001, IEEE Trans. Computers.

[18]  Mahmut T. Kandemir,et al.  Automatic data migration for reducing energy consumption in multi-bank memory systems , 2002, DAC '02.

[19]  Tulika Mitra,et al.  Characterizing embedded applications for instruction-set extensible processors , 2004, Proceedings. 41st Design Automation Conference, 2004..

[20]  Margarida F. Jacome,et al.  CALiBeR: a software pipelining algorithm for clustered embedded VLIW processors , 2001, IEEE/ACM International Conference on Computer Aided Design. ICCAD 2001. IEEE/ACM Digest of Technical Papers (Cat. No.01CH37281).

[21]  Naehyuck Chang,et al.  Cycle-accurate energy consumption measurement and analysis: case study of ARM7TDMI , 2000, ISLPED '00.

[22]  Paolo Ienne,et al.  Automatic application-specific instruction-set extensions under microarchitectural constraints , 2003, Proceedings 2003. Design Automation Conference (IEEE Cat. No.03CH37451).

[23]  Scott A. Mahlke,et al.  Trimaran: An Infrastructure for Research in Instruction-Level Parallelism , 2004, LCPC.

[24]  Mateo Valero,et al.  Multiple-banked register file architectures , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).

[25]  J. C. Huang,et al.  Generalized loop-unrolling: a method for program speedup , 1999, Proceedings 1999 IEEE Symposium on Application-Specific Systems and Software Engineering and Technology. ASSET'99 (Cat. No.PR00122).

[26]  Victor V. Zyuban,et al.  The energy complexity of register files , 1998, Proceedings. 1998 International Symposium on Low Power Electronics and Design (IEEE Cat. No.98TH8379).

[27]  Nikil D. Dutt,et al.  Introduction of local memory elements in instruction set extensions , 2004, Proceedings. 41st Design Automation Conference, 2004..

[28]  Miodrag Potkonjak,et al.  MediaBench: a tool for evaluating and synthesizing multimedia and communications systems , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.

[29]  T. N. Vijaykumar,et al.  Reducing register ports for higher speed and lower energy , 2002, MICRO.

[30]  James E. Smith,et al.  Early-Stage Definition of LPX: A Low Power Issue-Execute Processor , 2002, PACS.

[31]  Heinrich Meyr,et al.  Design of Energy-Efficient Application-Specific Instruction Set Processors , 2004 .

[32]  Johan A. Pouwelse,et al.  Application-directed voltage scaling , 2003, IEEE Trans. Very Large Scale Integr. Syst..

[33]  Mahmut T. Kandemir,et al.  Energy-driven integrated hardware-software optimizations using SimplePower , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).

[34]  Vikas Agarwal,et al.  Static energy reduction techniques for microprocessor caches , 2003, IEEE Trans. Very Large Scale Integr. Syst..

[35]  Diederik Verkest,et al.  Power breakdown analysis for a heterogeneous NoC platform running a video application , 2005, 2005 IEEE International Conference on Application-Specific Systems, Architecture Processors (ASAP'05).

[36]  Jaume Abella,et al.  On reducing register pressure and energy in multiple-banked register files , 2003, Proceedings 21st International Conference on Computer Design.