SoftExplorer: Estimating and Optimizing the Power and Energy Consumption of a C Program for DSP Applications

We present a method to estimate the power and energy consumption of an algorithm directly from its C program. The method relies on three models: a model of the targeted processor (the power model), a model of the algorithm, and a model of the compiler (the prediction model). The power model is obtained through a functional-level power analysis. Five power models have been developed so far, for architectures ranging from the simple RISC ARM7 to the very complex VLIW DSP TI C64. They account for important phenomena such as cache misses, pipeline stalls, and internal/external memory accesses. The model of the algorithm expresses the algorithm's influence on the processor's activity. The prediction model represents the behavior of the compiler and how it allows the algorithm to use the processor's resources; the data mapping is taken into account at this stage. We have developed a tool, SoftExplorer, which performs estimation at both the C-level and the assembly level. Estimations on real-life digital signal processing applications show average errors of% at the C-level and% at the assembly level. We then show how SoftExplorer can be used to optimize the consumption of an application: first by finding the best data mapping for an algorithm, then by choosing the processor and its operating frequency so as to minimize the global energy consumption.
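The last step, choosing a processor and operating frequency to minimize global energy, can be illustrated with a minimal sketch. It assumes only the generic relation energy = average power × execution time, with execution time = cycles / frequency. The names `OperatingPoint`, `execution_time_s`, `energy_j`, and `pick_lowest_energy`, the deadline constraint, and all numeric values are hypothetical placeholders for illustration; they are not SoftExplorer's interface or measured results.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class OperatingPoint:
    """One candidate (processor, frequency) pair with its estimates.

    All fields are assumed inputs, e.g. produced by some power/time estimator.
    """
    processor: str        # hypothetical processor label
    frequency_mhz: float  # operating frequency in MHz
    power_w: float        # estimated average power at this frequency, in watts
    cycles: int           # estimated cycle count of the application


def execution_time_s(point: OperatingPoint) -> float:
    """Execution time in seconds: cycles divided by frequency in Hz."""
    return point.cycles / (point.frequency_mhz * 1e6)


def energy_j(point: OperatingPoint) -> float:
    """Energy in joules: average power multiplied by execution time."""
    return point.power_w * execution_time_s(point)


def pick_lowest_energy(points: Iterable[OperatingPoint],
                       deadline_s: float) -> Optional[OperatingPoint]:
    """Return the lowest-energy point that meets the deadline, or None."""
    feasible = [p for p in points if execution_time_s(p) <= deadline_s]
    if not feasible:
        return None
    return min(feasible, key=energy_j)


# Placeholder candidates; the numbers are invented for the example only.
CANDIDATES = [
    OperatingPoint("cpu_a", 100.0, 0.5, 2_000_000),
    OperatingPoint("cpu_a", 50.0, 0.2, 2_000_000),
    OperatingPoint("cpu_b", 200.0, 5.0, 500_000),
]

if __name__ == "__main__":
    best = pick_lowest_energy(CANDIDATES, deadline_s=0.03)
    if best is not None:
        print(best.processor, best.frequency_mhz, energy_j(best))
```

The sketch shows why the fastest configuration is not necessarily the most energy-efficient one: a slower operating point can consume less energy as long as it still meets the timing constraint.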
