A Methodology for Power-aware Pipelining via High-Level Performance Model Evaluations

Power is one of the major constraints in the design of embedded software. To reduce power consumption without sacrificing performance, software must be optimized to run as efficiently as possible on a given platform. When optimizing the mapping of a piece of software onto a multiprocessor system, designers often face a chicken-and-egg problem: whether to schedule tasks first or allocate memory first, since either step affects the optimization opportunities available to the other. Because each optimization affects the system's power consumption, it is critical to be able to monitor the effects of these transformations. In this paper we present a methodology that allows designers to quickly evaluate the impact each code optimization will have on the system's power. Our exploration engine relies on SystemC-based power/performance models to quickly and accurately evaluate the dynamic power due to memory accesses as well as the expected CPU power consumption.
