A State-Based Energy/Performance Model for Parallel Applications on Multicore Computers

In this paper, we propose a state-based energy/performance model for a given parallel application on multicore computer systems. By quantifying energy consumption at fine-grained levels, defined as states, we analyze the energy/performance impact while taking into account the application's characteristics and the energy features of multicore computers. By combining Amdahl's Law with the proposed model, we investigate the degree of parallelism and the computation intensity of a given application, and derive the number of cores and the frequencies that minimize energy consumption. We also explore extensions of energy/performance-efficiency metrics, including Energy Per Speedup<sup>α</sup> (EPS<sup>α</sup>), Power Per Speedup<sup>α</sup> (PPS<sup>α</sup>), Dynamic Energy Per Speedup<sup>α</sup> (DEPS<sup>α</sup>) and Dynamic Power Per Speedup<sup>α</sup> (DPPS<sup>α</sup>), which apply a weight α to speedup to better reflect energy/performance tradeoffs, especially for parallel applications on multicore platforms. The proposed state-based model and metrics provide a novel approach to estimating the energy/performance impact at a fine-grained level, and offer guidance in achieving tradeoffs between performance and energy consumption for parallel applications on multicore platforms.
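The abstract does not give the model's state definitions or power parameters, so the following is only a minimal sketch of the general idea, not the paper's model. It assumes a two-state application (a serial state and a parallel state, following Amdahl's Law), a compute-bound workload whose run time scales inversely with frequency, dynamic power proportional to the cube of frequency, and hypothetical constants (`p_base`, `p_static`, `c_dyn`, `t1`, `f_max`). It also assumes that EPS<sup>α</sup> means energy divided by speedup raised to the power α, which is suggested by the notation but not confirmed here.

```python
from itertools import product


def amdahl_speedup(n, p):
    """Amdahl's Law: speedup on n cores for parallel fraction p."""
    return 1.0 / ((1.0 - p) + p / n)


def run_time(n, f, p, t1=100.0, f_max=2.0):
    """Run time on n cores at frequency f (assumed compute-bound).

    t1 is the hypothetical single-core run time at f_max.
    """
    return t1 * (f_max / f) * ((1.0 - p) + p / n)


def energy(n, f, p, t1=100.0, f_max=2.0,
           p_base=20.0, p_static=5.0, c_dyn=2.0):
    """Total energy summed over the two assumed states.

    Serial state: one core busy, all n cores draw static power.
    Parallel state: all n cores busy.
    p_base is a system-wide power drawn for the whole run.
    """
    scale = t1 * (f_max / f)
    t_serial = scale * (1.0 - p)
    t_parallel = scale * p / n
    p_dyn = c_dyn * f ** 3  # assumed cubic dynamic power per busy core
    e_base = (t_serial + t_parallel) * p_base
    e_serial = t_serial * (p_dyn + n * p_static)
    e_parallel = t_parallel * n * (p_dyn + p_static)
    return e_base + e_serial + e_parallel


def eps_alpha(n, f, p, alpha, f_max=2.0):
    """Assumed form of the metric: energy / speedup**alpha.

    Speedup is measured against one core running at f_max.
    """
    speedup = run_time(1, f_max, p) / run_time(n, f, p)
    return energy(n, f, p) / speedup ** alpha


def best_config(p, alpha, cores=range(1, 9), freqs=(1.0, 1.5, 2.0)):
    """Exhaustively search (cores, frequency) pairs for the lowest metric."""
    return min(product(cores, freqs),
               key=lambda nf: eps_alpha(nf[0], nf[1], p, alpha))


if __name__ == "__main__":
    for alpha in (0.0, 1.0, 2.0):
        n, f = best_config(p=0.95, alpha=alpha)
        print(f"alpha={alpha}: cores={n}, frequency={f}")
```

With α = 0 the metric reduces to plain energy, so the search returns the minimum-energy configuration; larger values of α give speedup more weight. The exhaustive search is used only because the configuration space in this sketch is tiny.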
