Learning-Based Run-Time Power and Energy Management of Multi/Many-Core Systems: Current and Future Trends

Multi/Many-core systems are prevalent in several application domains targeting different scales of computing such as embedded and cloud computing. These systems are able to fulfil the ever-increasing performance requirements by exploiting their parallel processing capabilities. However, effective power/energy management is required during system operations due to several reasons such as to increase the operational time of battery operated systems, reduce the energy cost of datacenters, and improve thermal efficiency and reliability. This article provides an extensive survey of learning-based run-time power/energy management approaches. The survey includes a taxonomy of the learning-based approaches. These approaches perform design-time and/or run-time power/energy management by employing some learning principles such as reinforcement learning. The survey also highlights the trends followed by the learning-based run-time power management approaches, their upcoming trends and open research challenges.

[1]  Haoran Li,et al.  Modular reinforcement learning for self-adaptive energy efficiency optimization in multicore system , 2017, 2017 22nd Asia and South Pacific Design Automation Conference (ASP-DAC).

[2]  Piotr Dziurzanski,et al.  A Survey and Comparative Study of Hard and So Real-time Dynamic Resource Allocation Strategies for Multi / Many-core Systems , 2017 .

[3]  Trevor Mudge,et al.  MiBench: A free, commercially representative embedded benchmark suite , 2001 .

[4]  Onur Mutlu,et al.  Self-Optimizing Memory Controllers: A Reinforcement Learning Approach , 2008, 2008 International Symposium on Computer Architecture.

[5]  Bronis R. de Supinski,et al.  Prediction models for multi-dimensional power-performance optimization on many cores , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).

[6]  Michael O. Duff,et al.  Reinforcement Learning Methods for Continuous-Time Markov Decision Problems , 1994, NIPS.

[7]  Christine A. Shoemaker,et al.  Scalable thread scheduling and global power management for heterogeneous many-core architectures , 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).

[8]  Qiang Xu,et al.  Learning-Based Power Management for Multicore Processors via Idle Period Manipulation , 2014, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.

[9]  Geoff V. Merrett,et al.  Adaptive energy minimization of embedded heterogeneous systems using regression-based learning , 2015, 2015 25th International Workshop on Power and Timing Modeling, Optimization and Simulation (PATMOS).

[10]  Rainer Leupers,et al.  MAPS: An integrated framework for MPSoC application parallelization , 2008, 2008 45th ACM/IEEE Design Automation Conference.

[11]  Hao Shen,et al.  Contention aware frequency scaling on CMPs with guaranteed quality of service , 2014, 2014 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[12]  Lieven Eeckhout,et al.  Scheduling heterogeneous multi-cores through performance impact estimation (PIE) , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).

[13]  Amit Kumar Singh,et al.  Energy optimization by exploiting execution slacks in streaming applications on Multiprocessor Systems , 2013, 2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC).

[14]  Pasi Liljeberg,et al.  Energy-Efficient Virtual Machines Consolidation in Cloud Data Centers Using Reinforcement Learning , 2014, 2014 22nd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing.

[15]  Ishfaq Ahmad,et al.  Efficient heuristics for joint optimization of performance, energy, and temperature in allocating tasks to multi-core processors , 2014, International Green Computing Conference.

[16]  László Monostori,et al.  Machine Learning Approaches to Manufacturing , 1996 .

[17]  Haoran Li,et al.  JADE: a Heterogeneous Multiprocessor System Simulation Platform Using Recorded and Statistical Application Models , 2016, AISTECS '16.

[18]  Vikram Krishnamurthy,et al.  V-BLAST Power and Rate Control under Delay Constraints in Markovian Fading Channels - Optimality of Monotonic Policies , 2006, 2006 IEEE International Symposium on Information Theory.

[19]  Hiroshi Sasaki,et al.  Coordinated power-performance optimization in manycores , 2013, Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques.

[20]  Hao Shen,et al.  Learning based DVFS for simultaneous temperature, performance and energy management , 2012, Thirteenth International Symposium on Quality Electronic Design (ISQED).

[21]  Wai Ho Mow,et al.  A Case Study on the Communication and Computation Behaviors of Real Applications in NoC-Based MPSoCs , 2014, 2014 IEEE Computer Society Annual Symposium on VLSI.

[22]  Rashedur M. Rahman,et al.  Energy-aware VM consolidation approach using combination of heuristics and migration control , 2014, Ninth International Conference on Digital Information Management (ICDIM 2014).

[23]  Rajarshi Das,et al.  Utility-Function-Driven Resource Allocation in Autonomic Systems , 2005, Second International Conference on Autonomic Computing (ICAC'05).

[24]  Benoît Dupont de Dinechin,et al.  Time-critical computing on a single-chip massively parallel processor , 2014, 2014 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[25]  Geoff V. Merrett,et al.  Adaptive and Hierarchical Runtime Manager for Energy-Aware Thermal Management of Embedded Systems , 2016, ACM Trans. Embed. Comput. Syst..

[26]  Pedro López,et al.  Multi2Sim: A Simulation Framework to Evaluate Multicore-Multithreaded Processors , 2007, 19th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD'07).

[27]  Christian Bienia,et al.  PARSEC 2.0: A New Benchmark Suite for Chip-Multiprocessors , 2009 .

[28]  Babak Falsafi,et al.  Scale-out processors , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).

[29]  Hannu Tenhunen,et al.  Guest Editors' Introduction: Multiprocessor Systems-on-Chips , 2005, Computer.

[30]  Yale N. Patt,et al.  Feedback-driven threading: power-efficient and high-performance execution of multi-threaded workloads on CMPs , 2008, ASPLOS.

[31]  Luca Benini,et al.  Networks on Chips : A New SoC Paradigm , 2022 .

[32]  Radu Marculescu,et al.  Energy-aware mapping for tile-based NoC architectures under performance constraints , 2003, ASP-DAC '03.

[33]  Engin Ipek,et al.  Coordinated management of multiple interacting resources in chip multiprocessors: A machine learning approach , 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.

[34]  Jing-Yang Jou,et al.  Scalable Power Management Using Multilevel Reinforcement Learning for Multiprocessors , 2014, TODE.

[35]  G. Dhiman,et al.  Dynamic Power Management Using Machine Learning , 2006, 2006 IEEE/ACM International Conference on Computer Aided Design.

[36]  Tobias Bjerregaard,et al.  A survey of research and practices of Network-on-chip , 2006, CSUR.

[37]  Amit Kumar Singh,et al.  Mapping on multi/many-core systems: Survey of current and emerging trends , 2013, 2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC).

[38]  Vanish Talwar,et al.  No "power" struggles: coordinated multi-level power management for the data center , 2008, ASPLOS.

[39]  Amit Kumar Singh,et al.  Defragmentation for Efficient Runtime Resource Management in NoC-Based Many-Core Systems , 2016, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.

[40]  Grant Martin,et al.  Overview of the MPSoC design challenge , 2006, 2006 43rd ACM/IEEE Design Automation Conference.

[41]  Amit Kumar Singh,et al.  Communication-aware heuristics for run-time task mapping on NoC-based MPSoC platforms , 2010, J. Syst. Archit..

[42]  Sander Stuijk,et al.  MNEMEE – An Automated Toolflow for Parallelization and Memory Management in MPSoC Platforms , 2011 .

[43]  Massoud Pedram,et al.  Dynamic voltage and frequency scaling based on workload decomposition , 2004, Proceedings of the 2004 International Symposium on Low Power Electronics and Design (IEEE Cat. No.04TH8758).

[44]  Coniferous softwood GENERAL TERMS , 2003 .

[45]  Roger F. Woods,et al.  Runtime support for adaptive power capping on heterogeneous SoCs , 2016, 2016 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS).

[46]  Margaret Martonosi,et al.  An Analysis of Efficient Multi-Core Global Power Management Policies: Maximizing Performance for a Given Power Budget , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).

[47]  Siddharth Garg,et al.  Learning the optimal operating point for many-core systems with extended range voltage/frequency scaling , 2013, 2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS).

[48]  Jung Ho Ahn,et al.  McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[49]  Elwin Chandra Monie,et al.  Hardware Architecture of Reinforcement Learning Scheme for Dynamic Power Management in Embedded Systems , 2007, EURASIP J. Embed. Syst..

[50]  Xiaobo Zhou,et al.  Autonomic performance and power control for co-located Web applications on virtualized servers , 2013, 2013 IEEE/ACM 21st International Symposium on Quality of Service (IWQoS).

[51]  Narasimhan Sundararajan,et al.  A generalized growing and pruning RBF (GGAP-RBF) neural network for function approximation , 2005, IEEE Transactions on Neural Networks.

[52]  Massoud Pedram,et al.  Improving the Efficiency of Power Management Techniques by Using Bayesian Classification , 2008, ISQED 2008.

[53]  Xiaowei Li,et al.  An Analytical Framework for Estimating Scale-Out and Scale-Up Power Efficiency of Heterogeneous Manycores , 2016, IEEE Transactions on Computers.

[54]  Luis Alfonso Maeda-Nunez,et al.  Learning Transfer-Based Adaptive Energy Minimization in Embedded Systems , 2016, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.

[55]  Naehyuck Chang,et al.  Machine learning-based energy management in a hybrid electric vehicle to minimize total operating cost , 2015, 2015 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).

[56]  Diana Marculescu,et al.  Distributed reinforcement learning for power limited many-core system performance optimization , 2015, 2015 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[57]  Giovanni De Micheli,et al.  An adaptive low-power transmission scheme for on-chip networks , 2002, 15th International Symposium on System Synthesis, 2002..

[58]  Massoud Pedram,et al.  A Reinforcement Learning-Based Power Management Framework for Green Computing Data Centers , 2016, 2016 IEEE International Conference on Cloud Engineering (IC2E).

[59]  Anoop Gupta,et al.  The SPLASH-2 programs: characterization and methodological considerations , 1995, ISCA.

[60]  Ying Tan,et al.  Achieving autonomous power management using reinforcement learning , 2013, TODE.

[61]  Wei Liu,et al.  Enhanced Q-learning algorithm for dynamic power management with performance constraint , 2010, 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010).

[62]  Amit Kumar Singh,et al.  Run-Time Computation and Communication Aware Mapping Heuristic for NoC-Based Heterogeneous MPSoC Platforms , 2011, 2011 Fourth International Symposium on Parallel Architectures, Algorithms and Programming.

[63]  Rajarshi Das,et al.  A Hybrid Reinforcement Learning Approach to Autonomic Resource Allocation , 2006, 2006 IEEE International Conference on Autonomic Computing.

[64]  Bernhard Schölkopf,et al.  A tutorial on support vector regression , 2004, Stat. Comput..

[65]  Daniël Paulusma,et al.  Run-time mapping of applications to a heterogeneous reconfigurable tiled system on chip architecture , 2004, Proceedings. 2004 IEEE International Conference on Field- Programmable Technology (IEEE Cat. No.04EX921).

[66]  Gurindar S. Sohi,et al.  Adaptive, efficient, parallel execution of parallel programs , 2014, PLDI.

[67]  Mark D. Pendrith On Reinforcement Learning of Control Actions in Noisy and Non-Markovian Domains , 1994 .

[68]  Massoud Pedram,et al.  Model-Free Reinforcement Learning and Bayesian Classification in System-Level Power Management , 2016, IEEE Transactions on Computers.

[69]  Gerald Tesauro,et al.  Online Resource Allocation Using Decompositional Reinforcement Learning , 2005, AAAI.

[70]  Wei Liu,et al.  Adaptive power management using reinforcement learning , 2009, 2009 IEEE/ACM International Conference on Computer-Aided Design - Digest of Technical Papers.

[71]  Luca Benini,et al.  Event-driven power management , 2001, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst..

[72]  Diana Marculescu,et al.  Power-aware performance increase via core/uncore reinforcement control for chip-multiprocessors , 2012, ISLPED '12.

[73]  Jordi Torres,et al.  Towards energy-aware scheduling in data centers using machine learning , 2010, e-Energy.

[74]  Changjun Jiang,et al.  Autonomic Performance and Power Control for Co-Located Web Applications in Virtualized Datacenters , 2016, IEEE Transactions on Parallel and Distributed Systems.

[75]  Michael F. P. O'Boyle,et al.  Smart multi-task scheduling for OpenCL programs on CPU/GPU heterogeneous platforms , 2014, 2014 21st International Conference on High Performance Computing (HiPC).

[76]  Hsien-Hsin S. Lee,et al.  Extending Amdahl's Law for Energy-Efficient Computing in the Many-Core Era , 2008, Computer.

[77]  Günther Palm,et al.  Value-Difference Based Exploration: Adaptive Control between Epsilon-Greedy and Softmax , 2011, KI.

[78]  Sherief Reda,et al.  Pack & Cap: Adaptive DVFS and thread packing under power caps , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[79]  Massoud Pedram,et al.  Supervised Learning Based Power Management for Multicore Processors , 2010, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.

[80]  Saurabh Dighe,et al.  An 80-Tile 1.28TFLOPS Network-on-Chip in 65nm CMOS , 2007, 2007 IEEE International Solid-State Circuits Conference. Digest of Technical Papers.