Scalable Power Management Using Multilevel Reinforcement Learning for Multiprocessors

Dynamic power management has become an essential design consideration for achieving energy efficiency in modern systems. Among the various power management schemes, learning-based policies, which adapt to different environments and applications, have demonstrated performance superior to other approaches. However, they scale poorly on multiprocessors as the number of cores in a system grows. In this article, we propose a scalable and effective online policy called MultiLevel Reinforcement Learning (MLRL). By exploiting a hierarchical paradigm, MLRL achieves a time complexity of O(n lg n) for n cores, and its convergence is greatly accelerated by compressing the redundant search space. Advanced techniques, such as function approximation and an action selection scheme, are included to enhance the generality and stability of the proposed policy. In simulations on the SPLASH-2 benchmarks, MLRL runs 53% faster and outperforms the state-of-the-art work, with 13.6% energy saving and a 2.7% latency penalty on average. The generality and scalability of MLRL are also validated through extensive simulations.
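
As a rough illustration of where an O(n lg n) bound can come from in a hierarchical scheme, the following sketch groups cores pairwise, level by level, and counts one unit of work per core per level. This is not the MLRL algorithm itself: the pairwise grouping, the function names `hierarchy_levels` and `work_units`, and the unit-cost model are all assumptions made only for this example.

```python
def hierarchy_levels(n_cores):
    """Build a hypothetical hierarchy: merge neighbouring groups of cores
    pairwise until a single root group covers every core."""
    levels = [[(i,) for i in range(n_cores)]]  # level 0: one group per core
    while len(levels[-1]) > 1:
        prev = levels[-1]
        merged = [
            prev[i] + prev[i + 1] if i + 1 < len(prev) else prev[i]
            for i in range(0, len(prev), 2)
        ]
        levels.append(merged)
    return levels


def work_units(n_cores):
    """Assumed cost model: each level touches every core exactly once,
    so the total is n * (number of levels), i.e. O(n lg n)."""
    return sum(len(group) for level in hierarchy_levels(n_cores)
               for group in level)


if __name__ == "__main__":
    for n in (4, 8, 16, 64):
        flat = 2 ** n  # joint on/off action space of a flat policy
        print(f"n={n:3d}  levels={len(hierarchy_levels(n))}  "
              f"hierarchical work={work_units(n)}  flat actions={flat}")
```

Under these assumptions, eight cores give four levels and 32 work units, whereas a flat policy choosing an on/off state for all eight cores jointly would face 2^8 = 256 combined actions. The gap widens rapidly with n, which is the scalability problem the abstract describes.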
