Modular reinforcement learning for self-adaptive energy efficiency optimization in multicore system

Energy efficiency is increasingly important to modern computing systems with multi-core and many-core architectures. Dynamic Voltage and Frequency Scaling (DVFS), an effective low-power technique, has been widely applied to improve energy efficiency in commercial multicore systems. However, because of the large number of cores and the growing complexity of emerging applications, it is difficult to find a globally optimized voltage/frequency assignment efficiently at runtime. To improve the energy efficiency of the overall multicore system, we propose an online DVFS control strategy based on core-level Modular Reinforcement Learning (MRL) that adaptively selects an appropriate operating frequency for each individual core. Instead of focusing solely on local core conditions, MRL makes its decisions by considering the running states of multiple cores, without incurring the exponential memory cost that traditional monolithic Reinforcement Learning (RL) requires. Experimental results on various realistic applications and different system scales show that the proposed approach improves energy efficiency by up to 28% compared to a recent individual-RL approach.
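The memory argument can be made concrete with a small sketch. The code below is illustrative only and is not the authors' implementation: it assumes each core discretizes its running state into `n_states` levels, chooses among `n_actions` frequency levels, and keeps one small Q-table per peer core, indexed by the pair (own state, peer state). Per-module Q-values are summed to pick an action, which is one common modular-RL merging rule. All names, the state encoding, and the hyperparameters are hypothetical.

```python
import random


def monolithic_entries(n_cores, n_states, n_actions):
    # A single table indexed by the joint state of all cores (assumed encoding).
    return (n_states ** n_cores) * n_actions


def modular_entries(n_cores, n_states, n_actions):
    # Each core keeps one (own state, peer state) table per peer core.
    return n_cores * (n_cores - 1) * (n_states ** 2) * n_actions


class ModularAgent:
    """Hypothetical per-core agent: one Q-learning module per peer core."""

    def __init__(self, n_peers, n_states, n_actions,
                 alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        # q[module][own_state][peer_state][action], initialized to zero.
        self.q = [[[[0.0] * n_actions for _ in range(n_states)]
                   for _ in range(n_states)]
                  for _ in range(n_peers)]

    def select_action(self, own, peers):
        # Epsilon-greedy over the sum of the modules' Q-values.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        totals = [sum(self.q[m][own][p][a] for m, p in enumerate(peers))
                  for a in range(self.n_actions)]
        return max(range(self.n_actions), key=totals.__getitem__)

    def update(self, own, peers, action, reward, next_own, next_peers):
        # Standard Q-learning update, applied independently in every module.
        for m, (p, next_p) in enumerate(zip(peers, next_peers)):
            best_next = max(self.q[m][next_own][next_p])
            target = reward + self.gamma * best_next
            old = self.q[m][own][p][action]
            self.q[m][own][p][action] = old + self.alpha * (target - old)


if __name__ == "__main__":
    # Table sizes for an assumed 16-core system, 4 state levels, 3 frequencies.
    print(monolithic_entries(16, 4, 3))
    print(modular_entries(16, 4, 3))
```

Under these assumptions the monolithic table grows exponentially with the number of cores, while the modular tables grow quadratically across the whole system, which is the trade-off the abstract describes.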
