Autonomous Power Management With Double-Q Reinforcement Learning Method

Energy efficiency and autonomous power management are critical for mobile-edge computing. Because of the limited capacity of embedded batteries, reducing the energy consumption of multiple applications running concurrently on a mobile device while maintaining performance is a challenging optimization problem. To extend battery life, dynamic voltage and frequency scaling (DVFS) has been widely used in mobile devices to minimize energy consumption. However, most conventional DVFS techniques scale the operating frequency according to static policies and therefore adapt poorly to changing system conditions. To improve adaptivity, this article proposes a Double-Q power management approach that scales the operating frequency based on learning. The Double-Q method maintains two Q tables with two corresponding update functions. At each decision point, one of the Q tables is randomly selected and updated, while the other provides the value estimate used in the update target. This mechanism reduces the overestimation of Q values and consequently improves the accuracy of frequency predictions. To evaluate the effectiveness of the proposed approach, a Double-Q governor is implemented in the Linux kernel. The approach is computationally lightweight, and experimental results indicate that it achieves 5%-18% total energy savings compared to the ondemand and conservative governors, as well as a Q-learning-based method.
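
As a rough illustration of the update rule described above, the sketch below implements the standard tabular Double-Q update in Python. It is a minimal sketch only: the state encoding, action set (indices into a hypothetical frequency table), reward signal, and hyperparameters are placeholder assumptions for illustration, not the governor implementation described in this article.

```python
import random

# Hypothetical action set: indices into a table of CPU frequency levels.
ACTIONS = [0, 1, 2, 3]
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1   # assumed hyperparameters

Q_A, Q_B = {}, {}                        # the two Q tables

def q(table, s, a):
    # Unvisited state-action pairs default to 0.
    return table.get((s, a), 0.0)

def choose_action(s):
    # Epsilon-greedy selection over the sum of both tables.
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q(Q_A, s, a) + q(Q_B, s, a))

def update(s, a, reward, s_next):
    # Randomly pick one table to update; the other table supplies the
    # value estimate for the greedy next action, which is what reduces
    # the overestimation bias of single-table Q-learning.
    if random.random() < 0.5:
        learn, estimate = Q_A, Q_B
    else:
        learn, estimate = Q_B, Q_A
    a_star = max(ACTIONS, key=lambda a2: q(learn, s_next, a2))
    target = reward + GAMMA * q(estimate, s_next, a_star)
    learn[(s, a)] = q(learn, s, a) + ALPHA * (target - q(learn, s, a))
```

In a DVFS setting, the state would typically summarize recent workload (e.g., CPU utilization), the action would select a frequency level, and the reward would trade off energy consumption against performance; those design choices are specific to the governor and are not fixed by the sketch.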
