zTT: learning-based DVFS with zero thermal throttling for mobile devices

DVFS (dynamic voltage and frequency scaling) is a system-level technique that adjusts the voltage and frequency levels of the CPU and GPU at runtime to balance energy efficiency and performance. Although DVFS has been studied for many years, realizing a DVFS that performs ideally on mobile devices is still considered challenging for two main reasons: i) the optimal power budget distribution between CPU and GPU in a power-constrained platform can only be defined by the application performance, but conventional DVFS implementations are mostly application-agnostic; ii) mobile platforms experience dynamic thermal environments for many reasons, such as mobility and the way the device is held, but conventional implementations do not adapt well enough to such environmental changes. In this work, we propose zTT, a deep reinforcement learning-based frequency scaling technique. zTT learns the characteristics of the thermal environment and jointly scales CPU and GPU frequencies to maximize application performance in an energy-efficient manner while achieving zero thermal throttling. Our evaluations of zTT, implemented on the Google Pixel 3a and NVIDIA Jetson TX2 platforms with various applications, show that zTT adapts quickly to changing thermal environments and consistently delivers high application performance with energy efficiency. In a high-temperature environment where a rendering application under the default mobile DVFS fails to sustain the target frame rate, zTT succeeds in doing so while consuming 23.9% less power on average.
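To make the idea of jointly scaling CPU and GPU frequencies with a learned policy more concrete, the following is a minimal sketch only, not the authors' implementation. The abstract describes a deep reinforcement learning agent; for brevity this sketch swaps in tabular Q-learning. The frequency tables, the reward weights, and the names `reward`, `select_action`, and `update` are all hypothetical and chosen purely for illustration.

```python
import random

# Hypothetical frequency tables in MHz; real devices expose their own levels.
CPU_FREQS = [600, 1200, 1800]
GPU_FREQS = [300, 600, 900]
# A joint action selects one CPU level and one GPU level at the same time.
ACTIONS = [(c, g) for c in CPU_FREQS for g in GPU_FREQS]


def reward(fps, target_fps, power_w, temp_c, temp_limit_c):
    """Toy reward (assumed form): favor reaching the target frame rate at
    low power, and penalize any temperature above the throttling limit."""
    perf = min(fps / target_fps, 1.0)  # no extra credit beyond the target
    thermal_penalty = 10.0 * max(0.0, temp_c - temp_limit_c)
    return perf - 0.1 * power_w - thermal_penalty


def select_action(q, state, epsilon, rng):
    """Epsilon-greedy choice of an index into ACTIONS."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    values = [q.get((state, a), 0.0) for a in range(len(ACTIONS))]
    return values.index(max(values))


def update(q, state, action, r, next_state, alpha=0.5, gamma=0.9):
    """One standard Q-learning update on a dictionary-backed Q-table."""
    best_next = max(q.get((next_state, a), 0.0) for a in range(len(ACTIONS)))
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (r + gamma * best_next - old)
```

In this sketch the thermal term dominates the reward once the temperature crosses the limit, so an agent trained with it would learn to avoid frequency pairs that lead to throttling. The actual state representation, reward shaping, and network used by zTT are not specified in the abstract and are not reproduced here.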
