Q-Learning-based Adaptive Power Management for IoT System-on-Chips with Embedded Power States

This paper introduces an Adaptive Power Management (APM) hardware module based on reinforcement learning. The APM optimizes power consumption during the suspend state of an Internet-of-Things (IoT) System-on-Chip (SoC) with 8 embedded power states. A Q-Learning algorithm with a counter-based exploration policy is chosen and implemented. A complete analysis is carried out to define the algorithm's parameters and to characterize the proposed solution. A hardware implementation is also presented, covering the APM design and the simplifications made to suit Ultra-Low-Power hardware. The solution yields a long-term average saving of 17% in power consumption during system suspend time.
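To make the combination of Q-Learning and counter-based exploration concrete, the following is a minimal software sketch, not the hardware design described in the paper. Only the number of actions (8 power states) comes from the abstract. The state encoding, the reward signal, the exploration bonus of the form bonus / (1 + visit count), and the values of `alpha`, `gamma`, and `bonus` are all illustrative assumptions.

```python
from collections import defaultdict

N_POWER_STATES = 8  # from the abstract: 8 embedded power states


class CounterBasedQLearner:
    """Tabular Q-Learning with a visit-count exploration bonus (illustrative)."""

    def __init__(self, n_actions=N_POWER_STATES, alpha=0.1, gamma=0.9, bonus=1.0):
        # alpha, gamma and bonus are assumed values, not taken from the paper.
        self.n_actions = n_actions
        self.alpha = alpha    # learning rate
        self.gamma = gamma    # discount factor
        self.bonus = bonus    # weight of the exploration bonus
        self.q = defaultdict(lambda: [0.0] * n_actions)     # Q(s, a)
        self.counts = defaultdict(lambda: [0] * n_actions)  # visits of (s, a)

    def select_action(self, state):
        # Rarely tried actions receive a larger bonus, so every power state
        # is explored deterministically, without random action selection.
        q_row = self.q[state]
        c_row = self.counts[state]
        scores = [q_row[a] + self.bonus / (1 + c_row[a])
                  for a in range(self.n_actions)]
        return max(range(self.n_actions), key=scores.__getitem__)

    def update(self, state, action, reward, next_state):
        # Standard Q-Learning update; the visit counter is incremented first.
        self.counts[state][action] += 1
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


if __name__ == "__main__":
    learner = CounterBasedQLearner()
    state = "suspend"  # hypothetical single-state encoding
    for _ in range(N_POWER_STATES):
        action = learner.select_action(state)
        # Hypothetical reward: negative energy cost of the chosen power state.
        learner.update(state, action, reward=-1.0, next_state=state)
    print(learner.counts[state])
```

In this sketch the reward would typically be the negative of the energy spent during a suspend period, so that maximizing return minimizes consumption. How the paper actually defines states and rewards is not stated in the abstract.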
