Hardware Architecture of Reinforcement Learning Scheme for Dynamic Power Management in Embedded Systems

Dynamic power management (DPM) is a technique for reducing the power consumption of electronic systems by selectively shutting down idle components. This paper adopts a novel, nontrivial enhancement of conventional reinforcement learning (RL) to select the best policy from a set of existing DPM policies. It also proposes a hardware architecture, derived from a VHDL model of the temporal difference (TD) RL algorithm, that suggests the winning policy for a given workload in order to achieve power savings. The effectiveness of the approach is demonstrated with an event-driven simulator, written in Java, for power-manageable embedded devices. The results show that RL applied to DPM can yield power savings of up to 28%.
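To illustrate the general idea of letting a TD-style learner pick a winner among existing DPM policies, the following is a minimal Python sketch. It is not the paper's enhanced RL scheme or its VHDL architecture. It assumes a simplified single-state setting, and the policy names (`always_on`, `timeout`, `greedy`), the reward model, and the parameters `alpha`, `gamma`, and `epsilon` are all hypothetical choices made for illustration.

```python
import random


class PolicySelector:
    """Keeps one value estimate per DPM policy and updates it TD(0)-style.

    Illustrative only: single-state simplification, hypothetical parameters.
    """

    def __init__(self, policies, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.policies = list(policies)
        self.alpha = alpha      # learning rate (assumed value)
        self.gamma = gamma      # discount factor (assumed value)
        self.epsilon = epsilon  # exploration probability (assumed value)
        self.q = {p: 0.0 for p in self.policies}
        self.rng = random.Random(seed)

    def choose(self):
        # Epsilon-greedy: explore occasionally, otherwise pick the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.policies)
        return max(self.policies, key=lambda p: self.q[p])

    def update(self, policy, reward):
        # TD(0) target: immediate reward plus discounted best current estimate.
        target = reward + self.gamma * max(self.q.values())
        self.q[policy] += self.alpha * (target - self.q[policy])

    def winner(self):
        # The policy with the highest learned value is the suggested winner.
        return max(self.policies, key=lambda p: self.q[p])


def simulated_power_saving(policy, rng):
    # Hypothetical reward: made-up mean saving per policy plus small noise.
    mean = {"always_on": 0.0, "timeout": 0.15, "greedy": 0.25}[policy]
    return mean + rng.uniform(-0.02, 0.02)


def run(steps=5000, seed=0):
    rng = random.Random(seed)
    selector = PolicySelector(["always_on", "timeout", "greedy"], seed=seed)
    for _ in range(steps):
        policy = selector.choose()
        selector.update(policy, simulated_power_saving(policy, rng))
    return selector.winner()


if __name__ == "__main__":
    print(run())
```

In this toy setup the reward stands in for the power saved during one observation window. A real evaluation would measure it from a workload trace or a simulator rather than from fixed made-up means.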
