Dynamic power management utilizing reinforcement learning with fuzzy reward for energy harvesting wireless sensor nodes

This paper considers a scenario in which a wireless sensor node is powered by harvested energy whose supply is ambiguous and uncertain. Reinforcement learning with fuzzy reward (RLFR) is used for the dynamic power management of such energy-harvesting wireless sensor nodes. By interacting with its environment, the RLFR agent adjusts the duty cycle of the data-sensing task according to signals related to the variable incoming energy. The outcomes of these interactions are evaluated by a fuzzy reward that expresses how well each duty-cycle adjustment satisfies the given energy-neutrality requirement. Simulation results show that RLFR not only satisfies the sensing requirement while maintaining energy neutrality, but also achieves better energy utilization, in terms of residual battery energy, than an existing dynamic power management method.
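
As a rough illustration of the idea described above, the following Python sketch pairs a tabular Q-learning update with a fuzzy reward that scores how close a duty-cycle choice comes to energy neutrality. It is not the formulation used in the paper: the triangular membership function, the normalized energy units, the discrete harvest levels and duty cycles, the randomly drawn next state, and all function names are assumptions made only for this example.

```python
import random

# Hypothetical discrete sets (assumptions, not taken from the paper).
HARVEST_LEVELS = [0.2, 0.4, 0.6, 0.8]  # normalized harvested energy per period (state)
DUTY_CYCLES = [0.2, 0.4, 0.6, 0.8]     # normalized energy spent per period (action)


def triangular(x, left, peak, right):
    """Triangular fuzzy membership: 1.0 at peak, falling to 0.0 at left and right."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzy_reward(harvested, consumed):
    """Degree in [0, 1] to which consumption matches harvest (1.0 = energy neutral)."""
    return triangular(harvested - consumed, -1.0, 0.0, 1.0)


def q_update(q_old, reward, max_next_q, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update for a single table entry."""
    return q_old + alpha * (reward + gamma * max_next_q - q_old)


def train(steps=2000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Epsilon-greedy Q-learning over a toy environment with random harvest levels."""
    rng = random.Random(seed)
    q = [[0.0] * len(DUTY_CYCLES) for _ in HARVEST_LEVELS]
    state = rng.randrange(len(HARVEST_LEVELS))
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(DUTY_CYCLES))  # explore
        else:
            action = max(range(len(DUTY_CYCLES)), key=lambda a: q[state][a])  # exploit
        reward = fuzzy_reward(HARVEST_LEVELS[state], DUTY_CYCLES[action])
        # Assumption: the next harvest level is drawn at random, independent of the action.
        next_state = rng.randrange(len(HARVEST_LEVELS))
        q[state][action] = q_update(q[state][action], reward, max(q[next_state]), alpha, gamma)
        state = next_state
    return q
```

Because the reward is a membership degree rather than a crisp pass/fail signal, duty cycles that nearly balance consumption against harvest still earn partial credit, which is one plausible reading of how a fuzzy reward can grade adjustments against the energy-neutrality requirement.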
