Controlling Action Space of Reinforcement-Learning-Based Energy Management in Batteryless Applications

Duty cycle management is critical for the energy-neutral operation of batteryless devices. Many duty cycling methods have been proposed, including machine-learning-based approaches, but existing methods can barely handle the dynamic harvesting environments of batteryless devices. Specifically, most machine-learning-based methods require harvesting patterns to be collected in advance and the duty-cycle boundaries to be configured manually. In this article, we propose a configuration-free duty cycling scheme for batteryless devices, called CTRL, with which energy harvesting nodes tune their own duty cycle, adapting to the surrounding environment without user intervention. The approach combines reinforcement learning (RL) with a control system so that the learning algorithm can explore the entire search space automatically. Instead of explicitly setting the target task frequency at a given time, the learning algorithm sets the target State of Charge (SoC) of the energy storage. The control system then meets the target SoC by adjusting the duty cycle. An evaluation based on a real implementation of the system and publicly available trace data shows that CTRL outperforms state-of-the-art approaches, reducing the frequency of power failures by 40% in energy-scarce environments while achieving more than ten times the task frequency in energy-rich environments.
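
The division of labor described above, in which a learner proposes a target SoC and a controller turns that target into a duty cycle, can be illustrated with a minimal Python sketch. It is not the CTRL implementation: the proportional control law, the gain `kp`, the toy energy model in `simulate`, and every name and constant are assumptions made for illustration, and the target SoC is passed in by hand where an RL policy would supply it.

```python
def duty_cycle_for_target(target_soc: float, measured_soc: float,
                          kp: float = 5.0) -> float:
    """Hypothetical proportional controller: map the SoC error to a duty cycle.

    A positive error (more stored energy than the target) raises the duty
    cycle; a negative error drives it to zero so the storage can recharge.
    """
    error = measured_soc - target_soc
    return min(1.0, max(0.0, kp * error))  # clamp to the valid range [0, 1]


def simulate(target_soc: float, harvest: float, steps: int = 200,
             task_cost: float = 0.05, initial_soc: float = 0.5):
    """Toy energy model (assumed): SoC is normalized to [0, 1].

    Each step adds `harvest` and subtracts `duty * task_cost`.
    Returns the final SoC and the last duty cycle applied.
    """
    soc, duty = initial_soc, 0.0
    for _ in range(steps):
        duty = duty_cycle_for_target(target_soc, soc)
        soc = min(1.0, max(0.0, soc + harvest - duty * task_cost))
    return soc, duty


if __name__ == "__main__":
    # Any target in [0, 1] is a valid action, so no duty-cycle bounds
    # have to be configured by hand in this sketch.
    for target in (0.3, 0.6):
        soc, duty = simulate(target_soc=target, harvest=0.02)
        print(f"target={target:.1f} soc={soc:.3f} duty={duty:.3f}")
```

In this toy model the duty cycle settles where consumption equals the harvested energy, which is the energy-neutral operating point. A purely proportional law leaves a steady-state offset between the measured and target SoC; a real controller would likely need an integral term or a different design to remove it.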
