Deep Deterministic Policy Gradient (DDPG)-Based Energy Harvesting Wireless Communications

To overcome the difficulty of recharging wireless sensors deployed in the wild through a conventional energy supply, researchers have increasingly turned to sensor networks powered by renewable energy sources. Because renewable generation is inherently uncertain, an effective energy management strategy is essential for such sensors. In this paper, we propose a novel energy management algorithm based on reinforcement learning. By employing the deep deterministic policy gradient (DDPG), the proposed algorithm operates over continuous state spaces and realizes continuous energy management. We also propose a state normalization algorithm that helps the neural networks initialize and learn. Trained with only one day of real solar data together with simulated channel data, the proposed algorithm performs well when validated on roughly 800 days of real solar data. Compared with state-of-the-art algorithms, it achieves better performance in terms of long-term average net bit rate.
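As a concrete illustration of the two ingredients named above, the sketch below pairs a running state normalizer with a minimal DDPG actor-critic in PyTorch. This is a hedged reconstruction under stated assumptions, not the authors' implementation: the state layout (battery level, harvested energy, channel gain), the network sizes, the sigmoid-bounded transmit-power action, and all hyperparameters are illustrative choices.

```python
import torch
import torch.nn as nn

class StateNormalizer:
    """Running mean/variance normalization of the raw state vector
    (assumed here: battery level, harvested energy, channel gain),
    so each feature enters the networks near zero mean, unit variance."""
    def __init__(self, dim, eps=1e-6):
        self.mean = torch.zeros(dim)
        self.var = torch.ones(dim)
        self.count = eps

    def update(self, batch):  # batch: (N, dim)
        b_mean = batch.mean(dim=0)
        b_var = batch.var(dim=0, unbiased=False)
        n = batch.shape[0]
        delta = b_mean - self.mean
        total = self.count + n
        # Parallel (Chan et al.) update of the running mean and variance.
        self.mean = self.mean + delta * n / total
        self.var = (self.var * self.count + b_var * n
                    + delta.pow(2) * self.count * n / total) / total
        self.count = total

    def __call__(self, state):
        return (state - self.mean) / (self.var.sqrt() + 1e-6)

class Actor(nn.Module):
    """Deterministic policy: normalized state -> action in (0, 1),
    read here as the fraction of stored energy spent on transmission."""
    def __init__(self, s_dim, a_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(s_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, a_dim), nn.Sigmoid())

    def forward(self, s):
        return self.net(s)

class Critic(nn.Module):
    """Q(s, a) estimator over the continuous state-action pair."""
    def __init__(self, s_dim, a_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(s_dim + a_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

def soft_update(target, source, tau=0.005):
    """Polyak averaging of the target networks, as in standard DDPG."""
    with torch.no_grad():
        for t, s in zip(target.parameters(), source.parameters()):
            t.mul_(1.0 - tau).add_(tau * s)

def ddpg_step(actor, critic, actor_t, critic_t, opt_a, opt_c,
              s, a, r, s_next, gamma=0.99):
    """One DDPG update on a replay minibatch (s, a, r, s_next)."""
    with torch.no_grad():
        y = r + gamma * critic_t(s_next, actor_t(s_next))
    critic_loss = nn.functional.mse_loss(critic(s, a), y)
    opt_c.zero_grad(); critic_loss.backward(); opt_c.step()

    actor_loss = -critic(s, actor(s)).mean()
    opt_a.zero_grad(); actor_loss.backward(); opt_a.step()

    soft_update(actor_t, actor); soft_update(critic_t, critic)
```

Feeding every state through `StateNormalizer` before both networks reflects the stated motivation for state normalization: raw inputs on very different scales (energy in joules versus dimensionless channel gains) would otherwise make network initialization and learning markedly harder.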
