Reinforcement Learning Based Adaptive Resource Allocation for Wireless Powered Communication Systems

Wireless powered communication (WPC) is a promising technique for future energy-constrained wireless networks. In this letter, we consider a WPC system composed of a hybrid access point and an energy harvesting node (EHN). For this system, we propose a reinforcement learning based adaptive resource allocation scheme that dynamically assigns channel resources to minimize the outage probability of information transfer while satisfying the average power constraint at the EHN; this task is formulated as a constrained Markov decision process (MDP). To solve this challenging problem, we first transform the constrained MDP into an equivalent unconstrained MDP with a multi-objective reward. We then propose a novel Q-learning algorithm to find the resource allocation policy. Numerical results demonstrate the effectiveness and superior performance of the proposed scheme.
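The approach outlined above can be illustrated with a minimal sketch: a constrained MDP is turned into an unconstrained one by folding the average-power constraint into the reward with a Lagrange-style weight, and tabular Q-learning is run on the result. Everything here (state space, actions, dynamics, the weight `LAMBDA`) is a hypothetical toy stand-in, not the scheme from the letter.

```python
import random

# Illustrative sketch (assumptions, not the paper's model): quantized
# battery/channel states, transmit-power-level actions, and a toy
# environment where higher power lowers outage but costs more energy.
N_STATES = 10          # hypothetical quantized system states
ACTIONS = [0, 1, 2]    # hypothetical transmit-power levels
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1
LAMBDA = 0.5           # Lagrange-style weight on the power constraint

Q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]

def step(state, action):
    """Toy dynamics: outage probability falls as transmit power rises."""
    outage = 1 if random.random() < 1.0 / (1 + action) else 0
    power = float(action)
    # Multi-objective reward: penalize outage and (weighted) power use,
    # mirroring the constrained-to-unconstrained MDP transformation.
    reward = -outage - LAMBDA * power
    next_state = random.randrange(N_STATES)
    return next_state, reward

def choose_action(state):
    if random.random() < EPS:                       # explore
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[state][a])  # exploit

state = 0
for _ in range(5000):
    a = choose_action(state)
    nxt, r = step(state, a)
    # Standard Q-learning temporal-difference update.
    Q[state][a] += ALPHA * (r + GAMMA * max(Q[nxt]) - Q[state][a])
    state = nxt
```

In practice the multiplier itself would also be adapted so that the learned policy meets the average power constraint, rather than being fixed as in this sketch.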
