Reinforcement learning in MIMO wireless networks with energy harvesting

Energy harvesting wireless nodes can provide a much longer lifetime and higher energy efficiency for wireless networks than battery-operated systems. In this paper, we study a MIMO wireless communication link in which the nodes are equipped with energy harvesters and rechargeable batteries that charge continuously from a renewable energy source. Since the harvested energy arrivals, and thus the future remaining energy of the nodes, are not deterministic in practice, we propose a learning approach to find the transmission policy for data communication that maximizes throughput. The problem is formulated as a Markov Decision Process (MDP) with unknown transition probabilities. A Q-Learning approach is proposed to solve the MDP model and find the optimal transmission policy.
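
To make the idea of solving an MDP with unknown transition probabilities concrete, the following is a minimal sketch of tabular Q-Learning on a toy energy harvesting transmitter. It is not the model used in this paper: the single-antenna battery state, the uniform harvesting process, the log2(1 + energy) reward used as a throughput proxy, and all names and parameter values (q_learning, greedy_policy, battery_cap, max_harvest, alpha, gamma, epsilon) are assumptions chosen only for illustration.

```python
import math
import random


def q_learning(battery_cap=5, max_harvest=2, episodes=2000, steps=50,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-Learning on a toy energy harvesting transmitter.

    State  : battery level in {0, ..., battery_cap} (assumed discretization).
    Action : energy units spent on transmission in the current slot.
    """
    rng = random.Random(seed)
    # q[b][a] estimates the discounted return of spending a units at level b.
    # Entries with a > b are infeasible and are never read or updated.
    q = [[0.0] * (battery_cap + 1) for _ in range(battery_cap + 1)]

    for _ in range(episodes):
        b = rng.randint(0, battery_cap)  # random initial battery level
        for _ in range(steps):
            feasible = list(range(b + 1))  # cannot spend more than is stored
            if rng.random() < epsilon:
                a = rng.choice(feasible)                   # explore
            else:
                a = max(feasible, key=lambda u: q[b][u])   # exploit

            # Assumed throughput proxy: concave in the energy spent.
            reward = math.log2(1.0 + a)

            # Harvested energy is random; the learner never sees its
            # distribution, only the resulting next state.
            harvest = rng.randint(0, max_harvest)
            next_b = min(battery_cap, b - a + harvest)

            # Standard Q-Learning update toward the bootstrapped target.
            best_next = max(q[next_b][u] for u in range(next_b + 1))
            q[b][a] += alpha * (reward + gamma * best_next - q[b][a])
            b = next_b
    return q


def greedy_policy(q):
    """Return the greedy feasible action for each battery level."""
    return [max(range(b + 1), key=lambda u: q[b][u]) for b in range(len(q))]


if __name__ == "__main__":
    table = q_learning()
    print(greedy_policy(table))
```

The learner updates its estimates only from observed transitions, which is why no model of the energy arrivals is required. A MIMO version would need a richer state (for example, channel conditions) and action space (for example, power allocation across antennas); those extensions are not sketched here.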