Harvest-or-Transmit Policy for Cognitive Radio Networks: A Learning Theoretic Approach

We consider an underlay cognitive radio network in which the secondary user (SU) harvests energy from the environment. The SU operates in slotted mode, and each slot is used for either energy harvesting or data transmission. Assuming block fading with memory, we model the energy arrival and fading processes as first-order stationary Markov processes. We propose a harvest-or-transmit policy for the SU, along with optimal transmit powers, that maximize its expected throughput under three different settings. First, we take a learning-theoretic approach that assumes no a priori knowledge of the underlying Markov processes; in this case, we obtain an online policy using Q-learning. Second, we assume that full statistical knowledge of the governing Markov processes is available a priori; under this assumption, we obtain an optimal online policy using infinite-horizon stochastic dynamic programming. Third, we obtain an optimal offline policy using the generalized Benders decomposition algorithm. The offline policy assumes that, for a given time deadline, the energy arrivals and channel states are known in advance at all the transmitters. Finally, we compare the three policies and study the effects of various system parameters on system performance.
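To make the learning-theoretic setting concrete, the following is a minimal sketch of tabular Q-learning for a harvest-or-transmit decision. It is not the paper's formulation: the toy model (a small integer battery, two-state Markov chains for energy arrival and channel gain, a log-rate reward), the parameter values, and the function name `q_learning` are all assumptions made for illustration, and the interference constraint that an underlay SU must respect toward the primary user is omitted.

```python
import math
import random


def q_learning(num_slots=20000, battery_cap=3, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical harvest-or-transmit toy model.

    State  : (battery level, energy-arrival state, channel state).
    Action : 0 = harvest in this slot; p >= 1 = transmit using p energy units.
    """
    rng = random.Random(seed)
    gains = [0.5, 2.0]   # assumed channel power gains (bad, good)
    stay_prob = 0.8      # assumed probability that a Markov chain keeps its state

    def step_chain(s):
        # First-order two-state Markov chain: stay with probability stay_prob.
        return s if rng.random() < stay_prob else 1 - s

    q = {}        # Q-values keyed by (state, action); missing entries count as 0
    visits = {}   # visit counts, used for a decaying learning rate
    state = (0, 0, 0)

    for _ in range(num_slots):
        battery, energy, channel = state
        feasible = list(range(battery + 1))  # cannot spend more than is stored

        # Epsilon-greedy action selection; the learner never sees stay_prob.
        if rng.random() < epsilon:
            action = rng.choice(feasible)
        else:
            action = max(feasible, key=lambda a: q.get((state, a), 0.0))

        if action == 0:
            # Harvest slot: store the arriving energy (0 or 1 unit), no throughput.
            reward = 0.0
            next_battery = min(battery_cap, battery + energy)
        else:
            # Transmit slot: spend `action` units, earn a log-rate reward.
            reward = math.log2(1.0 + gains[channel] * action)
            next_battery = battery - action

        next_state = (next_battery, step_chain(energy), step_chain(channel))
        best_next = max(q.get((next_state, a), 0.0) for a in range(next_battery + 1))

        # Standard Q-learning update with learning rate 1 / (visit count).
        key = (state, action)
        visits[key] = visits.get(key, 0) + 1
        alpha = 1.0 / visits[key]
        old = q.get(key, 0.0)
        q[key] = old + alpha * (reward + gamma * best_next - old)
        state = next_state

    return q
```

Calling `q_learning()` returns the learned Q-table; a greedy policy can then be read off by picking, for each visited state, the feasible action with the largest Q-value. The discount factor and the 1/n learning rate are illustrative choices and would need tuning for any realistic model.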
