Using Markov decision process in cognitive radio networks towards the optimal reward

Learning is an indispensable phase in the cognition cycle of a cognitive radio network. It relates the executed actions to the estimated rewards. Through this phase, the agent learns from past experience to improve its actions in subsequent interventions. The literature offers several methods for machine learning. Among them is reinforcement learning, which seeks the optimal policy, that is, the policy that ensures the maximum reward. The present work describes an approach based on a reinforcement learning model, namely the Markov decision process, to maximize the sum of the transfer rates of all secondary users. This formulation defines all the notions relevant to an environment with a finite set of states: the agent, the states, the actions allowed in a given state, the reward obtained after executing an action, and the optimal policy. After implementation, we observe a correlation between the initial policy and the optimal policy, and we obtain improved performance compared with a previous work.
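To make the ingredients listed above concrete (states, allowed actions, rewards, optimal policy), the following is a minimal sketch of value iteration on a finite Markov decision process. It is not the authors' implementation: the two-state channel model, the action names `stay` and `switch`, the transition probabilities, the reward values (standing in for a transfer rate) and the discount factor `gamma` are all assumptions chosen purely for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy MDP (not taken from the paper): a secondary user sees its
# current channel as "idle" or "busy" and chooses to "stay" or "switch".
# TRANSITIONS[state][action] = list of (probability, next_state, reward),
# where the reward is an illustrative stand-in for the achieved transfer rate.
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]

TRANSITIONS: Transitions = {
    "idle": {
        "stay":   [(0.9, "idle", 1.0), (0.1, "busy", 0.0)],
        "switch": [(0.5, "idle", 0.5), (0.5, "busy", 0.0)],
    },
    "busy": {
        "stay":   [(0.2, "idle", 0.0), (0.8, "busy", 0.0)],
        "switch": [(0.6, "idle", 0.5), (0.4, "busy", 0.0)],
    },
}


def q_value(transitions: Transitions, values: Dict[str, float],
            state: str, action: str, gamma: float) -> float:
    """Expected reward plus discounted value of the next state."""
    return sum(p * (r + gamma * values[nxt])
               for p, nxt, r in transitions[state][action])


def value_iteration(transitions: Transitions, gamma: float = 0.9,
                    tol: float = 1e-8, max_iter: int = 10_000):
    """Return (state values, greedy policy) for a finite MDP."""
    values = {s: 0.0 for s in transitions}
    for _ in range(max_iter):
        # Bellman optimality backup applied to every state.
        new_values = {
            s: max(q_value(transitions, values, s, a, gamma)
                   for a in transitions[s])
            for s in transitions
        }
        delta = max(abs(new_values[s] - values[s]) for s in transitions)
        values = new_values
        if delta < tol:
            break
    # The optimal policy picks, in each state, the action with the best Q-value.
    policy = {
        s: max(transitions[s],
               key=lambda a: q_value(transitions, values, s, a, gamma))
        for s in transitions
    }
    return values, policy


values, policy = value_iteration(TRANSITIONS)
print(policy)
```

With this particular toy table, the greedy policy is to stay on an idle channel and to switch away from a busy one. A different reward or transition model would of course yield a different policy.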
