Hybrid UCB-HMM: A Machine Learning Strategy for Cognitive Radio in HF Band

The HF band gives its users worldwide coverage, but their transmissions collide with those of other HF users. Techniques based on cognitive radio principles have been proposed to reduce the inefficient use of this band. In this paper, we show the feasibility of the Upper Confidence Bound (UCB) algorithm, a reinforcement learning method, for opportunistic access to the HF band. We evaluate the exploration vs. exploitation dilemma in single-channel and multi-channel UCB algorithms in order to obtain their best performance in the HF environment. Furthermore, we propose a new hybrid system that combines two machine learning techniques: reinforcement learning and learning with Hidden Markov Models (HMM). This system can be understood as a metacognitive engine that automatically adapts its data transmission strategy to the behaviour of the HF environment in order to use spectrum holes efficiently. The proposed hybrid UCB-HMM system lengthens the data transmission slots when conditions are favourable, and it also reduces the signalling that the transmitter must send to the receiver to indicate which channels have been selected for data transmission. This reduction can be as high as 61% with respect to the signalling required by multi-channel UCB.
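To illustrate the exploration vs. exploitation trade-off mentioned above, the following is a minimal sketch of the standard single-channel UCB1 index policy applied to channel selection. It is not the implementation evaluated in the paper: the Bernoulli channel model, the names `ucb1_select`, `run_ucb1` and `free_probs`, and the exploration coefficient `alpha` are assumptions made for illustration only.

```python
import math
import random


def ucb1_select(counts, means, t, alpha=2.0):
    """Return the index of the channel to sense at step t (t >= 1).

    counts[k] is how often channel k has been tried, means[k] is its
    empirical availability. alpha is a hypothetical exploration
    coefficient; alpha = 2 gives the classical UCB1 index.
    """
    # Try every channel once before trusting any estimate.
    for k, n in enumerate(counts):
        if n == 0:
            return k
    # Index = empirical mean (exploitation) + confidence bonus (exploration).
    scores = [
        means[k] + math.sqrt(alpha * math.log(t) / counts[k])
        for k in range(len(counts))
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def run_ucb1(free_probs, horizon, alpha=2.0, seed=0):
    """Simulate UCB1 on channels that are free with probability free_probs[k].

    Assumption: each channel is an independent Bernoulli source, which is
    a simplification and not a model of real HF occupancy.
    """
    rng = random.Random(seed)
    counts = [0] * len(free_probs)
    means = [0.0] * len(free_probs)
    for t in range(1, horizon + 1):
        ch = ucb1_select(counts, means, t, alpha)
        reward = 1.0 if rng.random() < free_probs[ch] else 0.0
        counts[ch] += 1
        # Incremental update of the empirical mean.
        means[ch] += (reward - means[ch]) / counts[ch]
    return counts, means


if __name__ == "__main__":
    counts, means = run_ucb1([0.2, 0.5, 0.9], horizon=2000)
    print("selections per channel:", counts)
    print("estimated availability:", [round(m, 2) for m in means])
```

A larger `alpha` makes the policy sample poorly known channels more often, while a smaller one makes it commit sooner to the channel that currently looks best. A multi-channel variant would select the several highest-scoring channels at each step rather than one.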
