Algorithms for Dynamic Spectrum Access With Learning for Cognitive Radio

We study the problem of dynamic spectrum sensing and access in cognitive radio systems as a partially observed Markov decision process (POMDP). A group of cognitive users cooperatively tries to exploit vacancies in primary (licensed) channels whose occupancies follow a Markovian evolution. We first consider the scenario where the cognitive users have perfect knowledge of the distribution of the signals they receive from the primary users. For this problem, we obtain a greedy channel selection and access policy that maximizes the instantaneous reward while satisfying a constraint on the probability of interfering with licensed transmissions. We also derive an analytical universal upper bound on the performance of the optimal policy. Through simulation, we show that our scheme performs well relative to the upper bound and improves on an existing scheme. We then consider the more practical scenario where the exact distribution of the signal from the primary users is unknown. We assume a parametric model for the distribution and develop an algorithm that learns the true distribution while still guaranteeing the constraint on the interference probability. We show that this algorithm outperforms the naive design that assumes a worst-case value for the parameter. We also provide a proof of convergence for the learning algorithm.
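The greedy selection idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not give the observation model, the reward, or the exact access rule, so everything below is an assumption made for illustration. Each channel is taken to be a two-state (free/busy) Markov chain, the sensed signal is assumed Gaussian with mean 0 when the channel is free and mean `mu_busy` when it is busy, and the access decision is a simple threshold on the posterior probability of occupancy, used here as a stand-in for the paper's interference constraint. All function and parameter names are hypothetical.

```python
import math


def gaussian_pdf(y, mean, var):
    """Density of N(mean, var) evaluated at y."""
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def predict(belief_free, p_stay_free, p_become_free):
    """One-step Markov prediction of P(channel is free)."""
    return belief_free * p_stay_free + (1.0 - belief_free) * p_become_free


def update(prior_free, y, mu_busy, var):
    """Bayes update of P(channel is free) given observation y.

    Assumed model (not from the paper): y ~ N(0, var) if free,
    y ~ N(mu_busy, var) if busy.
    """
    weighted_free = prior_free * gaussian_pdf(y, 0.0, var)
    weighted_busy = (1.0 - prior_free) * gaussian_pdf(y, mu_busy, var)
    return weighted_free / (weighted_free + weighted_busy)


def greedy_step(beliefs_free, observe, p_stay_free, p_become_free,
                mu_busy, var, access_threshold):
    """One time slot of a greedy (myopic) sense-and-access rule.

    beliefs_free: list of P(free) per channel from the previous slot.
    observe: callable taking a channel index and returning a sensed value.
    access_threshold: hypothetical cap on the posterior P(busy) for access.
    """
    # Propagate every channel's belief through the Markov transition.
    predicted = [predict(b, p_stay_free, p_become_free) for b in beliefs_free]
    # Greedy choice: sense the channel most likely to be free right now.
    k = max(range(len(predicted)), key=lambda i: predicted[i])
    # Refine the belief on the sensed channel only.
    posterior = update(predicted[k], observe(k), mu_busy, var)
    new_beliefs = list(predicted)
    new_beliefs[k] = posterior
    # Stand-in access rule: transmit only if occupancy looks unlikely.
    access = (1.0 - posterior) <= access_threshold
    return k, access, new_beliefs


if __name__ == "__main__":
    k, access, beliefs = greedy_step(
        [0.2, 0.8], lambda ch: 0.0,
        p_stay_free=0.9, p_become_free=0.2,
        mu_busy=1.0, var=1.0, access_threshold=0.2,
    )
    print(k, access, beliefs)
```

The sketch is greedy in the sense that it looks only at the current slot's predicted beliefs and ignores how today's sensing choice affects future information. Thresholding the posterior is a simplification; a constraint on the probability of interfering with licensed transmissions would in general need a rule designed for that probability specifically.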
