On the Optimality of Myopic Sensing in Multi-State Channels

We consider the channel sensing problem arising in opportunistic scheduling over fading channels, cognitive radio networks, and resource-constrained jamming. The communication system consists of N channels, each modeled as a multi-state Markov chain. At each time instant, a user selects one channel to sense and uses it to transmit information, obtaining a reward that depends on the state of the selected channel. The objective is to design a channel sensing policy that maximizes the expected total reward collected over a finite or infinite horizon. This problem is an instance of the restless bandit problem, which also arises in many other areas of science and technology and for which the form of optimal policies is unknown in general. We identify sets of conditions sufficient to guarantee the optimality of a myopic sensing policy, and we show that under one particular set of conditions the myopic policy coincides with the Gittins index rule.
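To make the setting concrete, the following is a minimal sketch of one decision step of a myopic sensing policy: choose the channel whose current belief gives the largest expected immediate reward, observe it, and propagate every belief one step through the Markov chain. Several details here are illustrative assumptions not stated in the abstract: all channels share one transition matrix `P`, sensing reveals the chosen channel's state exactly, and the names `propagate`, `expected_reward`, `myopic_choice`, and `step` are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Matrix = Sequence[Sequence[float]]


def propagate(belief: Sequence[float], P: Matrix) -> Vector:
    """One-step prediction: new_belief[j] = sum_i belief[i] * P[i][j]."""
    k = len(P)
    return [sum(belief[i] * P[i][j] for i in range(k)) for j in range(k)]


def expected_reward(belief: Sequence[float], rewards: Sequence[float]) -> float:
    """Expected immediate reward of sensing a channel with this belief."""
    return sum(b * r for b, r in zip(belief, rewards))


def myopic_choice(beliefs: Sequence[Sequence[float]],
                  rewards: Sequence[float]) -> int:
    """Index of the channel with the largest expected immediate reward.

    Ties are broken in favor of the lowest index.
    """
    return max(range(len(beliefs)),
               key=lambda n: expected_reward(beliefs[n], rewards))


def step(beliefs: Sequence[Sequence[float]], P: Matrix,
         rewards: Sequence[float],
         true_states: Sequence[int]) -> Tuple[int, float, List[Vector]]:
    """Sense the myopically chosen channel, collect its reward, update beliefs.

    Assumption: sensing is perfect, so the chosen channel's belief collapses
    to the observed state before the one-step prediction. Unsensed channels
    are only predicted forward.
    """
    chosen = myopic_choice(beliefs, rewards)
    observed = true_states[chosen]
    reward = rewards[observed]
    k = len(P)
    updated = []
    for n, b in enumerate(beliefs):
        if n == chosen:
            b = [1.0 if s == observed else 0.0 for s in range(k)]
        updated.append(propagate(b, P))
    return chosen, reward, updated
```

Calling `step` repeatedly, with `true_states` supplied by a simulator of the channels, yields the reward sequence of the myopic policy. The sketch only illustrates the policy itself; it does not check the sufficient conditions under which the paper shows the policy to be optimal.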
