Optimizing media access strategy for competing cognitive radio networks

This paper describes an adaptation of cognitive radio technology for tactical wireless networking. We introduce the Competing Cognitive Radio Network (CCRN), which features both communicator and jammer cognitive radio nodes that strategize their actions on an open spectrum in the presence of adversarial threats. We formulate the problem in the multi-armed bandit (MAB) framework and develop the optimal media access strategy, consisting of mixed communicator and jammer actions, in a Bayesian setting for Thompson sampling based on extreme value theory. Empirical results are promising: the proposed strategy appears to outperform Lai & Robbins and UCB, some of the most important MAB algorithms known to date.
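As a rough illustration of the MAB framing, the following is a minimal sketch of plain Beta-Bernoulli Thompson sampling. It is not the paper's extreme-value-based strategy and it does not model jammer actions. The function name `thompson_sampling`, the reading of each arm as a channel with a fixed success probability, and the example probabilities are all assumptions made for illustration.

```python
import random


def thompson_sampling(success_probs, rounds, seed=0):
    """Beta-Bernoulli Thompson sampling over hypothetical channels.

    success_probs: assumed per-channel probability that a transmission
    succeeds (unknown to the learner, used only to simulate rewards).
    """
    rng = random.Random(seed)
    k = len(success_probs)
    alpha = [1] * k  # Beta posterior parameter: 1 + observed successes
    beta = [1] * k   # Beta posterior parameter: 1 + observed failures
    pulls = [0] * k
    total_reward = 0

    for _ in range(rounds):
        # Draw one sample from each arm's posterior and play the largest.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(k)]
        arm = max(range(k), key=samples.__getitem__)

        # Simulate a Bernoulli reward for the chosen arm.
        reward = 1 if rng.random() < success_probs[arm] else 0

        # Conjugate update of the chosen arm's Beta posterior.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
        total_reward += reward

    return pulls, total_reward


# Three hypothetical channels; the third is the best one.
pulls, total_reward = thompson_sampling([0.2, 0.5, 0.8], rounds=2000)
print(pulls, total_reward)
```

In this sketch the posterior sampling step handles exploration on its own: arms with few observations have wide posteriors and are still tried occasionally, while play concentrates on the arm with the highest estimated success rate.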

[1]  Shipra Agrawal, et al.  Analysis of Thompson Sampling for the Multi-armed Bandit Problem , 2011, COLT.

[2]  J. Gittins Bandit processes and dynamic allocation indices , 1979 .

[3]  J. Walrand,et al.  Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part II: Markovian rewards , 1987 .

[4]  B. Gnedenko Sur La Distribution Limite Du Terme Maximum D'Une Serie Aleatoire , 1943 .

[5]  T. L. Lai and Herbert Robbins  Asymptotically Efficient Adaptive Allocation Rules , 1985 .

[6]  A. F. Smith,et al.  Conjugate likelihood distributions , 1993 .

[7]  H. Robbins Some aspects of the sequential design of experiments , 1952 .

[8]  Peter Auer,et al.  Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.

[9]  Ronald L. Rivest,et al.  Simulation results for a new two-armed bandit heuristic , 1994, Annual Conference Computational Learning Theory.

[10]  W. R. Thompson  On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples , 1933 .

[11]  Thomas M. Cover,et al.  Elements of Information Theory , 2005 .

[12]  R. Bellman  A Problem in the Sequential Design of Experiments , 1954 .

[13]  R. Fisher,et al.  Limiting forms of the frequency distribution of the largest or smallest member of a sample , 1928, Mathematical Proceedings of the Cambridge Philosophical Society.

[14]  P. Whittle Restless Bandits: Activity Allocation in a Changing World , 1988 .

[15]  Brian M. Sadler,et al.  A Survey of Dynamic Spectrum Access , 2007, IEEE Signal Processing Magazine.

[16]  L. Haan,et al.  Extreme value theory : an introduction , 2006 .