Paper information - Distributed algorithm under cooperative or competitive priority users in cognitive networks


Opportunistic spectrum access (OSA) in cognitive radio (CR) networks allows a secondary (unlicensed) user (SU) to access a vacant channel allocated to a primary (licensed) user (PU). By identifying the best channel, i.e., the channel with the highest availability probability, an SU can increase its transmission time and rate. To maximize the transmission opportunities of an SU, various learning algorithms have been suggested, including Thompson sampling (TS), upper confidence bound (UCB), and ε-greedy. In our study, we propose a modified version of UCB called AUCB (Arctan-UCB) that achieves a logarithmic regret similar to that of TS or UCB while further reducing the total regret, defined as the reward loss resulting from the selection of non-optimal channels. To evaluate AUCB’s performance in the multi-user case, we propose a novel uncooperative policy for priority access in which the k-th user should access the k-th best channel. This manuscript theoretically establishes an upper bound on the sum regret of AUCB in both the single-user and multi-user cases. The users may thus converge to their dedicated channels after a finite number of time slots. The manuscript also considers Quality of Service AUCB (QoS-AUCB) under the proposed priority access policy. Our simulations corroborate AUCB’s performance compared to TS and UCB.
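To make the notions of channel selection and regret concrete, here is a minimal sketch of the standard UCB1 index policy for a single SU. It is not AUCB: the abstract does not state the arctan-based index, so it is not reproduced here. The function name, the Bernoulli vacancy model, and the availability probabilities are assumptions chosen purely for illustration.

```python
import math
import random


def ucb1_channel_selection(availability, horizon, seed=0):
    """Run standard UCB1 on Bernoulli channels (illustrative, not AUCB).

    availability: hypothetical vacancy probability of each channel.
    Returns (selection counts per channel, cumulative expected regret).
    """
    rng = random.Random(seed)
    n_channels = len(availability)
    counts = [0] * n_channels   # number of times each channel was selected
    means = [0.0] * n_channels  # empirical availability of each channel
    best = max(availability)
    regret = 0.0

    for t in range(1, horizon + 1):
        if t <= n_channels:
            # Initialization: select every channel once.
            channel = t - 1
        else:
            # UCB1 index: empirical mean plus an exploration bonus.
            channel = max(
                range(n_channels),
                key=lambda i: means[i] + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        # Reward is 1 if the selected channel is vacant in this slot, else 0.
        reward = 1.0 if rng.random() < availability[channel] else 0.0
        counts[channel] += 1
        means[channel] += (reward - means[channel]) / counts[channel]
        # Expected reward lost by not selecting the best channel.
        regret += best - availability[channel]
    return counts, regret


if __name__ == "__main__":
    probs = [0.9, 0.7, 0.5, 0.3]  # assumed vacancy probabilities
    counts, regret = ucb1_channel_selection(probs, horizon=5000)
    print("selections per channel:", counts)
    print("cumulative regret:", round(regret, 2))
```

Under these assumptions the user concentrates its selections on the most available channel, and the cumulative regret grows roughly logarithmically with the number of time slots, which is the baseline behavior that AUCB is reported to improve on.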
