Contribution to learning and decision making under uncertainty for Cognitive Radio.

During the last century, most of the meaningful frequency bands were licensed to emerging wireless applications. Because of the static model of frequency allocation, the growing number of spectrum-demanding services led to spectrum scarcity. Recently, however, a series of spectrum utilization measurements showed that the different frequency bands were underutilized (sometimes even unoccupied), and thus that the scarcity of the spectrum resource is virtual and due only to the static allocation of the different bands to specific wireless services. Moreover, the underutilization of the spectrum resource varies on different scales in time and space, offering many opportunities for an unlicensed user or network to access the spectrum. Cognitive Radio (CR) and Opportunistic Spectrum Access (OSA) were introduced as possible solutions to alleviate the spectrum scarcity issue.
In this dissertation, we aim at enabling CR equipment to autonomously exploit communication opportunities found in its vicinity. For that purpose, we suggest decision making mechanisms designed and/or adapted to answer CR related problems in general and, more specifically, OSA related scenarios. We argue that OSA scenarios can be modeled as Multi-Armed Bandit (MAB) problems, for three reasons. Within OSA contexts, CR equipment is assumed to have no prior knowledge of its environment. Acquiring the necessary information relies on a sequential interaction between the CR equipment and its environment. Finally, the CR equipment is modeled as a cognitive agent whose purpose is to learn while providing an improving service to its user. First, we analyze the performance of the UCB1 algorithm when dealing with OSA problems with imperfect sensing. More specifically, we show that UCB1 can efficiently cope with sensing errors: we prove its convergence to the optimal channel and quantify its loss of performance compared to the case with perfect sensing.
Second, we combine the UCB1 algorithm with collaboration and coordination mechanisms to model a secondary network (i.e., several secondary users, or SUs). We show that within this complex scenario, a coordinated learning mechanism can lead to efficient secondary networks. These scenarios assume that an SU can efficiently detect incumbent users' activity while having no prior knowledge of their characteristics. Energy detection is usually suggested as a possible approach to handle this task. Unfortunately, energy detection is known to perform poorly when dealing with uncertainty. Consequently, in this Ph.D. work we revisit the problem of energy detection limits under uncertainty. We present new results on its performance and its limits when the noise level is uncertain and the uncertainty is modeled by a log-normal distribution (as suggested by Alexander Sonnenschein and Philip M. Fishman in 1992). Finally, within OSA contexts, we address a problem where a sensor aims at quantifying the quality of a channel in fading environments. In such contexts, UCB1 algorithms seem to fail. Consequently, we design a new algorithm called Multiplicative UCB (MUCB) and prove its convergence. Moreover, we prove that MUCB algorithms are order optimal (i.e., the order of their learning rate is optimal). This last work provides a contribution that goes beyond CR and OSA: MUCB algorithms are introduced and analyzed within a general MAB framework.
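To make the MAB view of OSA concrete, the following is a minimal illustrative sketch, not the dissertation's implementation. It assumes each channel is free independently with a fixed probability, that the only sensing error is a false alarm (a free channel declared busy with probability `p_false_alarm`), and that the reward is 1 when the channel is sensed free. The names `ucb1_index`, `run_ucb1`, `free_probs` and `p_false_alarm` are hypothetical. The index is the standard UCB1 form, empirical mean plus sqrt(alpha ln t / n), with alpha = 2.

```python
import math
import random


def ucb1_index(mean, pulls, t, alpha=2.0):
    """Standard UCB1 index: empirical mean plus an exploration bonus."""
    return mean + math.sqrt(alpha * math.log(t) / pulls)


def run_ucb1(free_probs, horizon, p_false_alarm=0.1, seed=0):
    """Simulate UCB1 channel selection under a simplified sensing-error model.

    free_probs[i] is the (assumed) probability that channel i is free.
    Returns the number of times each channel was sensed and its empirical mean.
    """
    rng = random.Random(seed)
    k = len(free_probs)
    pulls = [0] * k
    means = [0.0] * k
    for t in range(1, horizon + 1):
        if t <= k:
            # Initialization: sense every channel once.
            ch = t - 1
        else:
            # Pick the channel with the largest UCB1 index.
            ch = max(range(k), key=lambda i: ucb1_index(means[i], pulls[i], t))
        is_free = rng.random() < free_probs[ch]
        # Assumed error model: a free channel is missed with prob. p_false_alarm.
        sensed_free = is_free and rng.random() >= p_false_alarm
        reward = 1.0 if sensed_free else 0.0
        pulls[ch] += 1
        # Incremental update of the empirical mean reward.
        means[ch] += (reward - means[ch]) / pulls[ch]
    return pulls, means


if __name__ == "__main__":
    pulls, means = run_ucb1([0.2, 0.5, 0.9], horizon=5000)
    print("pulls per channel:", pulls)
    print("empirical means:", [round(m, 3) for m in means])
```

Under this simplified model, false alarms scale every channel's expected reward by the same factor (1 - `p_false_alarm`), so the ranking of channels is unchanged while the gaps between them shrink. That gives some intuition for why an index policy can still identify the best channel, at the cost of slower learning.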
