Decentralized spectrum learning for radio collision mitigation in ultra-dense IoT networks: LoRaWAN case study and experiments

This paper describes the theoretical principles and experimental results of reinforcement learning algorithms embedded in Internet of Things (IoT) devices to mitigate radio collisions in unlicensed ISM bands. Multi-armed bandit (MAB) learning algorithms are used here to improve both the capacity of the IoT network to support the expected massive number of objects and the energy autonomy of the IoT devices. We first illustrate the efficiency of the proposed approach with a proof of concept based on USRP software radio platforms operating on real radio signals. It shows how collisions with other RF signals are reduced for IoT devices that use MAB learning. We then describe the first implementation of such algorithms on LoRa devices operating in a real LoRaWAN network at 868 MHz. We named this solution IoTligent. IoTligent adds neither processing overhead, so it can run on the IoT devices themselves, nor network overhead, so it requires no change to the LoRaWAN protocol. Experiments conducted in a real LoRa network show that the battery life of IoTligent devices can be extended by a factor of 2 in the scenarios we encountered during our experiments. Finally, we subject IoTligent devices to the highly constrained conditions expected in the future as the number of IoT devices grows, by generating artificial massive IoT radio traffic in an anechoic chamber. We show that IoTligent devices can cope with the spectrum scarcity that will then occur in unlicensed bands.
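
To make the idea of per-device MAB learning concrete, the following is a minimal sketch under stated assumptions, not the IoTligent implementation. The abstract does not say which bandit policy is used; UCB1 is assumed here purely for illustration. The names `UCB1ChannelSelector` and `simulate`, the use of a received acknowledgement as the binary reward, and the per-channel success probabilities are all hypothetical.

```python
import math
import random


class UCB1ChannelSelector:
    """Illustrative UCB1 policy: each arm is a radio channel (assumption)."""

    def __init__(self, n_channels):
        self.counts = [0] * n_channels      # transmissions tried per channel
        self.successes = [0] * n_channels   # acknowledged transmissions per channel
        self.t = 0                          # total number of decisions so far

    def select(self):
        """Return the index of the channel to use for the next transmission."""
        self.t += 1
        # Try every channel once before relying on the index.
        for ch, n in enumerate(self.counts):
            if n == 0:
                return ch

        def index(ch):
            mean = self.successes[ch] / self.counts[ch]
            bonus = math.sqrt(2.0 * math.log(self.t) / self.counts[ch])
            return mean + bonus

        return max(range(len(self.counts)), key=index)

    def update(self, ch, ack_received):
        """Record the outcome of a transmission on channel ch."""
        self.counts[ch] += 1
        if ack_received:
            self.successes[ch] += 1


def simulate(ack_probs, n_rounds, seed=0):
    """Simulate one device; ack_probs are hypothetical per-channel success rates."""
    rng = random.Random(seed)
    selector = UCB1ChannelSelector(len(ack_probs))
    for _ in range(n_rounds):
        ch = selector.select()
        selector.update(ch, rng.random() < ack_probs[ch])
    return selector.counts


if __name__ == "__main__":
    # Channel 1 is assumed to be the least congested one in this toy setup.
    print(simulate([0.2, 0.9, 0.4], n_rounds=2000))
```

In this sketch the device keeps only two counters per channel and computes one index per channel at each transmission, and it learns only from its own transmission outcomes without exchanging any extra messages with the network. That is in line with the abstract's claim of no added network overhead and little processing overhead.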
