RSS-Based Q-Learning for Indoor UAV Navigation

In this paper, we focus on the potential use of unmanned aerial vehicles (UAVs) for search and rescue (SAR) missions in GPS-denied indoor environments. We consider the problem of navigating a UAV to a wireless signal source, e.g., a smartphone or watch carried by a victim. The source is assumed to periodically transmit RF signals to nearby wireless access points. The received signal strength (RSS) at the UAV, which is a function of the UAV and source positions, is fed to a Q-learning algorithm that navigates the UAV to the vicinity of the source. Unlike the traditional location-based Q-learning approach, which uses the GPS coordinates of the agent, our method uses the RSS to define the states and rewards of the algorithm and requires no a priori information about the environment. This, in turn, makes it possible to use UAVs in indoor SAR operations. Two indoor scenarios of different dimensions are created using ray tracing software, and the corresponding heat maps showing the RSS at each possible UAV location are extracted for a more realistic analysis. The performance of the RSS-based Q-learning algorithm is compared with that of the baseline (location-based) Q-learning algorithm in terms of convergence speed, average number of steps per episode, and total length of the final trajectory. Our results show that RSS-based Q-learning provides competitive performance with location-based Q-learning.
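The core idea of defining Q-learning states and rewards from RSS measurements alone, with no agent coordinates, can be illustrated with a minimal sketch. The environment, state definition, and constants below are illustrative assumptions, not the paper's setup: a 1-D corridor with a simple log-distance path-loss model stands in for the ray-traced RSS heat maps, and the state is built from whether the last move improved the RSS together with the last action taken, while the reward is shaped by the RSS change.

```python
import math
import random

random.seed(0)

# --- Illustrative environment (assumption): 1-D corridor with a simple
# log-distance path-loss model. The paper uses ray-traced indoor RSS heat
# maps; this stand-in keeps the sketch self-contained.
SOURCE = 15          # source cell index (hypothetical)
N_CELLS = 20

def rss(pos):
    """RSS in dB under a log-distance model: 0 dB at the source, falling off."""
    d = abs(pos - SOURCE) + 1        # +1 avoids log(0) at the source
    return -20.0 * math.log10(d)

ACTIONS = (-1, +1)                   # move left / move right

def step(pos, action):
    return min(max(pos + action, 0), N_CELLS - 1)

def train(episodes=300, alpha=0.3, gamma=0.9, eps=0.2, max_steps=100):
    """Tabular Q-learning where the state is measurement-only:
    (did the last move improve RSS?, last action). No coordinates are used."""
    Q = {}
    for _ in range(episodes):
        pos = random.randrange(N_CELLS)
        state = (True, random.choice(ACTIONS))
        for _ in range(max_steps):
            if random.random() < eps:            # epsilon-greedy exploration
                a = random.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: Q.get((state, x), 0.0))
            old_rss = rss(pos)
            pos = step(pos, a)
            new_rss = rss(pos)
            reward = new_rss - old_rss           # reward shaped by RSS change
            done = abs(pos - SOURCE) <= 1        # reached vicinity of the source
            if done:
                reward += 10.0                   # terminal bonus (assumption)
            nxt = (new_rss > old_rss, a)
            best_next = max(Q.get((nxt, x), 0.0) for x in ACTIONS)
            q = Q.get((state, a), 0.0)
            Q[(state, a)] = q + alpha * (reward + gamma * (0.0 if done else best_next) - q)
            state = nxt
            if done:
                break
    return Q

def fly(Q, start, max_steps=60):
    """Follow the greedy policy from `start`; return the visited cells."""
    pos, state, traj = start, (True, ACTIONS[0]), [start]
    for _ in range(max_steps):
        a = max(ACTIONS, key=lambda x: Q.get((state, x), 0.0))
        old_rss = rss(pos)
        pos = step(pos, a)
        state = (rss(pos) > old_rss, a)
        traj.append(pos)
        if abs(pos - SOURCE) <= 1:
            break
    return traj
```

In this toy model the measurement-only state is sufficient: an improving RSS means the last action points toward the source, so the learned policy is "repeat on improvement, reverse on degradation," and the UAV reaches the vicinity of the source from either end of the corridor without ever knowing its own position.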
