RSS-Based Q-Learning for Indoor UAV Navigation

In this paper, we focus on the potential use of unmanned aerial vehicles (UAVs) for search and rescue (SAR) missions in GPS-denied indoor environments. We consider the problem of navigating a UAV to a wireless signal source, e.g., a smartphone or watch carried by a victim. The source is assumed to periodically transmit RF signals to nearby wireless access points. The received signal strength (RSS) at the UAV, which is a function of the UAV and source positions, is fed to a Q-learning algorithm that navigates the UAV to the vicinity of the source. Unlike the traditional location-based Q-learning approach, which uses the GPS coordinates of the agent, our method uses the RSS to define the states and rewards of the algorithm and requires no a priori information about the environment. This, in turn, makes it possible to use UAVs in indoor SAR operations. Two indoor scenarios of different dimensions are created using ray tracing software, and the corresponding heat maps showing the RSS at each possible UAV location are extracted for a more realistic analysis. The performance of the RSS-based Q-learning algorithm is compared with that of the baseline (location-based) Q-learning algorithm in terms of convergence speed, average number of steps per episode, and total length of the final trajectory. Our results show that RSS-based Q-learning provides competitive performance with location-based Q-learning.
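The core idea of defining Q-learning states and rewards from RSS measurements alone, with no agent coordinates, can be illustrated with a minimal sketch. The environment, state definition, and constants below are illustrative assumptions, not the paper's setup: a 1-D corridor with a simple log-distance path-loss model stands in for the ray-traced RSS heat maps, and the state is built from whether the last move improved the RSS together with the last action taken, while the reward is shaped by the RSS change.

```python
import math
import random

random.seed(0)

# --- Illustrative environment (assumption): 1-D corridor with a simple
# log-distance path-loss model. The paper uses ray-traced indoor RSS heat
# maps; this stand-in keeps the sketch self-contained.
SOURCE = 15          # source cell index (hypothetical)
N_CELLS = 20

def rss(pos):
    """RSS in dB under a log-distance model: 0 dB at the source, falling off."""
    d = abs(pos - SOURCE) + 1        # +1 avoids log(0) at the source
    return -20.0 * math.log10(d)

ACTIONS = (-1, +1)                   # move left / move right

def step(pos, action):
    return min(max(pos + action, 0), N_CELLS - 1)

def train(episodes=300, alpha=0.3, gamma=0.9, eps=0.2, max_steps=100):
    """Tabular Q-learning where the state is measurement-only:
    (did the last move improve RSS?, last action). No coordinates are used."""
    Q = {}
    for _ in range(episodes):
        pos = random.randrange(N_CELLS)
        state = (True, random.choice(ACTIONS))
        for _ in range(max_steps):
            if random.random() < eps:            # epsilon-greedy exploration
                a = random.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: Q.get((state, x), 0.0))
            old_rss = rss(pos)
            pos = step(pos, a)
            new_rss = rss(pos)
            reward = new_rss - old_rss           # reward shaped by RSS change
            done = abs(pos - SOURCE) <= 1        # reached vicinity of the source
            if done:
                reward += 10.0                   # terminal bonus (assumption)
            nxt = (new_rss > old_rss, a)
            best_next = max(Q.get((nxt, x), 0.0) for x in ACTIONS)
            q = Q.get((state, a), 0.0)
            Q[(state, a)] = q + alpha * (reward + gamma * (0.0 if done else best_next) - q)
            state = nxt
            if done:
                break
    return Q

def fly(Q, start, max_steps=60):
    """Follow the greedy policy from `start`; return the visited cells."""
    pos, state, traj = start, (True, ACTIONS[0]), [start]
    for _ in range(max_steps):
        a = max(ACTIONS, key=lambda x: Q.get((state, x), 0.0))
        old_rss = rss(pos)
        pos = step(pos, a)
        state = (rss(pos) > old_rss, a)
        traj.append(pos)
        if abs(pos - SOURCE) <= 1:
            break
    return traj
```

In this toy model the measurement-only state is sufficient: an improving RSS means the last action points toward the source, so the learned policy is "repeat on improvement, reverse on degradation," and the UAV reaches the vicinity of the source from either end of the corridor without ever knowing its own position.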
