Vision-Based Mobile Robotics Obstacle Avoidance With Deep Reinforcement Learning

Obstacle avoidance is a fundamental and challenging problem for autonomous navigation of mobile robots. In this paper, we consider the problem of obstacle avoidance in simple 3D environments where the robot has to solely rely on a single monocular camera. In particular, we are interested in solving this problem without relying on localization, mapping, or planning techniques. Most of the existing work consider obstacle avoidance as two separate problems, namely obstacle detection, and control. Inspired by the recent advantages of deep reinforcement learning in Atari games and understanding highly complex situations in Go, we tackle the obstacle avoidance problem as a data-driven end-to-end deep learning approach. Our approach takes raw images as input and generates control commands as output. We show that discrete action spaces are outperforming continuous control commands in terms of expected average reward in maze-like environments. Furthermore, we show how to accelerate the learning and increase the robustness of the policy by incorporating predicted depth maps by a generative adversarial network.

[1]  Prashant Doshi,et al.  A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress , 2018, Artif. Intell..

[2]  Sergey Levine,et al.  Learning Robust Rewards with Adversarial Inverse Reinforcement Learning , 2017, ICLR 2017.

[3]  Tom Schaul,et al.  Rainbow: Combining Improvements in Deep Reinforcement Learning , 2017, AAAI.

[4]  Alec Radford,et al.  Proximal Policy Optimization Algorithms , 2017, ArXiv.

[5]  Ming Liu,et al.  Virtual-to-real deep reinforcement learning: Continuous control of mobile robots for mapless navigation , 2017, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[6]  Ming Liu,et al.  Deep-learning in Mobile Robotics - from Perception to Control Systems: A Survey on Why and Why not , 2016, ArXiv.

[7]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Razvan Pascanu,et al.  Learning to Navigate in Complex Environments , 2016, ICLR.

[9]  Jay M. Wong,et al.  Towards Lifelong Self-Supervision: A Deep Learning Direction for Robotics , 2016, ArXiv.

[10]  Ming Liu,et al.  Towards Cognitive Exploration through Deep Reinforcement Learning for Mobile Robots , 2016, ArXiv.

[11]  Alejandro Hernández Cordero,et al.  Extending the OpenAI Gym for robotics: a toolkit for reinforcement learning using ROS and Gazebo , 2016, ArXiv.

[12]  Daniel Cremers,et al.  Direct Sparse Odometry , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  Stefano Ermon,et al.  Generative Adversarial Imitation Learning , 2016, NIPS.

[14]  J. Schulman,et al.  OpenAI Gym , 2016, ArXiv.

[15]  Demis Hassabis,et al.  Mastering the game of Go with deep neural networks and tree search , 2016, Nature.

[16]  Tom Schaul,et al.  Dueling Network Architectures for Deep Reinforcement Learning , 2015, ICML.

[17]  Tom Schaul,et al.  Prioritized Experience Replay , 2015, ICLR.

[18]  David Silver,et al.  Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.

[19]  Marc G. Bellemare,et al.  Human-level control through deep reinforcement learning , 2015, Nature.

[20]  J. M. M. Montiel,et al.  ORB-SLAM: A Versatile and Accurate Monocular SLAM System , 2015, IEEE Transactions on Robotics.

[21]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[22]  Daniel Cremers,et al.  LSD-SLAM: Large-Scale Direct Monocular SLAM , 2014, ECCV.

[23]  Jürgen Schmidhuber,et al.  Evolving large-scale neural networks for vision-based reinforcement learning , 2013, GECCO '13.

[24]  Andrew Howard,et al.  Design and use paradigms for Gazebo, an open-source multi-robot simulator , 2004, 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE Cat. No.04CH37566).

[25]  Stan Birchfield,et al.  Real-time obstacle detection and avoidance in the presence of specular surfaces using an active 3D sensor , 2013, 2013 IEEE Workshop on Robot Vision (WORV).

[26]  Morgan Quigley,et al.  ROS: an open-source Robot Operating System , 2009, ICRA 2009.