Collision Anticipation via Deep Reinforcement Learning for Visual Navigation

Visual navigation is the ability of an autonomous agent to find its way in a large and complex environment based on visual information. It is indeed a fundamental problem in computer vision and robotics. In this paper, we propose a deep reinforcement learning approach which is able to learn to navigate a scene to reach a given visual target, but anticipating the possible collisions with the environment. Technically, we propose a map-less-based model, which follows an actor-critic reinforcement learning method where the reward function has been designed to be collision aware. We offer a thorough experimental evaluation of our solution in the AI2-THOR virtual environment, where the results show that our proposed method: (1) improves the state of the art in terms of number of steps and collisions; (2) is able to converge faster than a model which does not care about the collisions, simply searching for the shortest paths; and (3) offers an interesting generalization capability to reach visual targets that have never been seen during training.

[1]  Yoram Koren,et al.  Real-time obstacle avoidance for fast mobile robots in cluttered environments , 1990, Proceedings., IEEE International Conference on Robotics and Automation.

[2]  David Wooden,et al.  A guide to vision-based map building , 2006, IEEE Robotics & Automation Magazine.

[3]  Yoram Koren,et al.  The vector field histogram-fast obstacle avoidance for mobile robots , 1991, IEEE Trans. Robotics Autom..

[4]  Andrew G. Barto,et al.  Improving Elevator Performance Using Reinforcement Learning , 1995, NIPS.

[5]  Svetlana Lazebnik,et al.  Active Object Localization with Deep Reinforcement Learning , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[6]  Shigenobu Kobayashi,et al.  Reinforcement learning of walking behavior for a four-legged robot , 2001, Proceedings of the 40th IEEE Conference on Decision and Control (Cat. No.01CH37228).

[7]  Masahiro Tomono,et al.  3-D Object Map Building Using Dense Object Models with SIFT-based Recognition Features , 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[8]  Ali Farhadi,et al.  AI2-THOR: An Interactive 3D Environment for Visual AI , 2017, ArXiv.

[9]  Alex Graves,et al.  Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.

[10]  Ali Farhadi,et al.  Target-driven visual navigation in indoor scenes using deep reinforcement learning , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[11]  Minoru Asada,et al.  Purposive Behavior Acquisition for a Real Robot by Vision-Based Reinforcement Learning , 2005, Machine Learning.

[12]  Ben Tse,et al.  Autonomous Inverted Helicopter Flight via Reinforcement Learning , 2004, ISER.

[13]  Yoshiaki Shirai,et al.  Autonomous visual navigation of a mobile robot using a human-guided experience , 2002, Robotics Auton. Syst..

[14]  Alex Graves,et al.  Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.

[15]  Andrew J. Davison,et al.  Real-time simultaneous localisation and mapping with a single camera , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[16]  Michel Dhome,et al.  Outdoor autonomous navigation using monocular vision , 2005, 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems.