High speed obstacle avoidance using monocular vision and reinforcement learning

We consider the task of driving a remote-control car at high speed through unstructured outdoor environments. We present an approach in which supervised learning is first used to estimate depths from single monocular images. The learning algorithm can be trained either on real camera images labeled with ground-truth distances to the closest obstacles or on a training set of synthetic graphics images. The resulting algorithm learns monocular vision cues that accurately estimate the relative depths of obstacles in a scene. Reinforcement learning/policy search is then applied within a simulator that renders synthetic scenes, yielding a control policy that selects a steering direction as a function of the vision system's output. We present results evaluating the predictive ability of the algorithm both on held-out test data and in actual autonomous driving experiments.
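To make the pipeline concrete, the following is a minimal toy sketch, not the paper's implementation: it assumes the supervised stage produces a linear depth estimate per vertical image stripe, and the policy steers toward the stripe with the greatest estimated clearance. All function names, the linear model, and the index-to-angle mapping are illustrative assumptions.

```python
# Toy sketch of the two-stage approach (illustrative assumptions throughout):
# 1) supervised stage: estimate a depth for each vertical image stripe;
# 2) control stage: steer toward the stripe with the largest estimated depth.

def estimate_stripe_depths(features, weights):
    """Linear per-stripe depth estimate (hypothetical model).

    features: list of feature vectors, one per vertical stripe
    weights:  weight vector learned in the supervised stage
    """
    return [sum(w * f for w, f in zip(weights, feat)) for feat in features]

def steering_angle(depths, fov_deg=90.0):
    """Pick the stripe with the greatest estimated depth and map its
    index to a steering angle in [-fov/2, +fov/2] degrees."""
    n = len(depths)
    best = max(range(n), key=lambda i: depths[i])
    return (best + 0.5) / n * fov_deg - fov_deg / 2.0
```

For example, with three stripes and estimated depths `[2.1, 5.4, 3.3]`, the deepest stripe is the center one, so the sketch commands a steering angle of 0 degrees (straight ahead). In the paper, the mapping from vision output to steering is instead learned by policy search (PEGASUS) in the graphics simulator rather than hand-coded.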
