Vision-based robot localization without explicit object models

We consider the problem of localizing a robot in an initially unfamiliar environment from visual input. The robot is not given a map of the environment, but it does have access to a collection of training examples, each of which pairs a video image with the location and orientation at which it was observed. We address two variants of this problem: estimating the translation of a moving robot when its orientation is known, and estimating both translation and orientation for a mobile robot. Reconstructing the scene to build a metric map of the environment from video images alone is difficult. We avoid this by having the robot learn to convert a set of image measurements directly into a representation of its pose (position and orientation), which yields a metric estimate of the robot's location within the region covered by the statistical map we build. Localization can be performed online without a prior location estimate. The conversion from visual data to camera pose is implemented by a multilayer neural network trained with backpropagation. A key aspect of the approach is an inconsistency measure used to reject incorrect data and to estimate components of the pose vector. The experimental results reported in this paper suggest that the technique is accurate and flexible, while its online computational cost is very low.
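The core idea above, regressing a pose vector from image measurements with a multilayer network trained by backpropagation, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the network sizes, synthetic training data, tanh activation, and learning rate are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for the paper's training examples: each row of X is a
# vector of image measurements, each row of Y a pose label (e.g. x, y, theta).
n_samples, n_features, n_hidden, n_pose = 200, 16, 32, 3
X = rng.normal(size=(n_samples, n_features))
true_W = rng.normal(size=(n_features, n_pose))
Y = np.tanh(X @ true_W)            # illustrative pose labels, scaled to [-1, 1]

# One hidden layer with tanh activation and a linear output layer.
W1 = rng.normal(scale=0.1, size=(n_features, n_hidden))
W2 = rng.normal(scale=0.1, size=(n_hidden, n_pose))
lr = 0.01

init_mse = float(np.mean((np.tanh(X @ W1) @ W2 - Y) ** 2))

for _ in range(500):
    H = np.tanh(X @ W1)            # forward pass: hidden activations
    P = H @ W2                     # predicted pose vectors
    err = P - Y                    # gradient of squared error at the output
    # Backpropagate the error through both weight matrices.
    gW2 = H.T @ err / n_samples
    gH = (err @ W2.T) * (1.0 - H ** 2)   # tanh derivative
    gW1 = X.T @ gH / n_samples
    W1 -= lr * gW1
    W2 -= lr * gW2

mse = float(np.mean((np.tanh(X @ W1) @ W2 - Y) ** 2))
```

At runtime, localization amounts to a single forward pass through the trained network, which is consistent with the very low online cost the abstract claims.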
