Learning Generative Models of Scene Features

We present a method for learning a set of generative models that represent how selected image-domain features of a scene vary as a function of camera viewpoint. Such models are important for robotic tasks such as probabilistic position estimation (i.e., localization), as well as for visualization. Our approach entails both the selection of image-domain features and the synthesis of models of their visual behavior. The proposed model can generate maximum-likelihood views of automatically selected features, and can also measure the likelihood of a particular view from a particular camera position. The models are trained by regularizing observations of the features from known camera locations, and model uncertainty is evaluated using cross-validation. The features themselves are initially selected automatically as salient points by a measure of visual attention and are then tracked across multiple views. Although the work is motivated by robot localization, the results have implications for image interpolation, virtual scene reconstruction, and object recognition. The paper presents a formulation of the problem and illustrative experimental results.
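The training and evaluation loop summarized above can be illustrated with a minimal sketch. It is one possible reading of the abstract, not the paper's implementation: it assumes each feature observation is a 2-D image position, that the regularizer is a ridge term on a Gaussian radial-basis interpolant over camera positions, and that the leave-one-out residuals define a diagonal Gaussian noise model. All function names, the kernel choice, and the parameters `lam` and `sigma` are hypothetical.

```python
import numpy as np


def gaussian_kernel(a, b, sigma):
    """Gaussian RBF kernel between pose sets a (n, d) and b (m, d)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def fit(poses, obs, lam, sigma):
    """Solve the regularized system (K + lam * I) W = obs for the weights W."""
    K = gaussian_kernel(poses, poses, sigma)
    return np.linalg.solve(K + lam * np.eye(len(poses)), obs)


def predict(poses, weights, query, sigma):
    """Predicted observation of the feature at a single query pose."""
    return (gaussian_kernel(query[None, :], poses, sigma) @ weights)[0]


def loo_variance(poses, obs, lam, sigma):
    """Leave-one-out cross-validation: mean squared residual per output dimension."""
    n = len(poses)
    residuals = np.empty_like(obs, dtype=float)
    for i in range(n):
        keep = np.arange(n) != i  # hold out training view i
        w = fit(poses[keep], obs[keep], lam, sigma)
        residuals[i] = obs[i] - predict(poses[keep], w, poses[i], sigma)
    return (residuals ** 2).mean(axis=0)


def log_likelihood(z, predicted, variance):
    """Log-likelihood of observation z under an assumed diagonal Gaussian model."""
    return float(-0.5 * np.sum((z - predicted) ** 2 / variance
                               + np.log(2.0 * np.pi * variance)))


# Synthetic training set (made up for illustration): a 5x5 grid of known
# camera positions and the image position of one tracked feature at each.
xs, ys = np.meshgrid(np.arange(5.0), np.arange(5.0))
poses = np.column_stack([xs.ravel(), ys.ravel()])
obs = np.column_stack([10.0 + 2.0 * poses[:, 0], 5.0 + 3.0 * poses[:, 1]])

weights = fit(poses, obs, lam=1e-8, sigma=1.0)
variance = loo_variance(poses, obs, lam=1e-8, sigma=1.0)

query = np.array([1.5, 2.5])  # a camera position not in the training set
predicted = predict(poses, weights, query, sigma=1.0)
print(predicted, variance, log_likelihood(predicted, predicted, variance))
```

Under these assumptions, `predict` plays the role of generating the most likely view of a feature from a given camera position, and `log_likelihood` scores an actual observation against that prediction. A localization routine could then compare the scores of one observation across candidate camera positions.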
