Clustering in image space for place recognition and visual annotations for human-robot interaction

The classical approach to vision-guided navigation for autonomous robots relies on three-dimensional (3-D) geometrical descriptions of the scene; these are known as model-based approaches. However, such approaches do not make the user's task easier, because they require the user to supply geometrically precise models of the 3-D environment. In this paper, we propose the use of "annotations" posted on some type of blackboard or "descriptive" map to facilitate this user-robot interaction. We show that, with this technique, user commands can be as simple as "go to label 5." Building such a mechanism requires new approaches to vision-guided mobile robot navigation. We show that this can be achieved by using mixture models within an appearance-based paradigm. Mixture models are more useful in practice than other pattern recognition methods such as principal component analysis (PCA) or Fisher discriminant analysis (FDA), also known as linear discriminant analysis (LDA), because they can represent nonlinear subspaces. However, mixture models are usually learned with the expectation-maximization (EM) algorithm, a gradient ascent technique, so the system cannot always converge to the desired solution because of the local maxima problem. To resolve this, a genetic version of the EM algorithm is used. We then show the capabilities of this approach on a navigation task that uses the annotations described above.
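To make the local maxima issue concrete, the following is a minimal sketch, not the algorithm used in the paper. It fits a one-dimensional Gaussian mixture with EM and wraps it in a simple population-based search: several candidate solutions are refined by EM, the best are kept by log-likelihood, and the rest are replaced by mutated copies. The function names (`em_step`, `fit_mixture`), the 1-D setting, the mutation rule, and all parameter values are assumptions chosen for illustration; a real appearance-based system would work on high-dimensional image features.

```python
import math
import random

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def log_likelihood(data, params):
    """Log-likelihood of the data under the mixture (weights, means, variances)."""
    weights, means, variances = params
    total = 0.0
    for x in data:
        p = sum(w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
        total += math.log(max(p, 1e-300))  # guard against log(0)
    return total

def em_step(data, params, min_var=1e-3):
    """One EM iteration: E-step responsibilities, then M-step updates."""
    weights, means, variances = params
    k = len(weights)
    resp = []
    for x in data:
        joint = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        s = sum(joint)
        # Fall back to uniform responsibilities if every density underflowed.
        resp.append([j / s for j in joint] if s > 0.0 else [1.0 / k] * k)
    new_w, new_m, new_v = [], [], []
    for j in range(k):
        nj = max(sum(r[j] for r in resp), 1e-12)
        mean = sum(r[j] * x for r, x in zip(resp, data)) / nj
        var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / nj
        new_w.append(nj / len(data))
        new_m.append(mean)
        new_v.append(max(var, min_var))  # floor avoids collapsing components
    return new_w, new_m, new_v

def fit_mixture(data, k=2, population=6, generations=5, em_iters=10, seed=0):
    """Population-based EM (illustrative): refine, select, mutate, repeat."""
    rng = random.Random(seed)
    spread = (max(data) - min(data)) or 1.0

    def random_params():
        means = [rng.choice(data) for _ in range(k)]
        return [1.0 / k] * k, means, [(spread / k) ** 2] * k

    def mutate(params):
        w, m, v = params
        # Hypothetical mutation rule: jitter the means only.
        return list(w), [mi + rng.gauss(0.0, 0.1 * spread) for mi in m], list(v)

    pop = [random_params() for _ in range(population)]
    best = pop[0]
    for _ in range(generations):
        refined = []
        for p in pop:
            for _ in range(em_iters):
                p = em_step(data, p)
            refined.append(p)
        refined.sort(key=lambda q: log_likelihood(data, q), reverse=True)
        best = refined[0]
        survivors = refined[: max(1, population // 2)]
        children = [mutate(rng.choice(survivors)) for _ in range(population - len(survivors))]
        pop = survivors + children
    return best

if __name__ == "__main__":
    rng = random.Random(1)
    sample = [rng.gauss(0.0, 1.0) for _ in range(100)] + [rng.gauss(10.0, 1.0) for _ in range(100)]
    weights, means, variances = fit_mixture(sample, k=2)
    print(sorted(means))
```

Keeping several candidates alive is what lets a search of this kind escape a poor starting point that would trap a single EM run; the selection and mutation operators in an actual genetic EM may differ substantially from the ones assumed here.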
