Paper information - Clustering in image space for place recognition and visual annotations for human-robot interaction

The classical approach to vision-guided navigation for autonomous robots relies on three-dimensional (3-D) geometrical descriptions of the scene, known as model-based approaches. However, these approaches do not facilitate the user's task, because they require the user to supply geometrically precise models of the 3-D environment. In this paper, we propose the use of "annotations" posted on some type of blackboard or "descriptive" map to facilitate this user-robot interaction. We show that, with this technique, user commands can be as simple as "go to label 5." Building such a mechanism requires new approaches to vision-guided mobile robot navigation. We show that this can be achieved by using mixture models within an appearance-based paradigm. Mixture models are more useful in practice than other pattern recognition methods such as principal component analysis (PCA) or Fisher discriminant analysis (FDA), also known as linear discriminant analysis (LDA), because they can represent nonlinear subspaces. However, mixture models are usually learned using the expectation-maximization (EM) algorithm, which is a gradient ascent technique, so the system cannot always converge to the desired final solution because of the local maxima problem. To resolve this, a genetic version of the EM algorithm is used. We then show the capabilities of this approach on a navigation task that uses the annotations described above.
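The local maxima problem mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' genetic EM: it fits a Gaussian mixture to scalar data (standing in for image features) with plain EM and, as a much simpler remedy, keeps the best of several random restarts. All function names, parameters, and the toy setup are assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, weights, means, variances):
    """Log-likelihood of the data under the mixture."""
    total = 0.0
    for x in data:
        p = sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
        total += math.log(p + 1e-300)  # guard against log(0)
    return total


def em_fit(data, k, rng, iters=50):
    """One EM run from a random initialisation (hypothetical helper)."""
    means = rng.sample(data, k)  # initial means drawn from the data
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point
        resp = []
        for x in data:
            probs = [w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            s = sum(probs) + 1e-300
            resp.append([p / s for p in probs])
        # M-step: re-estimate weights, means, and variances
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid variance collapse
    return weights, means, variances, log_likelihood(data, weights, means, variances)


def best_of_restarts(data, k, restarts=10, seed=0):
    """Run EM several times and keep the fit with the highest log-likelihood.

    Random restarts are a stand-in here, not the genetic EM of the paper.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(restarts):
        fit = em_fit(data, k, rng)
        if best is None or fit[3] > best[3]:
            best = fit
    return best
```

Each call to `em_fit` can stop at a different local maximum depending on its starting point; comparing final log-likelihoods across restarts is the crudest way to reduce that dependence, whereas a genetic variant would instead evolve a population of candidate solutions.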

Jordi Vitrià | Aleix M. Martínez
