Probabilistic Visual Learning for Object Representation

We present an unsupervised technique for visual learning based on density estimation in high-dimensional spaces using an eigenspace decomposition. Two types of density estimates are derived for modeling the training data: a multivariate Gaussian (for unimodal distributions) and a mixture-of-Gaussians model (for multimodal distributions). These probability densities are then used to formulate a maximum-likelihood estimation framework for visual search and target detection, in support of automatic object recognition and coding. Our learning technique is applied to the probabilistic visual modeling, detection, recognition, and coding of human faces and nonrigid objects, such as hands.
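The unimodal case can be illustrated with a minimal sketch. The abstract gives no formulas, so the details here are assumptions: the eigenspace is obtained by PCA of the training vectors, the density is a Gaussian whose variance along the k leading eigenvectors is the corresponding eigenvalue, and the discarded directions share a single averaged variance `rho`. All function and variable names are hypothetical, and the mixture-of-Gaussians case is not shown.

```python
import numpy as np

def fit_eigenspace_gaussian(X, k):
    """Fit a Gaussian density from the top-k principal components of X.

    X has shape (n_samples, n_features); k must be smaller than n_features.
    Returns (mean, components, eigvals, rho).
    """
    n, d = X.shape
    if not 0 < k < d:
        raise ValueError("k must satisfy 0 < k < n_features")
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data: rows of Vt are the principal directions.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    eigvals = s**2 / max(n - 1, 1)
    eps = 1e-12  # numerical floor to avoid division by zero
    # Assumption: leftover variance is spread evenly over the d - k
    # discarded dimensions and modeled as one isotropic variance rho.
    rho = max(eigvals[k:].sum() / (d - k), eps)
    return mean, Vt[:k], np.maximum(eigvals[:k], eps), rho

def log_likelihood(x, mean, components, eigvals, rho):
    """Gaussian log-density of one vector x under the fitted model."""
    d = mean.shape[0]
    k = components.shape[0]
    xc = x - mean
    y = components @ xc                      # coordinates inside the eigenspace
    in_space = np.sum(y**2 / eigvals)        # Mahalanobis term in the eigenspace
    residual = max(xc @ xc - y @ y, 0.0)     # squared distance from the eigenspace
    log_norm = (d * np.log(2 * np.pi)
                + np.sum(np.log(eigvals))
                + (d - k) * np.log(rho))
    return -0.5 * (in_space + residual / rho + log_norm)

def detect_most_likely(candidates, model):
    """Maximum-likelihood detection: index of the best-scoring candidate."""
    scores = [log_likelihood(c, *model) for c in candidates]
    return int(np.argmax(scores))
```

In a visual search setting, `candidates` would hold the vectorized image windows extracted at each location (and scale) of the input, and the window with the highest log-likelihood would be reported as the detected target.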
