Paper Information - Learning Appearance Based Models: Hierarchical Mixtures of Experts Approach

This paper describes a new technique for object recognition based on learning appearance models. The image is decomposed into local regions, which are described by a new texture representation derived from the output of multiscale, multiorientation filter banks. We call this representation "Generalized Second Moments", as it can be viewed as a generalization of the windowed second moment matrix representation used by Garding & Lindeberg. Class-characteristic local texture features and their global composition are learned by a hierarchical mixture of experts architecture. The technique is applied to a vehicle database consisting of five general car categories (Sedan, Van with back-doors, Van without back-doors, old Sedan, and Volkswagen Bug). This is a difficult problem with considerable in-class variation. Our technique has a 6.5% misclassification rate, compared with 17.4% for eigen-images and 15.7% for nearest neighbors.
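The abstract does not spell out how the representation is built. As a point of reference only, the classic windowed second moment matrix that it is said to generalize can be sketched as below. The function name `windowed_second_moment`, the box-shaped window, and the finite-difference gradient are illustrative assumptions; the paper itself works with multiscale, multiorientation filter bank outputs, which this sketch does not reproduce.

```python
import numpy as np


def windowed_second_moment(image, half_width=2):
    """Return an (H, W, 2, 2) array of windowed second moment matrices.

    Assumption: a uniform (box) window of side 2*half_width + 1 and
    finite-difference gradients stand in for the smoothing window and
    derivative filters a real implementation would choose.
    """
    img = np.asarray(image, dtype=float)
    height, width = img.shape

    # np.gradient returns the derivative along rows (y) first, then columns (x).
    gy, gx = np.gradient(img)

    # Per-pixel outer product of the gradient with itself, flattened to 4 values.
    products = np.stack([gx * gx, gx * gy, gx * gy, gy * gy], axis=-1)

    # Average the products over the window; edges are handled by replication.
    padded = np.pad(
        products,
        ((half_width, half_width), (half_width, half_width), (0, 0)),
        mode="edge",
    )
    side = 2 * half_width + 1
    summed = np.zeros_like(products)
    for dy in range(side):
        for dx in range(side):
            summed += padded[dy:dy + height, dx:dx + width]

    return (summed / (side * side)).reshape(height, width, 2, 2)
```

Calling `windowed_second_moment(img)` on a 2-D grayscale array yields one symmetric 2x2 matrix per pixel. A natural generalization, stated here as an assumption rather than as the paper's definition, would replace the two-component gradient with a longer vector of filter bank responses, giving a larger matrix per region.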

Jitendra Malik | Christoph Bregler
