Learning a Sparse Representation for Object Detection

We present an approach for learning to detect objects in still gray images that is based on a sparse, part-based representation of objects. A vocabulary of information-rich object parts is automatically constructed from a set of sample images of the object class of interest. Images are then represented using parts from this vocabulary, along with the spatial relations observed among them. Based on this representation, a feature-efficient learning algorithm is used to learn to detect instances of the object class. The framework can be applied to any object with distinguishable parts in a relatively fixed spatial configuration. We report experiments on images of side views of cars. Our experiments show that the method achieves high detection accuracy on a difficult test set of real-world images and is highly robust to partial occlusion and background variation. In addition, we discuss and offer solutions to several methodological issues that bear on how the research community evaluates object detection approaches.
