Weak Hypotheses and Boosting for Generic Object Detection and Recognition

In this paper we describe the first stage of a new learning system for object detection and recognition. For our system we propose Boosting (5) as the underlying learning technique. This allows the use of very diverse sets of visual features in the learning process within a com- mon framework: Boosting — together with a weak hypotheses finder — may choose very inhomogeneous features as most relevant for combina- tion into a final hypothesis. As another advantage the weak hypotheses finder may search the weak hypotheses space without explicit calculation of all available hypotheses, reducing computation time. This contrasts the related work of Agarwal and Roth (1) where Winnow was used as learning algorithm and all weak hypotheses were calculated explicitly. In our first empirical evaluation we use four types of local descriptors: two basic ones consisting of a set of grayvalues and intensity moments and two high level descriptors: moment invariants (8) and SIFTs (12). The descriptors are calculated from local patches detected by an inter- est point operator. The weak hypotheses finder selects one of the local patches and one type of local descriptor and efficiently searches for the most discriminative similarity threshold. This differs from other work on Boosting for object recognition where simple rectangular hypotheses (22) or complex classifiers (20) have been used. In relatively simple images, where the objects are prominent, our approach yields results comparable to the state-of-the-art (3). But we also obtain very good results on more complex images, where the objects are located in arbitrary positions, poses, and scales in the images. These results indicate that our flexible approach, which also allows the inclusion of features from segmented re- gions and even spatial relationships, leads us a significant step towards generic object recognition.

[1]  Christopher G. Harris,et al.  A Combined Corner and Edge Detector , 1988, Alvey Vision Conference.

[2]  Edward H. Adelson,et al.  The Design and Use of Steerable Filters , 1991, IEEE Trans. Pattern Anal. Mach. Intell..

[3]  Roberto Cipolla,et al.  Computer Vision — ECCV '96 , 1996, Lecture Notes in Computer Science.

[4]  Tony Lindeberg,et al.  Edge Detection and Ridge Detection with Automatic Scale Selection , 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[5]  Luc Van Gool,et al.  Affine/ Photometric Invariants for Planar Intensity Patterns , 1996, ECCV.

[6]  Cordelia Schmid,et al.  Local Grayvalue Invariants for Image Retrieval , 1997, IEEE Trans. Pattern Anal. Mach. Intell..

[7]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1997, EuroCOLT.

[8]  Michael Werman,et al.  Ridge's corner detection and correspondence , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[9]  Rolf P. Würtz,et al.  Corner Detection in Color Images by Multiscale Combination of End-Stopped Cortical Cells , 1997, ICANN.

[10]  ROBERT LAGANIÈRE,et al.  A morphological operator for corner detection , 1998, Pattern Recognit..

[11]  Manfred K. Warmuth,et al.  Efficient Learning With Virtual Threshold Gates , 1995, Inf. Comput..

[12]  David G. Lowe,et al.  Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[13]  Pietro Perona,et al.  Unsupervised learning of models for object recognition , 2000 .

[14]  Pietro Perona,et al.  Unsupervised Learning of Models for Recognition , 2000, ECCV.

[15]  Christopher M. Bishop,et al.  Non-linear Bayesian Image Modelling , 2000, ECCV.

[16]  Cordelia Schmid,et al.  Indexing Based on Scale Invariant Interest Points , 2001, ICCV.

[17]  Martial Hebert,et al.  Object recognition using boosted discriminants , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[18]  Paul A. Viola,et al.  Rapid object detection using a boosted cascade of simple features , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[19]  Dan Roth,et al.  Learning a Sparse Representation for Object Detection , 2002, ECCV.

[20]  Mads Nielsen,et al.  Computer Vision — ECCV 2002 , 2002, Lecture Notes in Computer Science.

[21]  Cordelia Schmid,et al.  An Affine Invariant Interest Point Detector , 2002, ECCV.

[22]  Thomas S. Huang,et al.  Fusion of global and local information for object detection , 2002, Object recognition supported by user interaction for service robots.

[23]  Pietro Perona,et al.  Object class recognition by unsupervised scale-invariant learning , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[24]  Cordelia Schmid,et al.  Selection of scale-invariant parts for object class recognition , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[25]  Cordelia Schmid,et al.  Evaluation of Interest Point Detectors , 2000, International Journal of Computer Vision.

[26]  Takeo Kanade,et al.  Object Detection Using the Statistics of Parts , 2004, International Journal of Computer Vision.

[27]  Tony Lindeberg,et al.  Feature Detection with Automatic Scale Selection , 1998, International Journal of Computer Vision.

[28]  Cordelia Schmid,et al.  A performance evaluation of local descriptors , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.