Training Object Detection Models with Weakly Labeled Data

Appearance-based object detection systems that use statistical models to capture real-world variations in appearance have been shown to achieve good detection performance. The parameters of these statistical models are typically learned automatically from labeled training images. This process can be difficult because a large number of labeled training examples may be needed to model appearance variation accurately. In this work we describe a method in which a small number of fully labeled training examples, augmented with a set of weakly labeled examples, is used to train a detector that performs better than one trained on the reduced set of fully labeled examples alone.
