A weakly supervised approach for object detection based on Soft-Label Boosting

Object detection is an important and challenging problem in computer vision. Classical object detection approaches such as background subtraction and saliency detection do not require manual collection of training samples, but are easily affected by noise factors such as luminance changes and cluttered backgrounds. Supervised learning-based approaches such as Boosting and SVM, on the other hand, are usually robust, but require substantial human effort to collect and label training samples. This study aims to combine the advantages of both kinds of approaches, and its contributions are two-fold: (i) a weakly supervised approach for object detection, which does not require manual collection and labelling of training samples; (ii) an extension of the Boosting algorithm, denoted Soft-Label Boosting, which can employ training samples with soft (probabilistic) labels instead of hard (binary) labels. Experimental results show that the proposed weakly supervised approach outperforms the state of the art, and even achieves performance comparable to that of supervised approaches.
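To make the idea of boosting with soft labels concrete, the following is a minimal sketch and not the Soft-Label Boosting algorithm of this paper, whose details the abstract does not give. It assumes one simple way to use probabilistic labels in discrete AdaBoost: each sample with soft label p is treated as a positive example of weight p and a negative example of weight 1 - p. The one-dimensional threshold stumps and the names train_stump, soft_label_boost and predict are hypothetical, chosen only for illustration.

```python
import math


def train_stump(xs, w_pos, w_neg):
    """Return (error, threshold, sign) of the best 1-D threshold stump.

    A stump predicts +1 when sign * (x - threshold) >= 0, else -1.
    Predicting +1 costs the sample's negative weight, predicting -1
    costs its positive weight (assumed soft-label error, see lead-in).
    """
    best = None
    for thr in sorted(set(xs)):
        for sign in (1, -1):
            err = 0.0
            for x, wp, wn in zip(xs, w_pos, w_neg):
                pred = 1 if sign * (x - thr) >= 0 else -1
                err += wn if pred == 1 else wp
            if best is None or err < best[0]:
                best = (err, thr, sign)
    return best


def soft_label_boost(xs, probs, rounds=5):
    """Discrete AdaBoost over samples with soft labels probs[i] = P(y = +1)."""
    n = len(xs)
    # Split each sample's initial weight 1/n between its two possible labels.
    w_pos = [p / n for p in probs]
    w_neg = [(1.0 - p) / n for p in probs]
    ensemble = []
    for _ in range(rounds):
        err, thr, sign = train_stump(xs, w_pos, w_neg)
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, thr, sign))
        # Standard AdaBoost update, applied to both label copies of a sample.
        for i, x in enumerate(xs):
            pred = 1 if sign * (x - thr) >= 0 else -1
            w_pos[i] *= math.exp(-alpha * pred)  # copy labelled +1
            w_neg[i] *= math.exp(alpha * pred)   # copy labelled -1
        total = sum(w_pos) + sum(w_neg)
        w_pos = [w / total for w in w_pos]
        w_neg = [w / total for w in w_neg]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * (1 if s * (x - t) >= 0 else -1) for a, t, s in ensemble)
    return 1 if score >= 0 else -1
```

In a weakly supervised setting, the soft labels passed as probs could come from an unsupervised cue such as a saliency or background-subtraction score rescaled to [0, 1]; that mapping is likewise an assumption here. With hard labels (every p equal to 0 or 1), the sketch reduces to ordinary discrete AdaBoost.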
