Improved Boosting Performance by Explicit Handling of Ambiguous Positive Examples

Visual classes naturally contain ambiguous examples, which vary depending on the feature and classifier used and are hard to disambiguate from surrounding negatives without overfitting. Boosting in particular tends to overfit to such hard and ambiguous examples because of its flexibility and typically aggressive loss functions. We propose a two-pass learning method that identifies ambiguous examples and relearns, either subjecting them to an exclusion function or handling them in a later stage of an inverted cascade. We provide an experimental comparison of different boosting algorithms on the VOC2007 dataset, trained with and without the proposed extension. The exclusion extension improves the performance of almost all of the tested boosting algorithms without adding any test-time cost. The proposed inverted cascade adds some test-time cost but yields further improvements in performance. Our results also suggest that outlier exclusion is complementary to positive jittering and hard negative mining.
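To make the two-pass idea concrete, below is a minimal sketch assuming a generic boosting classifier (scikit-learn's AdaBoostClassifier) and a simple low-margin criterion for flagging ambiguous positives; the margin_threshold parameter and the margin-based criterion are illustrative assumptions, not the exact exclusion function used in the paper.

```python
# Minimal two-pass exclusion sketch (illustrative assumptions, not the paper's exact method).
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

def train_with_exclusion(X, y, margin_threshold=0.1):
    # Pass 1: train a boosting classifier on all examples.
    clf = AdaBoostClassifier(n_estimators=200)
    clf.fit(X, y)

    # Flag ambiguous positives: positive examples whose signed score
    # (margin) remains below the threshold after the first pass.
    scores = clf.decision_function(X)              # signed confidence per example
    ambiguous = (y == 1) & (scores < margin_threshold)

    # Pass 2: relearn with the ambiguous positives excluded.
    keep = ~ambiguous
    clf_final = AdaBoostClassifier(n_estimators=200)
    clf_final.fit(X[keep], y[keep])
    return clf_final, ambiguous
```

In the inverted-cascade variant described above, the flagged examples would instead be passed to a later classification stage rather than discarded, at some additional test-time cost.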
