Regularization Through Feature Knock Out

Abstract: In this paper, we present and analyze a novel regularization technique based on augmenting our dataset with corrupted copies of the original data. The motivation is that, since the learning algorithm has no information about which parts of the data are reliable, it is forced to produce more robust classification functions. We then demonstrate how this regularization leads to redundancy in the resulting classifiers, which stands somewhat in contrast to common interpretations of the Occam's razor principle. Within this framework, we propose a simple addition to the gentle boosting algorithm that enables it to work with only a few examples. We test the new algorithm on a variety of datasets and show convincing results.
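To make the augmentation concrete, here is a minimal NumPy sketch of one plausible reading of the knockout idea: each pseudo-example copies a training example and overwrites a single randomly chosen feature with that feature's value from another training example, keeping the original label. The function name, interface, and corruption scheme are illustrative assumptions, not code from the paper.

```python
import numpy as np

def feature_knockout(X, y, n_new, seed=None):
    """Augment (X, y) with corrupted copies (illustrative sketch).

    Each new example is a randomly chosen training example whose
    value at one randomly chosen feature is replaced by that same
    feature's value taken from another random training example.
    The label of the source example is kept unchanged.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    src = rng.integers(0, n, size=n_new)    # examples to corrupt
    donor = rng.integers(0, n, size=n_new)  # examples supplying the value
    feat = rng.integers(0, d, size=n_new)   # feature knocked out per copy

    X_new = X[src].copy()
    X_new[np.arange(n_new), feat] = X[donor, feat]

    return np.vstack([X, X_new]), np.concatenate([y, y[src]])
```

Because the learner cannot tell which examples carry a corrupted feature, it is discouraged from relying on any single feature, which is one way to read the redundancy effect described in the abstract.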
