Bagging Provides Assumption-free Stability

Bagging is an important technique for stabilizing machine learning models. In this paper, we derive a finite-sample guarantee on the stability of bagging for any model. Our result places no assumptions on the distribution of the data, on the properties of the base algorithm, or on the dimensionality of the covariates. Our guarantee applies to many variants of bagging and is optimal up to a constant. Empirical results validate our findings, showing that bagging successfully stabilizes even highly unstable base algorithms.
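
To make the stability claim concrete, below is a minimal sketch of subbagging (bagging via subsampling without replacement, one of the variants the abstract alludes to) in Python. The function name subbagged_predict, the choice of base learner, and all parameter values are illustrative assumptions, not taken from the paper. The script compares the ensemble's prediction at a fixed test point before and after deleting one training point; this leave-one-out perturbation is the kind of change that a stability guarantee controls.

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor  # deliberately unstable base learner

    def subbagged_predict(X_train, y_train, x_test, m=200, k=None, seed=0):
        """Average the predictions of m base models, each fit on a random
        subsample of size k drawn without replacement (subbagging).
        All defaults here are illustrative, not from the paper."""
        n = len(X_train)
        k = k if k is not None else n // 2
        rng = np.random.default_rng(seed)
        preds = []
        for _ in range(m):
            idx = rng.choice(n, size=k, replace=False)  # random subsample
            model = DecisionTreeRegressor().fit(X_train[idx], y_train[idx])
            preds.append(model.predict(x_test.reshape(1, -1))[0])
        return float(np.mean(preds))

    # Stability check on synthetic data: drop one training point and
    # see how much the bagged prediction at a fixed test point moves.
    rng = np.random.default_rng(1)
    n, d = 500, 5
    X = rng.standard_normal((n, d))
    y = X[:, 0] + rng.standard_normal(n)
    x0 = rng.standard_normal(d)

    full = subbagged_predict(X, y, x0)
    loo = subbagged_predict(X[1:], y[1:], x0)  # remove the first training point
    print(f"|full - leave-one-out| = {abs(full - loo):.4f}")  # typically small

Intuitively, each deleted training point appears in only a fraction of the random subsamples, so its influence on the averaged prediction is damped; a single unstable tree can change wildly when one point is removed, while the ensemble average moves little.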
