Boosting for high-dimensional linear models

We prove that boosting with the squared error loss, L2Boosting, is consistent for very high-dimensional linear models, where the number of predictor variables is allowed to grow essentially as fast as O(exp(sample size)), assuming that the true underlying regression function is sparse in terms of the ℓ1-norm of the regression coefficients. In the language of signal processing, this means consistency for de-noising with a strongly overcomplete dictionary whenever the underlying signal is sparse in the ℓ1-norm. We also propose an AIC-based method for tuning, namely for choosing the number of boosting iterations. This makes L2Boosting computationally attractive, since the algorithm no longer needs to be run multiple times for cross-validation, as has been common practice so far. We demonstrate L2Boosting on simulated data, in particular where the predictor dimension is large relative to the sample size, and on a difficult tumor-classification problem with gene expression microarray data.
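To make the two main ingredients concrete, the following is a minimal Python sketch of componentwise L2Boosting with an AIC-type stopping rule. It is an illustration under assumptions, not the paper's implementation: the function name `l2_boost_aic`, the step-length `nu = 0.1`, the iteration cap, and the particular corrected-AIC formula (computed from the trace of the boosting hat matrix) are choices made for this example.

```python
import numpy as np

def l2_boost_aic(X, y, max_iter=500, nu=0.1):
    """Componentwise L2Boosting with an AIC-type stopping rule (illustrative sketch)."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)            # center the predictors
    norms = (Xc ** 2).sum(axis=0)      # assumes no constant (zero-variance) columns
    intercept = y.mean()
    beta = np.zeros(p)
    resid = y - intercept

    B = np.full((n, n), 1.0 / n)       # hat matrix of the mean-only fit
    I = np.eye(n)
    best_aicc, best_m, best_beta = np.inf, 0, beta.copy()

    for m in range(1, max_iter + 1):
        # componentwise least squares: fit the current residuals on each predictor
        slopes = Xc.T @ resid / norms
        sse = (resid ** 2).sum() - slopes ** 2 * norms
        j = int(np.argmin(sse))        # predictor reducing the residual SS the most

        # update only the selected coefficient by a small step-length nu
        beta[j] += nu * slopes[j]
        resid = resid - nu * slopes[j] * Xc[:, j]

        # update the boosting hat matrix: B_m = B_{m-1} + nu * H_j (I - B_{m-1})
        Hj = np.outer(Xc[:, j], Xc[:, j]) / norms[j]
        B = B + nu * Hj @ (I - B)

        # corrected AIC with trace(B) playing the role of degrees of freedom
        df = np.trace(B)
        sigma2 = (resid ** 2).mean()
        if df + 2 < n:                 # criterion is only defined in this range
            aicc = np.log(sigma2) + (1 + df / n) / (1 - (df + 2) / n)
            if aicc < best_aicc:
                best_aicc, best_m, best_beta = aicc, m, beta.copy()

    return intercept, best_beta, best_m
```

Each iteration fits simple least squares of the current residuals on every single predictor, updates only the best-fitting coefficient by the fraction nu, and records the iteration at which the corrected AIC is minimal. The stopping iteration is thus chosen in a single run of the algorithm, which is the computational advantage over repeated cross-validation runs mentioned in the abstract.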
