Localized Gaussian width of $M$-convex hulls with applications to Lasso and convex aggregation

Upper and lower bounds are derived for the Gaussian mean width of the intersection of a convex hull of $M$ points with a Euclidean ball of a given radius. The upper bound holds for any collection of extreme points bounded in Euclidean norm. The upper and lower bounds match up to a multiplicative constant whenever the extreme points satisfy a one-sided Restricted Isometry Property. These bounds are then applied to study the Lasso estimator in fixed-design regression, the Empirical Risk Minimizer in the anisotropic persistence problem, and the convex aggregation problem in density estimation.
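For concreteness, the quantity studied in the abstract is the standard localized Gaussian mean width; a minimal statement of the definition (with $T = \mathrm{conv}(v_1,\dots,v_M)$ the convex hull and $rB_2^d$ the Euclidean ball of radius $r$) reads:

$$
w\bigl(T \cap rB_2^d\bigr) \;=\; \mathbb{E}\,\sup_{t \in T \cap rB_2^d} \langle g, t \rangle,
\qquad g \sim \mathcal{N}(0, I_d),
$$

where $T \cap rB_2^d$ localizes the convex hull to the ball of radius $r$ around the origin.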
