HIGH-DIMENSIONAL GENERALIZED LINEAR MODELS AND THE LASSO

We consider high-dimensional generalized linear models with Lipschitz loss functions, and prove a nonasymptotic oracle inequality for the empirical risk minimizer with the Lasso penalty. The penalty is based on the coefficients in the linear predictor, after normalization with the empirical norm. Examples include logistic regression, density estimation, and classification with the hinge loss. Least squares regression is also discussed.
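To fix ideas, the estimator described above can be written as a display. The following is a sketch reconstructed from the abstract alone; the notation (loss gamma, dictionary psi_j, weights sigma-hat_j, tuning parameter lambda_n) is illustrative and inferred from the abstract, not necessarily the paper's own.

% Sketch (in LaTeX) of the weighted Lasso-penalized empirical risk
% minimizer suggested by the abstract; all symbols here are assumed
% notation, not quoted from the paper.
\[
  \hat\beta_n \in \arg\min_{\beta \in \mathbb{R}^m}
    \Biggl\{ \frac{1}{n} \sum_{i=1}^{n} \gamma\bigl(f_\beta(X_i), Y_i\bigr)
      + \lambda_n \sum_{j=1}^{m} \hat\sigma_j \, \lvert \beta_j \rvert \Biggr\},
  \qquad
  f_\beta = \sum_{j=1}^{m} \beta_j \psi_j ,
\]
where \(\gamma\) is a Lipschitz loss (for instance the logistic or hinge loss) and
\(\hat\sigma_j^2 = n^{-1} \sum_{i=1}^{n} \psi_j^2(X_i)\) is the squared empirical
norm of the \(j\)th covariate. Weighting the \(\ell_1\) penalty by \(\hat\sigma_j\)
puts each coefficient on the scale of its own predictor, which is what "normalization
with the empirical norm" refers to.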
