Self-concordant analysis for logistic regression

Most of the non-asymptotic theoretical work in regression is carried out for the square loss, where estimators can be obtained through closed-form expressions. In this paper, we use and extend tools from the convex optimization literature, namely self-concordant functions, to provide simple extensions of theoretical results for the square loss to the logistic loss. We apply the extension techniques to logistic regression with regularization by the $\ell_2$-norm and regularization by the $\ell_1$-norm, showing that new results for binary classification through logistic regression can be easily derived from corresponding results for least-squares regression.
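
For context, here is a minimal sketch of the property underlying this style of analysis (standard facts stated for orientation; the notation $\varphi$ and $\sigma$ is introduced here for illustration and is not taken verbatim from the paper). A three-times differentiable convex function $F$ is self-concordant in the classical sense of Nesterov and Nemirovskii if
$$ |F'''(x)[h,h,h]| \leq 2 \left( F''(x)[h,h] \right)^{3/2} $$
for all $x$ and all directions $h$. The logistic loss $\varphi(u) = \log(1 + e^{-u})$ satisfies the closely related inequality $|\varphi'''(u)| \leq \varphi''(u)$: writing $\sigma(u) = (1 + e^{-u})^{-1}$, one checks that $\varphi''(u) = \sigma(u)(1 - \sigma(u))$ and $\varphi'''(u) = \varphi''(u)(1 - 2\sigma(u))$, with $|1 - 2\sigma(u)| \leq 1$. Bounds of this type control how quickly the Hessian of the empirical risk can vary, which is, roughly, what allows arguments written for the quadratic (square-loss) case to be transferred to the logistic loss.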
