Penalized principal logistic regression for sparse sufficient dimension reduction

Sufficient dimension reduction (SDR) is a successful tool for reducing the dimensionality of predictors by finding the central subspace, the minimal subspace of the predictors that preserves all of the regression information. When the predictor dimension is large, it is often assumed that only a small number of predictors are informative. In this regard, sparse SDR is desirable, as it achieves variable selection and dimension reduction simultaneously. We propose principal logistic regression (PLR) as a new SDR tool and further develop its penalized version for sparse SDR. Asymptotic analysis shows that the penalized PLR enjoys the oracle property. Numerical investigation supports the advantageous performance of the proposed methods.
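
To make the idea concrete, below is a minimal, illustrative Python sketch of a sparse PLR-style estimator. It assumes the slicing construction popularized by principal support vector machines (Li et al., 2011): the response is dichotomized at several quantile cutpoints, a penalized logistic regression is fitted for each cut, and the leading singular vectors of the stacked slope vectors estimate a basis of the central subspace. The lasso penalty and all tuning choices here are stand-ins for the paper's penalized estimator, not the authors' implementation; the function name `sparse_plr` is hypothetical.

```python
# Sketch of sparse PLR-style sufficient dimension reduction.
# Assumption: slopes of logistic regressions fitted to dichotomized
# responses lie (at the population level) in the central subspace,
# so their principal directions estimate a basis of it.
import numpy as np
from sklearn.linear_model import LogisticRegression

def sparse_plr(X, y, n_cuts=5, d=1, C=1.0):
    """Estimate a d-dimensional basis of the central subspace."""
    X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize predictors
    cuts = np.quantile(y, np.linspace(0.1, 0.9, n_cuts))
    slopes = []
    for q in cuts:
        z = (y > q).astype(int)                # dichotomized response
        clf = LogisticRegression(penalty="l1", solver="liblinear", C=C)
        clf.fit(X, z)                          # L1 penalty induces sparsity
        slopes.append(clf.coef_.ravel())
    B = np.vstack(slopes)                      # n_cuts x p slope matrix
    # Leading right singular vectors span the estimated central subspace.
    _, _, Vt = np.linalg.svd(B, full_matrices=False)
    return Vt[:d].T                            # p x d basis estimate

# Toy usage: y depends on X only through X @ beta, with sparse beta.
rng = np.random.default_rng(0)
n, p = 400, 10
X = rng.normal(size=(n, p))
beta = np.zeros(p); beta[:2] = [1.0, -1.0]     # only 2 active predictors
y = (X @ beta) ** 3 + rng.normal(size=n)
print(sparse_plr(X, y, d=1).ravel().round(2))  # mass on first 2 coordinates
```

In this sketch the recovered direction concentrates on the two active predictors while the L1 penalty shrinks the remaining coefficients toward zero, mimicking the simultaneous variable selection and dimension reduction described in the abstract.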
