Regularized Quantile Regression and Robust Feature Screening for Single Index Models.

We propose both a penalized quantile regression and an independence screening procedure to identify important covariates and to exclude unimportant ones for a general class of ultrahigh dimensional single-index models, in which the conditional distribution of the response depends on the covariates via a single-index structure. We observe that linear quantile regression yields a consistent estimator of the direction of the index parameter in the single-index model. This observation dramatically reduces the computational complexity of selecting important covariates in the single-index model. We establish an oracle property for the penalized quantile regression estimator when the covariate dimension grows at an exponential rate in the sample size. From a practical perspective, however, when the covariate dimension is extremely large, the penalized quantile regression may suffer from at least two drawbacks: a lack of computational expediency and algorithmic instability. To address these issues, we propose an independence screening procedure that is robust to model misspecification and performs reliably when the distribution of the response variable is heavy-tailed or the response realizations contain extreme values. The new independence screening procedure offers a useful complement to the penalized quantile regression, since it helps to reduce the covariate dimension from ultrahigh dimensionality to a moderate scale. Based on the reduced model, the penalized linear quantile regression further refines the selection of important covariates at different quantile levels. We examine the finite sample performance of the newly proposed procedure by Monte Carlo simulations and demonstrate the proposed methodology by an empirical analysis of a real data set.
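The screening stage described above can be illustrated with a minimal sketch. The abstract does not give the screening statistic, so the code below assumes a generic rank-based marginal utility (the absolute sample covariance between each standardized covariate and the empirical distribution function of the response); this is an illustrative stand-in, not necessarily the authors' statistic. The function name `rank_screen`, the simulated model, and the cutoff `d` are all hypothetical choices. Because only the ranks of the response enter the utility, extreme response values have bounded influence.

```python
import numpy as np


def rank_screen(X, y, d):
    """Keep the d covariates with the largest rank-based marginal utility.

    Assumed utility (illustrative only): |sample covariance| between each
    standardized column of X and the empirical CDF of y at the observations.
    """
    n = X.shape[0]
    # Empirical CDF of y evaluated at each observation: rank / n.
    F = (np.argsort(np.argsort(y)) + 1) / n
    # Standardize each covariate so utilities are comparable across columns.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    utility = np.abs(Xs.T @ (F - F.mean())) / n
    # Indices of the d largest utilities, in decreasing order.
    keep = np.argsort(utility)[::-1][:d]
    return keep, utility


# Simulated single-index model with heavy-tailed (Cauchy) noise.
# The first three covariates are the active ones (an assumption of this demo).
rng = np.random.default_rng(0)
n, p = 400, 1000
X = rng.standard_normal((n, p))
index = X[:, :3].sum(axis=1)
y = 2 * index + index**3 + rng.standard_cauchy(n)

keep, utility = rank_screen(X, y, d=20)
print(sorted(keep.tolist()))
```

In a two-stage analysis of the kind the abstract outlines, a penalized linear quantile regression would then be fitted on the retained columns `X[:, keep]` at each quantile level of interest; that second stage is not shown here.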
