Variable selection in high-dimensional partially linear additive models for composite quantile regression

A new estimation procedure based on composite quantile regression is proposed for semiparametric additive partial linear models, in which the nonparametric components are approximated by polynomial splines. The proposed method simultaneously estimates the parametric regression coefficients and the nonparametric components without requiring any specification of the error distribution. It is shown empirically to be much more efficient than the popular least-squares-based estimation method for non-normal random errors, especially Cauchy errors, and almost as efficient for normal random errors. To achieve sparsity in high-dimensional sparse additive partial linear models, in which the number of linear covariates is much larger than the sample size but the number of significant covariates is small relative to the sample size, a variable selection procedure based on the adaptive Lasso is proposed to conduct estimation and variable selection simultaneously. The procedure is shown to possess the oracle property, and is far superior to the adaptive-Lasso-penalized least-squares-based method regardless of the random error distribution. In particular, two kinds of weights in the penalty are considered: those based on the composite quantile regression estimates and those based on the Lasso-penalized composite quantile regression estimates. Both types of weights perform very well, with the latter performing especially well in selecting the significant variables precisely. The simulation results are consistent with the theoretical properties. A real data example is used to illustrate the application of the proposed methods.
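As a rough illustration of the kind of objective described above, the following Python sketch evaluates a composite quantile loss with an adaptive-Lasso penalty on the linear coefficients. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the argument layout (a linear design matrix `X`, a precomputed spline basis matrix `B`, one intercept per quantile level), and the small constant `eps` guarding against division by zero are all hypothetical choices made for this example. It only evaluates the objective and does not minimize it.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile check function: rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * np.where(u < 0, tau - 1.0, tau)

def penalized_cqr_objective(y, X, B, beta, gamma, intercepts, taus,
                            lam=0.0, beta_init=None, eps=1e-8):
    """Composite quantile loss plus an adaptive-Lasso penalty (illustrative).

    y          : (n,) responses
    X          : (n, p) linear covariates
    B          : (n, m) spline basis evaluated at the nonparametric covariates
    beta       : (p,) linear coefficients, shared across quantile levels
    gamma      : (m,) spline coefficients, shared across quantile levels
    intercepts : one intercept per quantile level
    taus       : quantile levels, each in (0, 1)
    lam        : penalty level; 0 gives the unpenalized loss
    beta_init  : (p,) initial estimate used to build the adaptive weights
    eps        : hypothetical safeguard so zero initial estimates do not
                 produce a division by zero
    """
    y = np.asarray(y, dtype=float)
    beta = np.asarray(beta, dtype=float)
    # Fitted value without intercept; the slopes do not depend on tau.
    fit = np.asarray(X, dtype=float) @ beta + np.asarray(B, dtype=float) @ np.asarray(gamma, dtype=float)
    # Sum the check loss over all quantile levels and observations.
    loss = sum(check_loss(y - b_k - fit, tau).sum()
               for b_k, tau in zip(intercepts, taus))
    if lam == 0.0 or beta_init is None:
        return float(loss)
    # Adaptive weights: coefficients with small initial estimates are penalized more.
    weights = 1.0 / (np.abs(np.asarray(beta_init, dtype=float)) + eps)
    return float(loss + lam * np.sum(weights * np.abs(beta)))
```

In this sketch, `beta_init` could come from either of the two weight sources mentioned in the abstract (an unpenalized composite quantile fit or a Lasso-penalized one); how those initial estimates are computed is outside the scope of the example.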
