Variable Selection via Nonconcave Penalized Likelihood and Its Oracle Properties

Variable selection is fundamental to high-dimensional statistical modeling, including nonparametric regression. Many approaches in use are stepwise selection procedures, which can be computationally expensive and which ignore stochastic errors in the variable selection process. In this article, penalized likelihood approaches are proposed to handle these kinds of problems. The proposed methods select variables and estimate coefficients simultaneously, and hence enable us to construct confidence intervals for the estimated parameters. The proposed approaches are distinguished from others in that the penalty functions are symmetric, nonconcave on (0, ∞), and have singularities at the origin to produce sparse solutions. Furthermore, the penalty functions should be bounded by a constant to reduce bias and should satisfy certain conditions to yield continuous solutions. A new algorithm is proposed for optimizing penalized likelihood functions. The proposed ideas are widely applicable: they are readily applied to a variety of parametric models, such as generalized linear models and robust regression models, and they can be applied just as easily to nonparametric modeling using wavelets and splines. Rates of convergence of the proposed penalized likelihood estimators are established. Moreover, with proper choice of regularization parameters, we show that the proposed estimators perform as well as the oracle procedure in variable selection; namely, they work as well as if the correct submodel were known in advance. Our simulations show that the newly proposed methods compare favorably with other variable selection techniques, and the standard error formulas are shown to be accurate enough for practical applications.
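For concreteness, a penalty that satisfies all three requirements above, and the one developed in this article, is the smoothly clipped absolute deviation (SCAD) penalty, defined through its derivative

pλ′(θ) = λ{ I(θ ≤ λ) + (aλ − θ)₊ / ((a − 1)λ) · I(θ > λ) },  θ > 0,  a > 2,

where (x)₊ = max(x, 0) and I(·) is the indicator function, so the penalty behaves like the lasso (L1) penalty near the origin and levels off to a constant beyond aλ. For the univariate penalized least-squares problem min over θ of ½(z − θ)² + pλ(|θ|), this penalty admits a closed-form thresholding rule. The following is a minimal sketch of that rule, assuming the orthonormal-design least-squares setting; the function name scad_threshold and the NumPy vectorization are ours, and a = 3.7 is the tuning constant suggested in the article.

```python
import numpy as np

def scad_threshold(z, lam, a=3.7):
    """SCAD thresholding: argmin_theta 0.5*(z - theta)**2 + p_lam(|theta|).

    z   -- unpenalized estimate(s), e.g., OLS coefficients under an
           orthonormal design
    lam -- regularization parameter lambda > 0
    a   -- second tuning constant, a > 2 (3.7 is the suggested default)
    """
    z = np.asarray(z, dtype=float)
    absz = np.abs(z)
    # Region |z| <= 2*lam: soft thresholding (lasso-like), producing
    # exact zeros and hence sparse solutions.
    soft = np.sign(z) * np.maximum(absz - lam, 0.0)
    # Region 2*lam < |z| <= a*lam: linear interpolation between shrinkage
    # and no shrinkage; this middle piece makes the rule continuous in z.
    mid = ((a - 1.0) * z - np.sign(z) * a * lam) / (a - 2.0)
    # Region |z| > a*lam: no shrinkage, so large coefficients are left
    # unbiased.
    return np.where(absz <= 2.0 * lam, soft,
                    np.where(absz <= a * lam, mid, z))
```

For example, scad_threshold(np.array([0.5, 3.0, 4.0]), lam=1.0) returns approximately [0.0, 2.59, 4.0]: the small coefficient is set exactly to zero, the intermediate one is partially shrunk, and the large one is untouched, illustrating the combination of sparsity, continuity, and low bias described above.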
