Nonconcave penalized likelihood with a diverging number of parameters

A class of variable selection procedures for parametric models via nonconcave penalized likelihood was proposed by Fan and Li to simultaneously estimate parameters and select important variables. They demonstrated that this class of procedures has an oracle property when the number of parameters is finite. However, in most model selection problems the number of parameters should be large and grow with the sample size. In this paper some asymptotic properties of the nonconcave penalized likelihood are established for situations in which the number of parameters tends to ∞ as the sample size increases. Under regularity conditions we establish an oracle property and the asymptotic normality of the penalized likelihood estimators. Furthermore, the consistency of the sandwich formula of the covariance matrix is demonstrated. Nonconcave penalized likelihood ratio statistics are discussed, and their asymptotic distributions under the null hypothesis are obtained by imposing some mild conditions on the penalty functions. The asymptotic results are augmented by a simulation study, and the newly developed methodology is illustrated by an analysis of a court case on sex discrimination in salary.

[1]  J. Neyman,et al.  Consistent Estimates Based on Partially Consistent Observations , 1948 .

[2]  E. L. Lehmann,et al.  Theory of point estimation , 1950 .

[3]  H. Akaike,et al.  Information Theory and an Extension of the Maximum Likelihood Principle , 1973 .

[4]  C. L. Mallows Some comments on C_p , 1973 .

[5]  P. J. Huber Robust Regression: Asymptotics, Conjectures and Monte Carlo , 1973 .

[6]  G. Schwarz Estimating the Dimension of a Model , 1978 .

[7]  C. R. Deboor,et al.  A practical guide to splines , 1978 .

[8]  Carl de Boor,et al.  A Practical Guide to Splines , 1978, Applied Mathematical Sciences.

[9]  P. McCullagh,et al.  Generalized Linear Models , 1984 .

[10]  Andrew Blake,et al.  Visual Reconstruction , 1987 .

[11]  S. Portnoy Asymptotic Behavior of Likelihood Methods for Exponential Families when the Number of Parameters Tends to Infinity , 1988 .

[12]  Stuart Geman,et al.  Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images , 1988 .

[13]  Andrew Blake,et al.  Comparison of the Efficiency of Deterministic and Stochastic Algorithms for Visual Reconstruction , 1989, IEEE Trans. Pattern Anal. Mach. Intell..

[14]  G. Wahba Spline models for observational data , 1990 .

[15]  D. Cox,et al.  Asymptotic Analysis of Penalized Likelihood and Related Estimators , 1990 .

[16]  P. McCullagh,et al.  Generalized Linear Models, 2nd Edn. , 1990 .

[17]  Susan A. Murphy,et al.  Testing for a Time Dependent Coefficient in Cox's Regression Model , 1993 .

[18]  B. Silverman,et al.  Nonparametric Regression and Generalized Linear Models: A roughness penalty approach , 1993 .

[19]  I. Johnstone,et al.  Ideal spatial adaptation by wavelet shrinkage , 1994 .

[21]  C. Mallows More comments on C_p , 1995 .

[22]  Peter Green,et al.  Markov chain Monte Carlo in Practice , 1996 .

[23]  R. Tibshirani Regression Shrinkage and Selection via the Lasso , 1996 .

[24]  L. Breiman Heuristics of instability and stabilization in model selection , 1996 .

[25]  Sylvia Richardson,et al.  Markov Chain Monte Carlo in Practice , 1997 .

[26]  R. Tibshirani The lasso method for variable selection in the Cox model. , 1997, Statistics in medicine.

[27]  Ali Mohammad-Djafari,et al.  Inversion of large-support ill-posed linear operators using a piecewise Gaussian MRF , 1998, IEEE Trans. Image Process..

[28]  Wenjiang J. Fu Penalized Regressions: The Bridge versus the Lasso , 1998 .

[29]  S. Christian Albright,et al.  Data Analysis and Decision Making with Microsoft Excel , 1999 .

[30]  Yuehua Wu,et al.  Model selection with data-oriented penalty , 1999 .

[31]  Calyampudi R. Rao,et al.  Model Selection with Data-Oriented Penalty , 1999 .

[32]  Gregory Piatetsky-Shapiro,et al.  High-Dimensional Data Analysis: The Curses and Blessings of Dimensionality , 2000 .

[33]  Wenjiang J. Fu,et al.  Asymptotics for lasso-type estimators , 2000 .

[34]  Colin L. Mallows,et al.  Some Comments on Cp , 2000, Technometrics.

[35]  Jianqing Fan,et al.  Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties , 2001 .

[36]  Jianqing Fan,et al.  Regularization of Wavelet Approximations , 2001 .

[37]  R. Carroll,et al.  A Note on the Efficiency of Sandwich Covariance Matrix Estimation , 2001 .

[38]  Xiaotong Shen,et al.  Adaptive Model Selection , 2002 .

[39]  Eric R. Ziegel,et al.  Generalized Linear Models , 2002, Technometrics.

[40]  Jianqing Fan,et al.  Variable Selection for Cox's proportional Hazards Model and Frailty Model , 2002 .

[41]  S. Christian Albright,et al.  Data Analysis and Decision Making , 2004 .