Smoothly Clipped Absolute Deviation on High Dimensions

The smoothly clipped absolute deviation (SCAD) estimator, proposed by Fan and Li, has many desirable properties, including continuity, sparsity, and unbiasedness. The SCAD estimator also has the (asymptotic) oracle property when the dimension of the covariates is fixed or diverges more slowly than the sample size. In this article we study the SCAD estimator in high-dimensional settings where the dimension of the covariates can be much larger than the sample size. First, we develop an efficient optimization algorithm that is fast and always converges to a local minimum. Second, we prove that the SCAD estimator still has the oracle property in high-dimensional problems. We perform numerical studies to compare the SCAD estimator with the LASSO and SIS–SCAD estimators in terms of prediction accuracy and variable selection performance when the true model is sparse. Through simulations, we show that the variance estimator of Fan and Li still works well in some limited high-dimensional cases, where the true nonzero coefficients are not too small and the sample size is moderately large. We apply the proposed algorithm to analyze a high-dimensional microarray data set.
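
For context (this formula is not stated in the abstract itself), the SCAD penalty introduced by Fan and Li (2001) is defined through its derivative: for $\theta > 0$, a regularization parameter $\lambda > 0$, and a shape constant $a > 2$ (Fan and Li recommend $a = 3.7$),

\[
p_\lambda'(\theta) = \lambda \left\{ I(\theta \le \lambda) + \frac{(a\lambda - \theta)_+}{(a-1)\lambda}\, I(\theta > \lambda) \right\}.
\]

The penalty therefore grows like the LASSO ($\ell_1$) penalty near zero, tapers off quadratically between $\lambda$ and $a\lambda$, and is constant beyond $a\lambda$. This shape is what yields sparsity for small coefficients and near-unbiasedness for large ones, at the cost of a nonconvex objective, which is why an optimization algorithm with guaranteed convergence to a local minimum matters in the high-dimensional setting studied here.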

[1] J. Friedman et al. A Statistical View of Some Chemometrics Regression Tools, 1993.

[2] G. Perdew et al. Regulation of Gene Expression, 2008, Goodman's Medical Cell Biology.

[3] Dimitri P. Bertsekas et al. Nonlinear Programming, 1997.

[4] R. Tibshirani. Regression Shrinkage and Selection via the Lasso, 1996.

[5] L. Breiman. Heuristics of instability and stabilization in model selection, 1996.

[6] Le Thi Hoai An et al. Solving a Class of Linearly Constrained Indefinite Quadratic Problems by D.C. Algorithms, 1997, J. Glob. Optim.

[7] Wenjiang J. Fu et al. Asymptotics for lasso-type estimators, 2000.

[8] Jianqing Fan et al. Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties, 2001.

[9] Jianqing Fan et al. Regularization of Wavelet Approximations, 2001.

[10] W. Wong et al. On ψ-Learning, 2003.

[11] Alan L. Yuille et al. The Concave-Convex Procedure, 2003, Neural Computation.

[12] Jianqing Fan et al. Nonconcave penalized likelihood with a diverging number of parameters, 2004, arXiv:math/0406466.

[13] Y. Ritov et al. Persistence in high-dimensional linear predictor selection and the virtue of overparametrization, 2004.

[14] Bogdan E. Popescu et al. Gradient Directed Regularization, 2004.

[15] H. Zou et al. Regularization and variable selection via the elastic net, 2005.

[16] Jason Weston et al. Large Scale Transductive SVMs, 2006, J. Mach. Learn. Res.

[17] Jianqing Fan et al. Sure independence screening for ultrahigh dimensional feature space, 2006, arXiv:math/0612857.

[18] N. Meinshausen et al. High-dimensional graphs and variable selection with the Lasso, 2006, arXiv:math/0608017.

[19] H. Zou. The Adaptive Lasso and Its Oracle Properties, 2006.

[20] V. Sheffield et al. Regulation of gene expression in the mammalian eye and its relevance to eye disease, 2006, Proceedings of the National Academy of Sciences.

[21] E. Greenshtein. Best subset selection, persistence in high-dimensional statistical learning and optimization under l1 constraint, 2006, arXiv:math/0702684.

[22] Peng Zhao et al. On Model Selection Consistency of Lasso, 2006, J. Mach. Learn. Res.

[23] S. Rosset et al. Piecewise linear regularized solution paths, 2007, arXiv:0708.2197.

[24] Xiaotong Shen et al. On L1-Norm Multiclass Support Vector Machines, 2007.

[25] S. van de Geer. High-dimensional generalized linear models and the Lasso, 2008, arXiv:0804.0703.

[26] Cun-Hui Zhang et al. Adaptive Lasso for sparse high-dimensional regression models, 2008.