A simple forward selection procedure based on false discovery rate control

We propose a new false discovery rate (FDR) controlling procedure as a penalized model selection method and compare its performance with that of other penalized methods over a wide range of realistic settings: nonorthogonal design matrices, moderate and large pools of explanatory variables, and both sparse and nonsparse models, in the sense that they may include a small or a large fraction of the potential variables (or even all of them). The comparison is made through a comprehensive simulation study, using a quantitative framework for performance comparison in the form of empirical minimaxity relative to a "random oracle": the oracle model selection performance on a data-dependent, forward-selected family of potential models. We show that FDR-based procedures perform well and that the newly proposed method, in particular, emerges as having empirical minimax performance. Interestingly, an FDR level of 0.05 is a global best.
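
The abstract does not spell out the stopping rule, so the following is only a minimal sketch of what an FDR-style forward selection could look like, not the authors' exact procedure. It assumes a known noise standard deviation `sigma`, a nominal two-sided z-test for each entering variable, and a Benjamini-Hochberg-type linear threshold `q * i / m` at step `i` out of `m` candidates. The function name `fdr_forward_select` and all of its parameters are hypothetical.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def fdr_forward_select(columns, y, q=0.05, sigma=1.0):
    """Forward selection that stops at the first step i whose entry
    p-value exceeds the ASSUMED BH-type threshold q * i / m.

    columns: list of m predictor columns (each a list of n floats)
    y:       response (list of n floats)
    sigma:   noise standard deviation, assumed known in this sketch
    """
    m = len(columns)
    resid = list(y)
    # Working copies of the candidates, kept orthogonal to selected columns.
    work = {j: list(c) for j, c in enumerate(columns)}
    selected = []
    for i in range(1, m + 1):
        # Pick the candidate with the largest |z|, i.e. the largest RSS drop.
        best_j, best_z = None, 0.0
        for j, x in work.items():
            norm2 = dot(x, x)
            if norm2 < 1e-12:  # numerically collinear with selected columns
                continue
            z = dot(x, resid) / (math.sqrt(norm2) * sigma)
            if best_j is None or abs(z) > abs(best_z):
                best_j, best_z = j, z
        if best_j is None:
            break
        # Nominal two-sided normal p-value of the entering variable.
        p_value = math.erfc(abs(best_z) / math.sqrt(2.0))
        if p_value > q * i / m:  # assumed FDR-style stopping rule
            break
        x = work.pop(best_j)
        norm2 = dot(x, x)
        coef = dot(x, resid) / norm2
        resid = [r - coef * a for r, a in zip(resid, x)]
        # Gram-Schmidt step: remove the new column from remaining candidates.
        for k in list(work):
            ck = dot(work[k], x) / norm2
            work[k] = [a - ck * b for a, b in zip(work[k], x)]
        selected.append(best_j)
    return selected


if __name__ == "__main__":
    # Simulated data: 10 candidates, only columns 0 and 3 carry signal.
    rng = random.Random(0)
    n, m = 200, 10
    cols = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(m)]
    y = [3 * cols[0][t] - 2 * cols[3][t] + rng.gauss(0, 1) for t in range(n)]
    print(fdr_forward_select(cols, y, q=0.05))
```

Because the threshold grows with the step index, a rule of this kind penalizes early entries more heavily than later ones, which is one way such a procedure could adapt to both sparse and nonsparse models. In practice `sigma` would be estimated and the z-test replaced by a t- or F-test.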
