Conditional sure independence screening by conditional marginal empirical likelihood

In many applications, researchers know from previous investigations and experience that a certain set of predictors is related to the response. Based on this conditional information, we propose a conditional feature screening procedure that ranks conditional marginal empirical likelihood ratios. Because it uses centered variables, the proposed screening approach works well when there are hidden important variables, unimportant variables that are highly marginally correlated with the response, or both. Moreover, by inheriting the advantages of the empirical likelihood approach, the new method is effective under less restrictive distributional assumptions, and it is computationally simple because it only needs to evaluate the conditional marginal empirical likelihood ratio at one point, without parameter estimation or iterative algorithms. The theoretical results show that the proposed procedure has the sure screening property. The merits of the procedure are illustrated by extensive numerical examples.
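To make the ranking idea concrete, the following is a minimal sketch rather than the authors' implementation. It rests on several assumptions that the abstract does not state: the conditional means given the known predictors are approximated by linear projection, the moment function for candidate predictor j is the product of the two residuals, and the scalar Lagrange multiplier of the empirical likelihood ratio at zero is located by bisection. The names `el_ratio_at_zero` and `conditional_el_screen` are hypothetical.

```python
import numpy as np


def el_ratio_at_zero(g, n_iter=100):
    """Empirical likelihood ratio statistic for H0: E[g] = 0, scalar g."""
    g = np.asarray(g, dtype=float)
    # If zero lies outside the range of g, no valid weights exist.
    if g.min() >= 0 or g.max() <= 0:
        return np.inf
    # The multiplier lam must keep 1 + lam * g_i > 0 for every i.
    lo, hi = -1.0 / g.max(), -1.0 / g.min()
    eps = 1e-10 * (hi - lo)
    lo, hi = lo + eps, hi - eps
    # sum(g / (1 + lam * g)) is strictly decreasing in lam, so bisect.
    # (Bisection is a choice made for this sketch only.)
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if np.sum(g / (1.0 + mid * g)) > 0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * np.sum(np.log1p(lam * g))


def conditional_el_screen(X, y, cond_idx, d):
    """Rank predictors outside cond_idx by the EL ratio at zero; keep top d."""
    n, p = X.shape
    # Assumption: centering is done by linear projection on the known set.
    Z = np.column_stack([np.ones(n), X[:, cond_idx]])

    def resid(v):
        coef, *_ = np.linalg.lstsq(Z, v, rcond=None)
        return v - Z @ coef

    ry = resid(y)
    known = set(cond_idx)
    candidates = [j for j in range(p) if j not in known]
    # Assumed moment function: product of the two centered variables.
    stats = {j: el_ratio_at_zero(resid(X[:, j]) * ry) for j in candidates}
    # A larger ratio is stronger evidence against a zero conditional moment.
    ranked = sorted(candidates, key=lambda j: stats[j], reverse=True)
    return ranked[:d], stats
```

A call such as `conditional_el_screen(X, y, cond_idx=[0], d=10)` would return the ten highest-ranked candidate indices together with all statistics. The ratio is evaluated only at zero for each predictor, so no regression coefficient for the candidate predictor is ever estimated.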
