Nonparametric independence screening via favored smoothing bandwidth

Abstract: We propose a flexible nonparametric regression method for ultrahigh-dimensional data. First, we introduce a fast screening method based on the favored smoothing bandwidth of the marginal local constant regression. We then develop an iterative procedure that recovers both the important covariates and the regression function. We prove that screening based on the favored smoothing bandwidth is model selection consistent. Simulation studies and a real data analysis show that the new procedure performs competitively.
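The screening step can be illustrated with a minimal sketch. The abstract does not define the favored smoothing bandwidth, so the code below makes two assumptions: the favored bandwidth of a covariate is the leave-one-out cross-validated bandwidth of its marginal local constant (Nadaraya-Watson) fit with a Gaussian kernel, and covariates with smaller favored bandwidths are ranked as more important. The intuition is that a covariate unrelated to the response is best fitted by a nearly flat curve, which corresponds to a very large bandwidth. The function names, the bandwidth grid, and the synthetic data are hypothetical and serve only as illustration; the iterative recovery procedure is not shown.

```python
import numpy as np

def loo_cv_error(x, y, h):
    """Leave-one-out CV error of a Nadaraya-Watson fit (Gaussian kernel, bandwidth h)."""
    d = (x[:, None] - x[None, :]) / h
    w = np.exp(-0.5 * d**2)
    np.fill_diagonal(w, 0.0)  # exclude each point from its own prediction
    denom = w.sum(axis=1)
    # If every weight underflows to zero, fall back to the overall mean of y.
    pred = np.where(denom > 1e-300, (w @ y) / np.maximum(denom, 1e-300), y.mean())
    return np.mean((y - pred) ** 2)

def favored_bandwidth(x, y, grid):
    """Assumed definition: the grid bandwidth that minimizes the leave-one-out CV error."""
    x = (x - x.mean()) / x.std()  # standardize so one grid suits every covariate
    errors = [loo_cv_error(x, y, h) for h in grid]
    return grid[int(np.argmin(errors))]

def screen(X, y, grid, keep):
    """Rank covariates by ascending favored bandwidth and keep the first `keep`."""
    bw = np.array([favored_bandwidth(X[:, j], y, grid) for j in range(X.shape[1])])
    order = np.argsort(bw, kind="stable")  # ties keep their original column order
    return order[:keep], bw

# Synthetic illustration: only column 0 affects the response.
rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
y = 3.0 * np.sin(2.0 * X[:, 0]) + 0.5 * rng.standard_normal(n)
grid = np.geomspace(0.05, 20.0, 15)
kept, bw = screen(X, y, grid, keep=3)
```

Each covariate is handled separately, so the cost grows linearly in the number of covariates, which is what makes a marginal approach of this kind practical in ultrahigh dimensions. The grid search is a convenience of this sketch and may differ from how the bandwidth is obtained in the paper.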
