Model averaging assisted sufficient dimension reduction

Abstract: Sufficient dimension reduction, which replaces the original predictors with a few of their linear combinations without loss of regression information, is a critical tool in modern statistics and has gained considerable research momentum since the two pioneering methods, sliced inverse regression and principal Hessian directions, were introduced. Classical sufficient dimension reduction methods do not handle the sparse case well, since the estimated linear reductions involve all of the original predictors; sparse sufficient dimension reduction methods, in turn, rely on a sparsity assumption that may not hold in practice. Motivated by the least squares formulations of classical sliced inverse regression and principal Hessian directions, several model averaging assisted sufficient dimension reduction methods are proposed. Because model averaging adaptively assigns weights to different candidate models, the proposed methods are applicable to both dense and sparse cases, even in the presence of weak signals. Building on these methods, estimation of the structural dimension is further studied. Theoretical justifications are given, and empirical results show that the proposed methods compare favorably with classical sufficient dimension reduction methods and popular sparse sufficient dimension reduction methods.
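As a point of reference (not the paper's proposed estimator), a minimal NumPy sketch of classical sliced inverse regression, the first of the two pioneering methods the abstract builds on, might look as follows; the function name, slicing scheme, and default parameters are illustrative assumptions, and the sketch presumes n > p so the sample covariance is invertible.

```python
import numpy as np

def sliced_inverse_regression(X, y, n_slices=10, n_directions=2):
    """Estimate central-subspace directions via classical SIR (Li, 1991)."""
    n, p = X.shape
    # Standardize the predictors: Z = (X - mean) Sigma^{-1/2}.
    X_centered = X - X.mean(axis=0)
    sigma = np.cov(X, rowvar=False)
    # Inverse square root of the covariance via its eigendecomposition
    # (assumes sigma is positive definite, i.e. n > p).
    evals, evecs = np.linalg.eigh(sigma)
    sigma_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = X_centered @ sigma_inv_sqrt
    # Slice the observations by the order statistics of y.
    order = np.argsort(y)
    slices = np.array_split(order, n_slices)
    # Weighted covariance of the within-slice means of Z.
    M = np.zeros((p, p))
    for idx in slices:
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    # Top eigenvectors of M span the directions on the standardized
    # scale; map them back to the original predictor scale.
    _, vecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    directions = sigma_inv_sqrt @ vecs[:, -n_directions:]
    return directions
```

The least squares formulation mentioned in the abstract recasts such eigen-directions as solutions of ordinary regression problems, which is what allows candidate models built from different predictor subsets to be combined through adaptively chosen model averaging weights rather than through a hard sparsity assumption.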
