Comparison of sliced inverse regression approaches for underdetermined cases

Among methods for analyzing high-dimensional data, sliced inverse regression (SIR) is of particular interest when the dependent variable is related non-linearly to a few indices of the covariate. When the dimension of the covariate exceeds the number of observations, classical versions of SIR cannot be applied. Several extensions, such as regularized SIR (RSIR) and sparse ridge SIR (SR-SIR), have been proposed to tackle this issue, both to estimate the parameters of the underlying model and to select variables of interest. In this paper, we introduce two new estimation methods, based respectively on the QZ algorithm and on the Moore-Penrose pseudo-inverse. We also describe a new procedure for selecting the most relevant components of the covariate, which relies on a proximity criterion between submodels and the initial model. These approaches are compared with RSIR and SR-SIR in a simulation study. Finally, we apply SIR-QZ and the associated selection procedure to a genetic dataset in order to find markers linked to the expression of a gene. These markers are called expression quantitative trait loci (eQTL).
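
To illustrate why the underdetermined case needs special treatment, here is a minimal sketch of a pseudo-inverse variant of SIR. It is a generic illustration and not necessarily the estimator defined in the paper: the function name `sir_pinv`, the slicing scheme (equal-size slices of the sorted response) and the default parameters are all assumptions. Classical SIR takes the leading eigenvectors of the inverse covariance of the covariate times the between-slice covariance of the slice means. When there are fewer observations than covariate dimensions, the empirical covariance is singular, so the sketch replaces the inverse with the Moore-Penrose pseudo-inverse.

```python
import numpy as np

def sir_pinv(X, y, n_slices=5, n_dirs=1):
    """Hypothetical helper: SIR directions using a pseudo-inverse of the covariance."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)                # center the covariate
    sigma = Xc.T @ Xc / n                  # empirical covariance (singular if n < p)
    order = np.argsort(y)                  # slice on the sorted response
    gamma = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        m = Xc[idx].mean(axis=0)           # mean of centered X within the slice
        gamma += (len(idx) / n) * np.outer(m, m)
    # Eigen-decomposition of pinv(sigma) @ gamma; keep the leading directions.
    evals, evecs = np.linalg.eig(np.linalg.pinv(sigma) @ gamma)
    top = np.argsort(-evals.real)[:n_dirs]
    return evecs[:, top].real

# Usage with more covariate dimensions (50) than observations (30).
rng = np.random.default_rng(0)
X = rng.standard_normal((30, 50))
y = X[:, 0] + 0.1 * rng.standard_normal(30)
b = sir_pinv(X, y)
print(b.shape)
```

The returned columns are estimated directions only up to sign and scale, as is usual for SIR-type estimators.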
