Kernel dimension reduction in regression

We present a new methodology for sufficient dimension reduction (SDR). Our methodology derives directly from the formulation of SDR in terms of the conditional independence of the covariate X from the response Y, given the projection of X on the central subspace [cf. J. Amer. Statist. Assoc. 86 (1991) 316–342 and Regression Graphics (1998) Wiley]. We show that this conditional independence assertion can be characterized in terms of conditional covariance operators on reproducing kernel Hilbert spaces, and we show how this characterization leads to an M-estimator for the central subspace. The resulting estimator is shown to be consistent under weak conditions; in particular, we do not have to impose the linearity or ellipticity conditions that are generally invoked for SDR methods. We also present empirical results showing that the new methodology is competitive in practice.
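To make the idea of an M-estimator built from conditional covariance operators more concrete, the following is a minimal sketch of an empirical contrast function. The abstract does not state the objective, so everything here is an assumption: the trace form Tr[G_Y (G_Z + n·eps·I)^{-1}] with centered Gram matrices G_Y and G_Z (Z = XB) is one commonly cited form of the kernel dimension reduction contrast, and the Gaussian kernel, the bandwidths `sigma_x` and `sigma_y`, the regularizer `eps`, and all function names are illustrative choices rather than the authors' implementation.

```python
import numpy as np


def gaussian_gram(Z, sigma):
    """Gaussian (RBF) Gram matrix for the rows of Z (shape n x d)."""
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))


def centered(K):
    """Doubly center a Gram matrix: H K H with H = I - (1/n) 1 1^T."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H


def kdr_objective(B, X, Y, sigma_x=1.0, sigma_y=1.0, eps=1e-3):
    """Assumed empirical contrast: Tr[G_Y (G_Z + n*eps*I)^{-1}], Z = X B.

    B is an (m x d) matrix whose columns span the candidate subspace.
    Smaller values indicate that the projection X B retains more of the
    information in X about Y, as measured in the chosen RKHSs.
    """
    n = X.shape[0]
    Gz = centered(gaussian_gram(X @ B, sigma_x))
    Gy = centered(gaussian_gram(Y.reshape(n, -1), sigma_y))
    # Tr[Gy A^{-1}] equals Tr[A^{-1} Gy]; solve avoids an explicit inverse.
    return np.trace(np.linalg.solve(Gz + n * eps * np.eye(n), Gy))


if __name__ == "__main__":
    # Synthetic data (hypothetical): Y depends on X only through x_1.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 4))
    Y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(100)

    B_true = np.eye(4)[:, [0]]   # direction that carries the signal
    B_other = np.eye(4)[:, [1]]  # direction unrelated to Y
    print("contrast at signal direction:   ", kdr_objective(B_true, X, Y))
    print("contrast at unrelated direction:", kdr_objective(B_other, X, Y))
```

In a full estimator, this contrast would be minimized over matrices B with orthonormal columns, for example by gradient descent on the Stiefel manifold. That optimization step is omitted here, and the fixed bandwidths and regularizer would normally need tuning.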
