Central Subspace Dimensionality Reduction Using Covariance Operators

We consider the task of dimensionality reduction informed by real-valued multivariate labels. The problem is often treated as Dimensionality Reduction for Regression (DRR), whose goal is to find a low-dimensional representation of the input data, the central subspace, that preserves the statistical correlation with the targets. A class of DRR methods exploits the notion of inverse regression (IR) to discover central subspaces. Whereas most existing IR techniques rely on explicit output-space slicing, we propose a novel method, Covariance Operator Inverse Regression (COIR), that generalizes IR to nonlinear input/output spaces without explicit target slicing. These properties make DRR applicable to problem domains with high-dimensional output data corrupted by potentially significant amounts of noise. Unlike recent kernel dimensionality reduction methods that rely on iterative nonconvex optimization, COIR yields a closed-form solution. We also establish the links between COIR, other DRR techniques, and popular supervised dimensionality reduction methods, including canonical correlation analysis and linear discriminant analysis. We then extend COIR to semi-supervised settings in which many of the input points lack labels. Finally, we demonstrate the benefits of COIR on several important regression problems in both fully supervised and semi-supervised settings.
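
For a concrete picture of the kernelized inverse-regression idea, the NumPy sketch below estimates COIR-style projection directions. It is a minimal illustration, not the paper's implementation: the particular regularized eigenproblem, the RBF kernels, and the hyperparameters (gamma_x, gamma_y, eps, d) are assumptions chosen for clarity. What it mirrors from the abstract is the key mechanism of replacing explicit output slicing with a kernel smoother on the labels and obtaining the directions from a single closed-form (generalized eigenvalue) problem.

    # Minimal NumPy sketch of kernel inverse regression in the spirit of COIR.
    # The regularized eigenproblem below is an illustrative assumption, not the
    # paper's exact estimator: output slicing is replaced by a kernel smoother
    # on Y, and the directions come from one closed-form eigendecomposition.
    import numpy as np

    def rbf_gram(A, B, gamma=1.0):
        """RBF (Gaussian) Gram matrix between the rows of A and the rows of B."""
        sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
        return np.exp(-gamma * sq)

    def center(K):
        """Double-center a Gram matrix (removes the feature-space mean)."""
        n = K.shape[0]
        H = np.eye(n) - np.ones((n, n)) / n
        return H @ K @ H

    def coir_directions(X, Y, d=2, gamma_x=1.0, gamma_y=1.0, eps=1e-3):
        """Return coefficients alpha (n x d); the embedding of X is Kx @ alpha."""
        n = X.shape[0]
        Kx = center(rbf_gram(X, X, gamma_x))   # input Gram matrix
        Ky = center(rbf_gram(Y, Y, gamma_y))   # output Gram matrix

        # Kernel smoother on the labels: plays the role that explicit output
        # slicing plays in classical sliced inverse regression.
        W = Ky @ np.linalg.solve(Ky + n * eps * np.eye(n), np.eye(n))

        # Regularized generalized eigenproblem (assumed form):
        #   Kx W Kx a = lambda (Kx Kx + n*eps*I) a
        A = Kx @ W @ Kx
        B = Kx @ Kx + n * eps * np.eye(n)
        vals, vecs = np.linalg.eig(np.linalg.solve(B, A))
        top = np.argsort(-vals.real)[:d]
        return vecs[:, top].real

    # Toy usage: 5-D inputs whose first coordinate drives a 2-D noisy label.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    Y = np.column_stack([np.sin(X[:, 0]), X[:, 0] ** 2]) + 0.1 * rng.normal(size=(200, 2))
    alpha = coir_directions(X, Y, d=2)
    Z = center(rbf_gram(X, X)) @ alpha      # low-dimensional representation of X
    print(Z.shape)                          # (200, 2)

Because the smoother W and the eigendecomposition are computed once from the Gram matrices, there is no iterative optimization; this is the sense in which a closed-form solution is available, under the assumptions stated above.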
