Multi-view Learning as a Nonparametric Nonlinear Inter-Battery Factor Analysis

Factor analysis aims to determine latent factors, or traits, that summarize a given data set. Inter-battery factor analysis extends this notion to multiple views of the data. In this paper we show how a nonlinear, nonparametric version of these models can be recovered through the Gaussian process latent variable model. This gives us a flexible formalism for multi-view learning in which the latent variables can be used both for exploratory purposes and for learning representations that enable efficient inference in ambiguous estimation tasks. Learning is performed in a Bayesian manner through a variational compression scheme that gives a rigorous lower bound on the log likelihood. Our Bayesian framework provides strong regularization during training, allowing the structure of the latent space to be determined efficiently and automatically. We demonstrate this by producing the first (to our knowledge) published results of learning from dozens of views, even when data is scarce. We further show experimental results on several different types of multi-view data sets and for different kinds of tasks, including exploratory data analysis, generation, ambiguity modelling through latent priors, and classification.
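
For orientation, the classical linear inter-battery factor analysis model that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration under assumptions, not the paper's method: the paper replaces the linear maps with Gaussian process mappings, and the names used here (sample_ibfa, W1, W2, V1, V2, noise_std) are hypothetical. Each view is generated from shared factors z plus view-specific private factors, so the cross-covariance between the two views depends only on the shared loadings.

```python
import numpy as np

def sample_ibfa(n, W1, W2, V1, V2, noise_std=0.1, seed=0):
    """Sample n paired observations from a linear two-view factor model.

    Assumed (illustrative) generative form:
        y1 = W1 z + V1 z1 + noise,   y2 = W2 z + V2 z2 + noise
    with z shared across views and z1, z2 private to each view.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, W1.shape[1]))    # shared latent factors
    z1 = rng.standard_normal((n, V1.shape[1]))   # private factors, view 1
    z2 = rng.standard_normal((n, V2.shape[1]))   # private factors, view 2
    y1 = z @ W1.T + z1 @ V1.T + noise_std * rng.standard_normal((n, W1.shape[0]))
    y2 = z @ W2.T + z2 @ V2.T + noise_std * rng.standard_normal((n, W2.shape[0]))
    return y1, y2

# Hypothetical loadings: 2 shared factors, 1 private factor per view.
rng = np.random.default_rng(1)
W1, W2 = rng.standard_normal((4, 2)), rng.standard_normal((3, 2))
V1, V2 = rng.standard_normal((4, 1)), rng.standard_normal((3, 1))

y1, y2 = sample_ibfa(200_000, W1, W2, V1, V2)

# Private factors and noise are independent across views, so the
# population cross-covariance Cov(y1, y2) equals W1 @ W2.T.
cross_cov = (y1 - y1.mean(0)).T @ (y2 - y2.mean(0)) / (len(y1) - 1)
max_err = np.abs(cross_cov - W1 @ W2.T).max()
print("largest deviation from W1 W2^T:", max_err)
```

The printed deviation should shrink as the sample size grows, which is the sense in which only the shared factors explain dependence between views; the private factors absorb variation specific to one view.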