Only Bayes should learn a manifold (on the estimation of differential geometric structure from data)

We investigate learning the differential geometric structure of a data manifold embedded in a high-dimensional Euclidean space. We first analyze kernel-based algorithms and show that, under the usual regularizations, non-probabilistic methods cannot recover the differential geometric structure; instead they find mostly linear manifolds or spaces equipped with teleports. To properly learn the differential geometric structure, non-probabilistic methods must apply regularizations that enforce large gradients, contrary to common wisdom. We repeat the analysis for probabilistic methods and find that, under reasonable priors, the geometric structure can be recovered. Fully exploiting the recovered structure, however, requires the development of stochastic extensions to classic Riemannian geometry, and we take early steps in that direction. Finally, we partly extend the analysis to modern models based on neural networks, highlighting geometric and probabilistic shortcomings of current deep generative models.
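For concreteness, the geometric object underlying these claims is the Riemannian metric that a smooth generative map pulls back to its latent space. The display below is a standard construction from this literature, stated as background in notation of our choosing (f, J, d, D are not taken from the paper, whose precise definitions may differ). For a smooth immersion f : Z ⊂ R^d → R^D with Jacobian J(z) ∈ R^{D×d}, the pullback metric and the induced curve length are

\[
  G(z) = J(z)^{\top} J(z), \qquad
  \operatorname{Length}(c) = \int_0^1 \sqrt{\dot{c}(t)^{\top} \, G\big(c(t)\big) \, \dot{c}(t)} \;\mathrm{d}t .
\]

When f is stochastic, e.g. a Gaussian process or a decoder with a variance network, the Jacobian J(z) becomes a random matrix, and taking expectations row by row gives

\[
  \mathbb{E}\big[G(z)\big]
    = \mathbb{E}\big[J(z)\big]^{\top} \mathbb{E}\big[J(z)\big]
    + \sum_{k=1}^{D} \operatorname{Cov}\big(J_{k\cdot}(z)\big),
\]

where J_{k·}(z) denotes the k-th row of the Jacobian. The covariance term is, roughly speaking, what motivates the stochastic extension of Riemannian geometry alluded to in the abstract: under the expected metric, geodesics are pulled away from regions where the map is uncertain, an effect a deterministic pullback metric cannot express.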
