Hierarchical Representations with Poincaré Variational Auto-Encoders

The Variational Auto-Encoder (VAE) is a popular method for simultaneously learning a generative model and embeddings for data living in a high-dimensional space. Many real-world datasets may be assumed to be hierarchically structured. Traditionally, the VAE uses a Euclidean latent space, but tree-like structures cannot be embedded efficiently in Euclidean spaces, whereas they can be in hyperbolic spaces, which have negative curvature. We therefore endow the VAE with a Poincaré ball model of hyperbolic geometry and derive the methods needed to work with two main Gaussian generalisations on that space. We empirically show that the resulting model generalises better to unseen data than its Euclidean counterpart and recovers hierarchical structures better, both qualitatively and quantitatively.
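To make the geometric claim concrete, the following minimal sketch computes the geodesic distance in the unit Poincaré ball. It assumes curvature -1 and uses the standard closed-form distance for that model; the function name `poincare_distance` is illustrative and is not taken from the paper's code.

```python
import math


def poincare_distance(x, y):
    """Geodesic distance between two points of the unit Poincare ball.

    Assumes curvature -1 and points given as sequences of floats with
    Euclidean norm strictly below 1 (hypothetical helper, for illustration).
    """
    sq_norm_x = sum(a * a for a in x)
    sq_norm_y = sum(b * b for b in y)
    if sq_norm_x >= 1.0 or sq_norm_y >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(x, y))
    # d(x, y) = arcosh(1 + 2 |x - y|^2 / ((1 - |x|^2) (1 - |y|^2)))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_x) * (1.0 - sq_norm_y))
    return math.acosh(arg)


# Two points near the boundary are far apart hyperbolically,
# even though their Euclidean distance is modest.
near_boundary = poincare_distance((0.9, 0.0), (0.0, 0.9))
euclidean = math.dist((0.9, 0.0), (0.0, 0.9))
```

Because the denominator shrinks as points approach the boundary, distances there grow quickly, which gives the space the room needed to place the many leaves of a tree far from one another.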
