APo-VAE: Text Generation in Hyperbolic Space

Natural language often exhibits an inherent hierarchical structure intertwined with complex syntax and semantics. However, most state-of-the-art deep generative models learn embeddings only in Euclidean vector space, without accounting for this structural property of language. In this paper, we investigate text generation in a hyperbolic latent space in order to learn continuous hierarchical representations. We present the Adversarial Poincaré Variational Autoencoder (APo-VAE), in which both the prior and the variational posterior over latent variables are defined on a Poincaré ball via wrapped normal distributions. By adopting the primal-dual formulation of the Kullback-Leibler (KL) divergence, we introduce an adversarial learning procedure that enables robust model training. Extensive experiments on language modeling, unaligned style transfer, and dialog-response generation demonstrate the effectiveness of APo-VAE over VAEs with Euclidean latent spaces, owing to its superior ability to capture latent language hierarchies in hyperbolic space.
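To make the wrapped-normal construction concrete, here is a minimal NumPy sketch of sampling from a wrapped normal on the Poincaré ball, following the standard tangent-space construction (sample in the tangent space at the origin, parallel-transport to the mean, apply the exponential map). It assumes curvature c = 1 and an isotropic covariance; the function names are illustrative and not taken from the paper's code.

```python
import numpy as np

def mobius_add(x, y, c=1.0):
    """Mobius addition x (+)_c y on the Poincare ball of curvature -c."""
    xy = np.dot(x, y)
    x2, y2 = np.dot(x, x), np.dot(y, y)
    num = (1 + 2 * c * xy + c * y2) * x + (1 - c * x2) * y
    den = 1 + 2 * c * xy + (c ** 2) * x2 * y2
    return num / den

def exp_map(mu, v, c=1.0):
    """Exponential map at mu: maps a tangent vector v into the ball."""
    lam_mu = 2.0 / (1.0 - c * np.dot(mu, mu))   # conformal factor at mu
    vn = np.linalg.norm(v) + 1e-15              # avoid division by zero
    scaled = np.tanh(np.sqrt(c) * lam_mu * vn / 2.0) * v / (np.sqrt(c) * vn)
    return mobius_add(mu, scaled, c)

def sample_wrapped_normal(mu, sigma, c=1.0, rng=None):
    """Draw z ~ N^W(mu, sigma^2 I) on the Poincare ball:
    (1) sample v ~ N(0, sigma^2 I) in the tangent space at the origin,
    (2) parallel-transport v to the tangent space at mu,
    (3) push it onto the manifold with the exponential map at mu."""
    rng = rng or np.random.default_rng()
    v = rng.normal(scale=sigma, size=mu.shape)
    lam_mu = 2.0 / (1.0 - c * np.dot(mu, mu))
    u = (2.0 / lam_mu) * v   # transport scale = lambda_0 / lambda_mu, lambda_0 = 2
    return exp_map(mu, u, c)

mu = np.array([0.3, -0.1])                      # a point inside the unit ball
z = sample_wrapped_normal(mu, sigma=0.2)
assert np.linalg.norm(z) < 1.0                  # samples always stay in the ball
```

Because the sample is a deterministic, differentiable function of mu, sigma, and the Gaussian noise v, this construction admits the usual reparameterization trick, which is what allows gradient-based training of a VAE with such a hyperbolic prior and posterior.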
