An Evaluation of Disentangled Representation Learning for Texts

Learning disentangled representations of texts, which encode information about different aspects of a text in separate representations, is an active area of NLP research aimed at controllable and interpretable text generation. These methods have, for the most part, been developed in the context of text style transfer, but their evaluation has been limited. In this work, we examine the motivation behind learning disentangled representations of content and style for texts, and the potential use cases compared to end-to-end methods. We then propose evaluation metrics that correspond to these use cases. We conduct a systematic investigation of previously proposed loss functions for such models, evaluating them on a synthetic, highly structured natural-language dataset that is well suited to disentangled representation learning, as well as on two other parallel style transfer datasets. Our results demonstrate that current models still require considerable supervision to achieve good performance.
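To make the modeling setup concrete, the following is a minimal, illustrative PyTorch sketch of one common design in this line of work: an autoencoder whose sentence encoding is split into separate style and content latents, with an adversarial classifier that penalizes style information leaking into the content latent. All module names, dimensions, and architectural choices here are our own assumptions for illustration, not the implementation of any model evaluated in this work.

# A minimal sketch (not any surveyed model's implementation) of an
# autoencoder that splits a sentence encoding into separate "style" and
# "content" latents, with an adversarial classifier that discourages
# style information from leaking into the content latent.
# All sizes below are illustrative assumptions.
import torch
import torch.nn as nn

class DisentangledTextAE(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256,
                 style_dim=16, content_dim=240, num_styles=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        # Separate projections produce the two latent factors.
        self.to_style = nn.Linear(hid_dim, style_dim)
        self.to_content = nn.Linear(hid_dim, content_dim)
        # Decoder reconstructs the sentence conditioned on [style; content].
        self.decoder = nn.GRU(emb_dim, style_dim + content_dim,
                              batch_first=True)
        self.out = nn.Linear(style_dim + content_dim, vocab_size)
        # Adversary tries to predict the style label from the content
        # latent; the encoder is trained to defeat it (via gradient
        # reversal or alternating updates in a full training loop).
        self.style_adversary = nn.Linear(content_dim, num_styles)

    def forward(self, tokens):
        emb = self.embed(tokens)            # (batch, seq, emb_dim)
        _, h = self.encoder(emb)            # h: (1, batch, hid_dim)
        h = h.squeeze(0)
        style = self.to_style(h)            # (batch, style_dim)
        content = self.to_content(h)        # (batch, content_dim)
        z = torch.cat([style, content], dim=-1)
        # Teacher-forced reconstruction from the joint latent.
        dec_out, _ = self.decoder(emb, z.unsqueeze(0).contiguous())
        logits = self.out(dec_out)          # (batch, seq, vocab_size)
        adv_logits = self.style_adversary(content)
        return logits, adv_logits, style, content

In a full training loop, the adversary would be optimized to recover the style label from the content latent while the encoder receives the opposing signal, so that the content latent alone becomes insufficient to predict style; style transfer is then performed by swapping the style latent at decoding time.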
