A Probabilistic Formulation of Unsupervised Text Style Transfer

We present a deep generative model for unsupervised text style transfer that unifies previously proposed non-generative techniques. Our probabilistic approach models non-parallel data from two domains as a partially observed parallel corpus: by hypothesizing a parallel latent sequence that generates each observed sequence, the model learns to transform sequences from one domain to another in a completely unsupervised fashion. In contrast to traditional generative sequence models (e.g., the HMM), our model makes few assumptions about the data it generates: it uses a recurrent language model as a prior and an encoder-decoder as a transduction distribution. While computing the marginal data likelihood is intractable in this model class, we show that amortized variational inference admits a practical surrogate. Further, by drawing connections between our variational objective and other recent unsupervised style transfer and machine translation techniques, we show how our probabilistic view can unify known non-generative objectives such as back-translation and adversarial losses. Finally, we demonstrate the effectiveness of our method on a wide range of unsupervised style transfer tasks, including sentiment transfer, formality transfer, word decipherment, author imitation, and related-language translation. Across all style transfer tasks, our approach yields substantial gains over state-of-the-art non-generative baselines, including the unsupervised machine translation techniques that our approach generalizes. Further, on a standard unsupervised machine translation task, our unified approach matches the current state of the art.
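To make the variational objective concrete, the following is a minimal sketch of the bound implied by the setup described above. The notation is our own illustrative choice, not taken verbatim from the paper: x denotes an observed sentence in one domain, y its hypothesized latent parallel sequence in the other domain, p_LM the language-model prior over that domain, p(x | y) the encoder-decoder transduction distribution, and q(y | x) the amortized inference network.

\log p(x) \;=\; \log \sum_{y} p_{\mathrm{LM}}(y)\, p(x \mid y)
\;\ge\; \mathbb{E}_{q(y \mid x)}\big[\log p(x \mid y)\big] \;-\; \mathrm{KL}\big(q(y \mid x) \,\big\|\, p_{\mathrm{LM}}(y)\big),

with a symmetric bound for observed sentences in the other domain. Under this reading, and consistent with the abstract's unification claim, the reconstruction term resembles a back-translation loss (transfer x to y with q, then reconstruct x), while the KL term pushes q's outputs toward fluent sequences under the language-model prior, which is where connections to adversarial and discriminator-style objectives can be drawn.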
