Paraphrase Generation with Latent Bag of Words

Paraphrase generation is a longstanding and important problem in natural language processing. Recent progress in deep generative models has shown promising results with discrete latent variables for text generation. Inspired by variational autoencoders with discrete latent structures, we propose a latent bag-of-words (BOW) model for paraphrase generation. We ground the semantics of a discrete latent variable in the target BOW and use this variable to build a fully differentiable content-planning and surface-realization pipeline. Specifically, we use the source words to predict their neighbors and model the target BOW with a mixture of softmaxes. We apply Gumbel top-k reparameterization to perform differentiable subset sampling from the predicted BOW distribution, then retrieve the sampled word embeddings and use them to augment the decoder and guide its generation search space. The latent BOW model not only enhances the decoder but also exhibits clear interpretability, which we demonstrate with respect to (1) the unsupervised learning of word neighbors and (2) the step-by-step generation procedure. Extensive experiments demonstrate the model's transparent and effective generation process.
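
To make the sampling mechanism concrete, below is a minimal sketch of Gumbel top-k subset sampling in PyTorch. It is illustrative only: the names (gumbel_topk_subset, bow_logits, embedding, tau) are assumptions rather than the authors' code, and the soft weighting over the sampled subset is one simple way to keep gradients flowing to the BOW predictor, not necessarily the exact relaxation used in the paper.

import torch
import torch.nn.functional as F

def gumbel_topk_subset(bow_logits, embedding, k, tau=1.0):
    """Sample k words without replacement from a BOW distribution (sketch).

    bow_logits : (batch, vocab) unnormalized scores over the target BOW.
    embedding  : (vocab, dim) word-embedding matrix.
    k          : number of words to sample.
    tau        : temperature of the soft relaxation.
    """
    # Perturb the logits with i.i.d. Gumbel(0, 1) noise.
    gumbel = -torch.log(-torch.log(torch.rand_like(bow_logits) + 1e-20) + 1e-20)
    perturbed = bow_logits + gumbel

    # Top-k of the perturbed logits equals a sample of k words without replacement.
    topk_vals, topk_idx = perturbed.topk(k, dim=-1)        # (batch, k)

    # Soft weights over the sampled subset so gradients reach bow_logits.
    soft_weights = F.softmax(topk_vals / tau, dim=-1)      # (batch, k)

    # Retrieve and weight the sampled word embeddings.
    sampled_emb = embedding[topk_idx]                      # (batch, k, dim)
    return soft_weights.unsqueeze(-1) * sampled_emb        # (batch, k, dim)

# Illustrative usage: the decoder would attend over (or pool) these embeddings.
logits = torch.randn(2, 10000, requires_grad=True)
emb = torch.randn(10000, 300)
subset = gumbel_topk_subset(logits, emb, k=5)
print(subset.shape)  # torch.Size([2, 5, 300])

In this sketch, the returned weighted embeddings would then be consumed by the decoder to constrain its generation search space, which is the role the latent BOW plays in the proposed pipeline.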
