Controllable Paraphrase Generation with a Syntactic Exemplar

Prior work on controllable text generation usually assumes that the controlled attribute can take one of a small set of values known a priori. In this work, we propose a novel task in which the syntax of a generated sentence is instead controlled by a sentential exemplar. To enable quantitative evaluation with standard metrics, we create a novel dataset with human annotations. We also develop a variational model with a neural module specifically designed to capture syntactic knowledge, together with several multitask training objectives that promote disentangled representation learning. Empirically, the proposed model improves over baselines and learns to capture desirable characteristics.
