Dictionary-Guided Editing Networks for Paraphrase Generation

An intuitive way for a human to write paraphrase sentences is to replace words or phrases in the original sentence with their corresponding synonyms and make necessary changes to ensure the new sentences are fluent and grammatically correct. We propose a novel approach to modeling the process with dictionary-guided editing networks which effectively conduct rewriting on the source sentence to generate paraphrase sentences. It jointly learns the selection of the appropriate word level and phrase level paraphrase pairs in the context of the original sentence from an off-the-shelf dictionary as well as the generation of fluent natural language sentences. Specifically, the system retrieves a set of word level and phrase level araphrased pairs derived from the Paraphrase Database (PPDB) for the original sentence, which is used to guide the decision of which the words might be deleted or inserted with the soft attention mechanism under the sequence-to-sequence framework. We conduct experiments on two benchmark datasets for paraphrase generation, namely the MSCOCO and Quora dataset. The evaluation results demonstrate that our dictionary-guided editing networks outperforms the baseline methods.

[1]  Nitin Madnani,et al.  Generating Phrasal and Sentential Paraphrases: A Survey of Data-Driven Methods , 2010, CL.

[2]  Oladimeji Farri,et al.  Neural Paraphrase Generation with Stacked Residual LSTM Networks , 2016, COLING.

[3]  Chris Callison-Burch,et al.  PPDB: The Paraphrase Database , 2013, NAACL.

[4]  Ming Zhou,et al.  Combining Multiple Resources to Improve SMT-based Paraphrasing Model , 2008, ACL.

[5]  Percy Liang,et al.  Generating Sentences by Editing Prototypes , 2017, TACL.

[6]  Emiel Krahmer,et al.  Paraphrase Generation as Monolingual Translation: Data and Evaluation , 2010, INLG.

[7]  Christopher D. Manning,et al.  Effective Approaches to Attention-based Neural Machine Translation , 2015, EMNLP.

[8]  Alexander M. Rush,et al.  OpenNMT: Open-Source Toolkit for Neural Machine Translation , 2017, ACL.

[9]  Yoshua Bengio,et al.  Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.

[10]  Wojciech Zaremba,et al.  Recurrent Neural Network Regularization , 2014, ArXiv.

[11]  Chris Callison-Burch,et al.  PPDB 2.0: Better paraphrase ranking, fine-grained entailment relations, word embeddings, and style classification , 2015, ACL.

[12]  Furu Wei,et al.  Retrieve, Rerank and Rewrite: Soft Template Based Neural Summarization , 2018, ACL.

[13]  Chris Quirk,et al.  Monolingual Machine Translation for Paraphrase Generation , 2004, EMNLP.

[14]  Percy Liang,et al.  Delete, Retrieve, Generate: a Simple Approach to Sentiment and Style Transfer , 2018, NAACL.

[15]  Yoshua Bengio,et al.  Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.

[16]  Nitin Madnani,et al.  Re-examining Machine Translation Metrics for Paraphrase Identification , 2012, NAACL.

[17]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[18]  Haifeng Wang,et al.  Leveraging Multiple MT Engines for Paraphrase Generation , 2010, COLING.

[19]  Alon Lavie,et al.  METEOR: An Automatic Metric for MT Evaluation with High Levels of Correlation with Human Judgments , 2007, WMT@ACL.

[20]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[21]  J. Fleiss Measuring nominal scale agreement among many raters. , 1971 .

[22]  Salim Roukos,et al.  Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.

[23]  Kathleen F. McCoy,et al.  Generation of Single-sentence Paraphrases from Predicate/Argument Structure using Lexico-grammatical Resources , 2003, IWP@ACL.

[24]  Steven Bird,et al.  NLTK: The Natural Language Toolkit , 2002, ACL.

[25]  Ankush Gupta,et al.  A Deep Generative Framework for Paraphrase Generation , 2017, AAAI.

[26]  Rada Mihalcea,et al.  UNT: SubFinder: Combining Knowledge Sources for Automatic Lexical Substitution , 2007, Fourth International Workshop on Semantic Evaluations (SemEval-2007).

[27]  Ting Liu,et al.  Application-driven Statistical Paraphrase Generation , 2009, ACL.