A Joint Phrasal and Dependency Model for Paraphrase Alignment

Monolingual alignment is frequently required for natural language tasks that involve similar or comparable sentences. We present a new model for monolingual alignment in which the score of an alignment decomposes over both the set of aligned phrases and the set of aligned dependency arcs. Optimal alignments under this scoring function are decoded using integer linear programming, while model parameters are learned using standard structured prediction approaches. We evaluate our joint aligner on the Edinburgh paraphrase corpus and show significant gains over a Meteor baseline and a state-of-the-art phrase-based aligner.
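
The decomposition described above can be written out as a small worked objective. This is only a sketch under assumed notation, none of which comes from the abstract itself: $x_p \in \{0,1\}$ marks whether candidate phrase pair $p$ is in the alignment, $y_a \in \{0,1\}$ marks whether candidate dependency arc pair $a$ is aligned, $\mathbf{w}$ is the learned parameter vector, and $\phi$, $\psi$ are hypothetical feature functions. The constraint shown is one plausible consistency condition, not necessarily the one used in the paper.

```latex
% Assumed notation (not from the source): P = candidate phrase pairs,
% A = candidate dependency arc pairs, P(t) = phrase pairs covering token t.
\begin{align*}
\operatorname{score}(x, y)
  &= \sum_{p \in P} x_p \, \mathbf{w}^{\top} \phi(p)
   + \sum_{a \in A} y_a \, \mathbf{w}^{\top} \psi(a) \\[4pt]
(\hat{x}, \hat{y})
  &= \operatorname*{arg\,max}_{x, y} \; \operatorname{score}(x, y) \\
\text{s.t.}\quad
  & \sum_{p \in P(t)} x_p \le 1 \quad \text{for every token } t, \\
  & x_p \in \{0,1\}, \quad y_a \in \{0,1\}.
\end{align*}
```

Because the score is linear in the indicator variables and the constraints are linear, a program of this shape can be handed to an off-the-shelf integer linear programming solver. A full formulation would also need constraints tying each $y_a$ to the phrase pairs that cover the endpoints of its arcs.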
