Using Sense-labeled Discourse Connectives for Statistical Machine Translation

This article shows how the automatic disambiguation of discourse connectives can improve Statistical Machine Translation (SMT) from English to French. Connectives are first disambiguated in terms of the discourse relation they signal between segments. Several classifiers trained on syntactic and semantic features reach state-of-the-art performance, with F1 scores of 0.6 to 0.8 over thirteen ambiguous English connectives. The labeled connectives are then integrated into SMT systems, either by modifying the systems' phrase tables or by training the systems on labeled corpora. The best modified SMT systems improve the translation of connectives without degrading BLEU scores. A threshold-based SMT system that uses only high-confidence labels improves BLEU scores by 0.2 to 0.4 points.
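The threshold-based labeling step can be illustrated with a minimal sketch. It assumes a classifier has already produced a sense and a confidence score for each connective; the `ConnectivePrediction` type, the `label_connectives` function, the `_` separator, the 0.8 threshold, and the sense names are all hypothetical choices made for illustration, not details taken from the article.

```python
from typing import List, NamedTuple


class ConnectivePrediction(NamedTuple):
    """Hypothetical classifier output for one connective occurrence."""
    token_index: int   # position of the connective in the token list
    sense: str         # predicted discourse relation (illustrative name)
    confidence: float  # classifier score for that sense, in [0, 1]


def label_connectives(
    tokens: List[str],
    predictions: List[ConnectivePrediction],
    threshold: float = 0.8,   # assumed value, not from the article
    separator: str = "_",     # assumed label format
) -> List[str]:
    """Append a sense label to each connective whose prediction is
    at least as confident as the threshold; leave the rest unlabeled."""
    labeled = list(tokens)
    for pred in predictions:
        if pred.confidence >= threshold:
            word = tokens[pred.token_index]
            labeled[pred.token_index] = f"{word}{separator}{pred.sense}"
    return labeled


tokens = "He has worked here since 2005 , while she left".split()
predictions = [
    ConnectivePrediction(token_index=4, sense="Temporal", confidence=0.93),
    ConnectivePrediction(token_index=7, sense="Contrast", confidence=0.55),
]
labeled = label_connectives(tokens, predictions, threshold=0.8)
print(" ".join(labeled))
```

In this sketch only "since" receives a label, because the prediction for "while" falls below the threshold. Text labeled this way could then serve as the source side of a training corpus, so that labeled and unlabeled connectives are treated as distinct tokens.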
