Disambiguating temporal-contrastive connectives for machine translation

Temporal-contrastive discourse connectives (although, while, since, etc.) signal several types of relations between clauses, such as temporal, contrastive, concessive and causal relations. They are often ambiguous and therefore difficult to translate from one language to another. We discuss several new, translation-oriented experiments on disambiguating a specific subset of discourse connectives, with the aim of correcting some of the translation errors made by current statistical machine translation systems.
