Disambiguating temporal-contrastive connectives for machine translation

Temporal-contrastive discourse connectives (although, while, since, etc.) signal various types of relations between clauses, such as temporal, contrastive, concessive and causal relations. They are often ambiguous and therefore difficult to translate from one language to another. We discuss several new, translation-oriented experiments on disambiguating a specific subset of discourse connectives, with the aim of correcting some of the translation errors made by current statistical machine translation systems.
