Annotating the meaning of discourse connectives in multilingual corpora

Abstract: Discourse connectives are lexical items that signal coherence relations between discourse segments. Although many languages possess a wide range of connectives, languages differ considerably in the number of connectives available to express a given relation. For this reason, a connective in one language rarely has a single, unambiguous translation equivalent in another. This paper is a first attempt to design a reliable method for annotating the meaning of discourse connectives cross-linguistically using corpus data. We present the methodological choices made to reach this aim and report three annotation experiments using the framework of the Penn Discourse Treebank.
