How Complex is Discourse Structure?

This paper addresses the question of how much complexity representations of discourse structure require. We review recent claims that tree structures do not suffice as a model for discourse structure, with a focus on the work done on the Discourse Graphbank (DGB) of Wolf and Gibson (2005, 2006). We show that much of the additional complexity in the DGB is not inherent in the data, but results from specific design choices that underlie W&G’s annotation. We identify three kinds of configuration whose DGB analysis violates tree-structure constraints, but for which an analysis in terms of tree structures is possible: crossed dependencies that are ultimately based on lexical or referential overlap; multiple-parent structures that could be handled in terms of Marcu’s (1996) Nuclearity Principle; and potential list structures, in which whole lists of segments are related to a preceding segment in the same way. We also discuss the recent results that Lee et al. (2008) adduce as evidence for a complexity of discourse structure that cannot be handled in terms of tree structures.
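To make the two graph-level violations concrete, the following minimal sketch checks a set of discourse relations for multiple-parent nodes and crossed dependencies. It rests on simplifying assumptions that are not taken from the paper or from the DGB: segments are numbered in text order, each relation is reduced to a hypothetical `(parent, child)` pair of segment indices, and the function names and the toy data are invented for illustration. It does not reproduce the DGB annotation format.

```python
from collections import Counter
from itertools import combinations


def multiple_parent_nodes(arcs):
    """Return segments that are the target of more than one arc."""
    incoming = Counter(child for _parent, child in arcs)
    return sorted(node for node, count in incoming.items() if count > 1)


def crossing_pairs(arcs):
    """Return pairs of arcs whose endpoints interleave in text order."""
    crossings = []
    for a, b in combinations(arcs, 2):
        lo1, hi1 = sorted(a)
        lo2, hi2 = sorted(b)
        # Two arcs cross if exactly one endpoint of one arc lies
        # strictly inside the span covered by the other arc.
        if lo1 < lo2 < hi1 < hi2 or lo2 < lo1 < hi2 < hi1:
            crossings.append((a, b))
    return crossings


def violates_tree_constraints(arcs):
    """True if the arcs contain a multiple-parent node or a crossing."""
    return bool(multiple_parent_nodes(arcs)) or bool(crossing_pairs(arcs))


# Hypothetical toy data: four segments, three relations (assumed).
toy_arcs = [(0, 2), (1, 3), (0, 3)]

print(multiple_parent_nodes(toy_arcs))
print(crossing_pairs(toy_arcs))
print(violates_tree_constraints(toy_arcs))
```

In the toy data, segment 3 has two incoming arcs and the arcs `(0, 2)` and `(1, 3)` interleave, so both checks report a violation. Whether such a configuration reflects genuine complexity or an annotation design choice is the question the paper examines; the sketch only detects the configurations.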

[1] Manfred Stede, et al. The Potsdam Commentary Corpus, 2004, ACL 2004.

[2] L. Polanyi. A formal model of the structure of discourse, 1988.

[3] Livio Robaldo, et al. The Penn Discourse Treebank 2.0 Annotation Manual, 2007.

[4] Rashmi Prasad, et al. Departures from Tree Structures in Discourse: Shared Arguments in the Penn Discourse TreeBank, 2008.

[5] Nicholas Asher. Troubles on the right frontier, 2008.

[6] William C. Mann, et al. Rhetorical Structure Theory: Toward a functional theory of text organization, 1988.

[7] Markus Egg, et al. Underspecified discourse representation, 2005.

[8] Daniel Marcu, et al. Building a Discourse-Tagged Corpus in the Framework of Rhetorical Structure Theory, 2001, SIGDIAL Workshop.

[9] Chris Mellish, et al. Beyond Elaboration: The Interaction of Relations and Focus in Coherent Text, 2000.

[10] Livio Robaldo, et al. The Penn Discourse TreeBank 2.0, 2008, LREC.

[11] Treebank Penn, et al. Linguistic Data Consortium, 1999.

[12] Edward Gibson, et al. Representing Discourse Coherence: A Corpus-Based Study, 2005, CL.

[13] W. Mann, et al. Rhetorical Structure Theory: looking back and moving ahead, 2006.

[14] Gisela Redeker, et al. Says who? On the treatment of speech attributions in discourse structure, 2006.

[15] Candace L. Sidner, et al. Attention, intentions, and the structure of discourse, 1986.

[16] Maki Watanabe, et al. Discourse Tagging Reference Manual, 2001.

[17] Florian Wolf, et al. Coherence in Natural Language: Data Structures and Applications, 2006.

[18] Daniel Marcu, et al. Building Up Rhetorical Structure Trees, 1996, AAAI/IAAI, Vol. 2.

[19] Laurence Danlos, et al. Strong generative capacity of RST, SDRT and discourse dependency DAGs, 2007.

[20] Matthew Stone, et al. Anaphora and Discourse Structure, 2001, CL.