Representing discourse coherence: A corpus-based analysis

We present a set of discourse structure relations that are easy to code, and develop criteria for an appropriate data structure for representing these relations. Discourse structure here refers to informational relations that hold between sentences in a discourse (cf. Hobbs, 1985). We evaluated whether trees, which are widely assumed as the data structure for representing coherence, are descriptively adequate for that purpose. We found that more powerful data structures are needed: the coherence structures of naturally occurring texts contain many different kinds of crossed dependencies, as well as many nodes with multiple parents. These claims are supported by statistical results from a database of 135 texts from the Wall Street Journal and the AP Newswire that were hand-annotated with coherence relations, based on the annotation schema presented in this paper.
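To make the two structural properties concrete, the following is a minimal sketch, not the paper's annotation format or analysis code. It assumes discourse segments are numbered in text order and that each coherence relation is stored as a directed arc `(source, target, label)`; the toy data, the labels, and the function names are all hypothetical. A node has multiple parents when it receives more than one incoming arc, and two arcs form a crossed dependency when their spans interleave in text order.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy annotation (assumption, not data from the paper):
# segments are numbered in text order, and each relation is
# (source_segment, target_segment, label). Labels are placeholders.
relations = [
    (0, 2, "elaboration"),
    (1, 3, "cause-effect"),
    (1, 2, "similarity"),
]

def nodes_with_multiple_parents(relations):
    """Return segments that receive more than one incoming arc."""
    incoming = Counter(target for _, target, _ in relations)
    return sorted(node for node, count in incoming.items() if count > 1)

def crossed_dependencies(relations):
    """Return pairs of arcs whose spans interleave in text order."""
    # Normalise each arc to (left, right) so direction does not matter here.
    spans = [tuple(sorted((source, target))) for source, target, _ in relations]
    crossings = []
    for (a, b), (c, d) in combinations(spans, 2):
        # Two spans cross when exactly one endpoint of one lies strictly
        # inside the other.
        if a < c < b < d or c < a < d < b:
            crossings.append(((a, b), (c, d)))
    return crossings

def is_tree_compatible(relations):
    """True only if no node has multiple parents and no arcs cross."""
    return (not nodes_with_multiple_parents(relations)
            and not crossed_dependencies(relations))

print(nodes_with_multiple_parents(relations))
print(crossed_dependencies(relations))
print(is_tree_compatible(relations))
```

In this toy example, segment 2 has two incoming arcs and the arcs 0-2 and 1-3 cross, so a tree cannot represent the structure without dropping or duplicating information; a general graph can hold it directly.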

[1]  Reinhard Diestel, et al. Graph Theory, 1997.

[2]  W. Kintsch, et al. Strategies of discourse comprehension, 1983.

[3]  Matthew Stone, et al. Discourse Relations: A Structural and Presuppositional Account Using Lexicalised TAG, 1999, ACL.

[4]  Marti A. Hearst. Text Tiling: Segmenting Text into Multi-paragraph Subtopic Passages, 1997, CL.

[5]  Marilyn A. Walker, et al. Centering, Anaphora Resolution, and Discourse Structure, 1997, ArXiv.

[6]  Wolfgang Lezius, et al. A Description Language for Syntactically Annotated Corpora, 2000, COLING.

[7]  Noam Chomsky, et al. Conditions on transformations, 1971.

[8]  Alex Lascarides, et al. Temporal interpretation, discourse relations and commonsense entailment, 1993, The Language of Time - A Reader.

[9]  William C. Mann, et al. Rhetorical Structure Theory: Toward a functional theory of text organization, 1988.

[10]  Ingrid Zukerman, et al. Generating Discourse across Several User Models: Maximizing Belief while Avoiding Boredom and Overload, 1995, IJCAI.

[11]  Stuart M. Shieber, et al. Evidence against the context-freeness of natural language, 1985.

[12]  Alistair Knott, et al. A data-driven methodology for motivating a set of coherence relations, 1996.

[13]  Johanna D. Moore, et al. A Problem for RST: The Need for Multi-Level Discourse Analysis, 1992, CL.

[14]  Michael Halliday, et al. Cohesion in English, 1976.

[15]  L. Polanyi. The Linguistic Structure of Discourse, 2005.

[16]  Johanna D. Moore, et al. Toward a Synthesis of Two Accounts of Discourse Structure, 1996, CL.

[17]  Matthew Stone, et al. Anaphora and Discourse Structure, 2001, CL.

[18]  J. Hobbs. On the coherence and structure of discourse, 1985.

[19]  Eduard Hovy, et al. Parsimonious or Profligate: How Many and Which Discourse Structure Relations?, 1992.

[20]  Sabine Bergler, et al. The Semantics of Collocational Patterns for Reporting Verbs, 1991, EACL.

[21]  Andrew Kehler, et al. Coherence, reference, and the theory of grammar, 2002, CSLI lecture notes series.

[22]  Carolyn Penstein Rosé, et al. Discourse Processing of Dialogues with Multiple Threads, 1995, ACL.

[23]  Lawrence Birnbaum, et al. Argument Molecules: A Functional Representation of Argument Structure, 1982, AAAI.

[24]  Kathleen McKeown, et al. Text generation: using discourse strategies and focus constraints to generate natural language text, 1985.

[25]  Candace L. Sidner, et al. Attention, Intentions, and the Structure of Discourse, 1986, CL.

[26]  Hulstijn. The Grammar of Discourse, 2010.

[27]  L. Polanyi. A formal model of the structure of discourse, 1988.

[28]  Jean Carletta, et al. Assessing Agreement on Classification Tasks: The Kappa Statistic, 1996, CL.

[29]  Julia Hirschberg, et al. A Prosodic Analysis of Discourse Segments in Direction-Giving Monologues, 1996, ACL.

[30]  Sabine Bergler, et al. Evidential analysis of reported speech, 1992.

[31]  Ann Bies, et al. The Penn Treebank: Annotating Predicate Argument Structure, 1994, HLT.

[32]  Martin van den Berg, et al. A Rule Based Approach to Discourse Parsing, 2004, SIGDIAL Workshop.

[33]  Remko Scha, et al. A Syntactic Approach to Discourse Semantics, 1984, ACL.

[34]  S. Corston-Oliver, et al. Computing representations of the structure of written discourse, 1998.

[35]  Rachel Reichman, et al. Getting computers to talk like you and me, 1985.

[36]  N. Curteanu. Book Reviews: Lecture on Contemporary Syntactic Theories: An Introduction to Unification-Based Approaches to Grammar, 1987, CL.

[37]  M. Walker, et al. Centering Theory in Discourse, 1998.

[38]  Laurence Danlos, et al. Discourse dependency structures as DAGs, 2002.

[39]  Jerry R. Hobbs, et al. Interpretation as Abduction, 1993, Artif. Intell.

[40]  Wojciech Skut, et al. An Annotation Scheme for Free Word Order Languages, 1997, ANLP.