Paper Information - Unifying Annotated Discourse Hierarchies to Create a Gold Standard

Unifying Annotated Discourse Hierarchies to Create a Gold Standard

Human annotation of discourse corpora typically results in segmentation hierarchies that vary in their degree of agreement. This paper presents several techniques for unifying multiple discourse annotations into a single hierarchy, deemed a “gold standard”: the segmentation that best captures the underlying linguistic structure of the discourse. It proposes and analyzes methods that consider the level of embeddedness of a segmentation as well as methods that do not. A corpus containing annotated hierarchical discourses, the Boston Directions Corpus, was used to evaluate the “goodness” of each technique by comparing the similarity of the segmentation it derives to the original annotations in the corpus. Several metrics of similarity between hierarchical segmentations are computed: precision/recall of matching utterances, pairwise inter-reliability scores (κ), and non-crossing-brackets. A novel method for unification that minimizes conflicts among annotators outperforms methods that require consensus among a majority for the κ and precision metrics, while capturing much of the structure of the discourse. When high recall is preferred, methods requiring a majority are preferable to those that demand full consensus among annotators.
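The contrast between majority-based and full-consensus unification can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each annotation is a flat set of inclusive `(start, end)` utterance spans, ignores embeddedness, and uses hypothetical names (`unify`, `precision_recall`, `crosses`) and invented toy data.

```python
from collections import Counter


def unify(annotations, min_votes):
    """Keep every span proposed by at least `min_votes` annotators (assumed voting scheme)."""
    votes = Counter(span for ann in annotations for span in set(ann))
    return {span for span, n in votes.items() if n >= min_votes}


def precision_recall(unified, annotation):
    """Precision and recall of the unified spans against one annotator's spans."""
    annotation = set(annotation)
    matched = len(unified & annotation)
    precision = matched / len(unified) if unified else 0.0
    recall = matched / len(annotation) if annotation else 0.0
    return precision, recall


def crosses(a, b):
    """True if inclusive spans a and b overlap without one containing the other."""
    return a[0] < b[0] <= a[1] < b[1] or b[0] < a[0] <= b[1] < a[1]


# Hypothetical toy data: three annotators segmenting utterances 0..9.
annotations = [
    {(0, 9), (0, 4), (5, 9), (0, 2)},
    {(0, 9), (0, 4), (5, 9), (5, 7)},
    {(0, 9), (0, 3), (4, 9)},
]

majority = unify(annotations, min_votes=2)                  # at least 2 of 3 agree
consensus = unify(annotations, min_votes=len(annotations))  # all 3 agree

print(sorted(majority))                               # [(0, 4), (0, 9), (5, 9)]
print(sorted(consensus))                              # [(0, 9)]
print(precision_recall(majority, annotations[0]))     # (1.0, 0.75)
print(precision_recall(consensus, annotations[0]))    # (1.0, 0.25)
print(crosses((0, 4), (4, 9)))                        # True: both spans claim utterance 4
```

On this toy input both thresholds give perfect precision against the first annotator, but the majority threshold recovers three of that annotator's four spans while full consensus recovers only one, which mirrors the recall trade-off the abstract describes.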
