Focus Annotation of Task-based Data: A Comparison of Expert and Crowd-Sourced Annotation in a Reading Comprehension Corpus

While the formal pragmatic concepts in information structure, such as the focus of an utterance, are precisely defined in theoretical linguistics and potentially very useful in conceptual and practical terms, it has turned out to be difficult to reliably annotate such notions in corpus data. We present a large-scale focus annotation effort designed to overcome this problem. Our annotation study is based on the task-based corpus CREG, which consists of answers to explicitly given reading comprehension questions. We compare focus annotation by trained annotators with a crowd-sourcing setup using untrained native speakers. Given the task context and an annotation process that incrementally makes the question form and answer type explicit, the trained annotators reach substantial agreement for focus annotation. Interestingly, the crowd-sourcing setup also supports high-quality annotation, at least for specific subtypes of data. Finally, we turn to the question of whether the relevance of focus annotation can be evaluated extrinsically. We show that automatic short-answer assessment improves significantly for focus-annotated data. The focus-annotated CREG corpus is freely available and constitutes the largest such resource for German.
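
The abstract reports that the trained annotators reach "substantial agreement", a level conventionally judged with chance-corrected measures such as Cohen's kappa computed over aligned token-level focus/background labels. The abstract does not specify the measure or the label set, so the sketch below is only an illustration of that kind of computation under those assumptions; the label values and example sequences are invented.

```python
# Illustrative sketch: chance-corrected agreement (Cohen's kappa) between two
# annotators who each labelled every answer token as focus ("F") or background ("B").
# The label set and sequences are invented; the paper's actual evaluation setup
# is not specified in this abstract.
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    assert len(labels_a) == len(labels_b), "annotations must align token by token"
    n = len(labels_a)
    # Observed agreement: proportion of tokens with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each annotator's label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[l] * freq_b[l] for l in set(labels_a) | set(labels_b)) / (n * n)
    return (observed - expected) / (1 - expected)

annotator_1 = ["B", "B", "F", "F", "F", "B", "B", "F"]
annotator_2 = ["B", "F", "F", "F", "F", "B", "B", "B"]
print(f"kappa = {cohen_kappa(annotator_1, annotator_2):.2f}")  # 0.50 on this toy example
```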
