Paper Information - Squibs: From Annotator Agreement to Noise Models

This article discusses the transition from annotated data to a gold standard, that is, a subset that is sufficiently noise-free with high confidence. Unless appropriately reinterpreted, agreement coefficients do not indicate the quality of the data set as a benchmarking resource: High overall agreement is neither sufficient nor necessary to distill some amount of highly reliable data from the annotated material. A mathematical framework is developed that allows estimation of the noise level of the agreed subset of annotated data, which helps promote cautious benchmarking.

Beata Beigman Klebanov | Eyal Beigman
