Crowdsourcing as a preprocessing for complex semantic annotation tasks

This article outlines a methodology that uses crowdsourcing to reduce the workload of experts for complex semantic annotation tasks. We split turker-annotated datasets into a high-agreement block, which is kept unchanged, and a low-agreement block, which is re-annotated by experts. The resulting annotations show higher observed agreement. We also identify distinct annotation biases for turkers and for experts.
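The split can be illustrated with a minimal sketch: per-item observed agreement is taken as the fraction of turkers who chose the item's majority label, and items whose agreement falls below a threshold are routed to the expert re-annotation queue. The threshold value, the helper names, and the label strings below are illustrative assumptions, not values taken from the article.

```python
from collections import Counter

def observed_agreement(labels):
    """Fraction of annotators who chose the item's majority label."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)

def split_by_agreement(items, threshold=0.8):
    """Split turker-annotated items into a high-agreement block (kept as is)
    and a low-agreement block (to be re-annotated by experts).
    The 0.8 threshold is an illustrative assumption, not the article's value."""
    high, low = [], []
    for item_id, labels in items.items():
        if observed_agreement(labels) >= threshold:
            high.append(item_id)
        else:
            low.append(item_id)
    return high, low

# Hypothetical example: five turkers label each item; item "b" goes to experts.
items = {
    "a": ["literal", "literal", "literal", "literal", "metonymic"],
    "b": ["literal", "metonymic", "metonymic", "underspecified", "literal"],
}
high_block, expert_queue = split_by_agreement(items)
print(high_block, expert_queue)  # ['a'] ['b']
```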
