Crowdsourcing for the identification of event nominals: an experiment

This paper presents the design and results of a crowdsourcing experiment on the recognition of Italian event nominals. The aim of the experiment was to assess the feasibility of crowdsourcing methods for a complex semantic task: distinguishing the eventive interpretation of polysemous nominals by taking various types of syntagmatic cues into consideration. Details on the theoretical background and the experimental setup are provided, together with the final results in terms of accuracy and inter-annotator agreement. These results are compared with those obtained by expert annotators on the same task. The low accuracy and Fleiss’ kappa values of the crowdsourcing experiment demonstrate that crowdsourcing is not always optimal for complex linguistic tasks. On the other hand, the use of non-expert contributors makes it possible to identify the most ambiguous patterns of polysemy and the most useful syntagmatic cues for identifying the eventive reading of nominals.
