Capturing Coercions in Texts: a First Annotation Exercise

In this paper we report the first results of an annotation exercise on argument coercion phenomena in Italian texts. Our corpus consists of ca. 4,000 sentences from the PAROLE sottoinsieme corpus (Bindi et al., 2000), annotated with Selection and Coercion relations between verb-noun pairs and formatted in XML according to the Generative Lexicon Mark-up Language (GLML) (Pustejovsky et al., 2008). For the purposes of coercion annotation, we selected 26 Italian verbs that impose semantic typing on their arguments in Subject, Direct Object, or Complement position. In every sentence of the corpus, the noun arguments are annotated with their source type by two annotators plus a judge. An overall inter-annotator agreement of kappa = 0.87 indicates that the annotation methodology is reliable. A qualitative analysis of the results suggests two improvements to the task: 1) a different account of complex types for nouns has to be devised, and 2) a more comprehensive account of coercion mechanisms requires annotating the deeper meaning dimensions that coercion operations target, such as those captured by Qualia relations.
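As an illustration of the agreement figure, the following is a minimal sketch of how a kappa score between two annotators can be computed. It assumes Cohen's kappa for two annotators; the abstract does not state which kappa variant was used. The function name `cohen_kappa` and the six toy labels are hypothetical and are not taken from the corpus.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: two annotators, nominal categories. This is a sketch,
    not necessarily the exact statistic reported in the paper.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(labels_a)

    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each category, the product of the two
    # annotators' marginal proportions, summed over categories.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical category throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical toy data: one label per annotated verb-noun pair.
annotator_a = ["selection", "coercion", "selection",
               "selection", "coercion", "selection"]
annotator_b = ["selection", "coercion", "selection",
               "coercion", "coercion", "selection"]

kappa = cohen_kappa(annotator_a, annotator_b)
print(f"kappa = {kappa:.2f}")
```

Kappa corrects raw agreement for the agreement expected by chance, so a category that dominates the data (for instance, Selection being far more frequent than Coercion) does not inflate the score.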

[1] James Pustejovsky et al. A Pattern Dictionary for Natural Language Processing, 2005.

[2] Jean Carletta et al. Squibs: Reliability Measurement without Limits, 2008, CL.

[3] Ted Briscoe et al. Semi-productive Polysemy and Sense Extension, 1995, J. Semant.

[4] Nicoletta Calzolari et al. SIMPLE: A General Framework for the Development of Multilingual Lexicons, 2000, LREC.

[5] James Pustejovsky et al. Automated Induction of Sense in Context, 2004, COLING.

[6] Adam Kilgarriff et al. Large Linguistically-Processed Web Corpora for Multiple Languages, 2006, EACL.

[7] James Pustejovsky et al. GLML: Annotating Argument Selection and Coercion, 2009, IWCS.

[8] James Pustejovsky et al. Semantic Coercion in Language: Beyond Distributional Analysis, 2012.

[9] Jean Carletta et al. Assessing Agreement on Classification Tasks: The Kappa Statistic, 1996, CL.

[10] Adam Kilgarriff et al. The Sketch Engine, 2004.

[11] Malvina Nissim et al. Data and models for metonymy resolution, 2009, Lang. Resour. Evaluation.

[12] M. González Rodríguez et al. Proceedings of the third International Conference on Language Resources and Evaluation, 2002.

[13] James Pustejovsky et al. SemEval-2010 Task 7: Argument Selection and Coercion, 2009, SemEval@ACL.

[14] James Pustejovsky et al. Towards a Generative Lexical Resource: The Brandeis Semantic Ontology, 2006, LREC.

[15] Elisabetta Jezek et al. When GL meets the corpus: a data-driven investigation of semantic types and coercion phenomena, 2007.

[16] Fredric C. Gey et al. Proceedings of LREC, 2010.

[17] James Pustejovsky et al. Semantic coercion in language, 2008.

[18] Malvina Nissim et al. Towards a Corpus Annotated for Metonymies: the Case of Location Names, 2002, LREC.

[19] James Pustejovsky et al. The Generative Lexicon, 1995, CL.