T-PAS: A Resource of Typed Predicate Argument Structures for Linguistic Analysis and Semantic Processing

This paper introduces T-PAS, a resource of typed predicate argument structures for Italian, acquired from corpora by manual clustering of distributional information about Italian verbs and intended for linguistic analysis and semantic processing tasks. T-PAS is the first resource for Italian in which the semantic selection properties and sense-in-context distinctions of verbs are characterized entirely on empirical grounds. We first describe the process of pattern acquisition and corpus annotation (section 2) and its ongoing evaluation (section 3). We then demonstrate the benefits of pattern tagging for NLP purposes (section 4) and discuss current efforts to improve the annotation of the corpus (section 5). We conclude by reporting on ongoing experiments that use semiautomatic techniques to extend coverage (section 6).
