A Corpus and Model Integrating Multiword Expressions and Supersenses

This paper introduces a task of identifying and semantically classifying lexical expressions in running text. We investigate the online reviews genre, adding semantic supersense annotations to a 55,000-word English corpus that was previously annotated for multiword expressions. The noun and verb supersenses apply to full lexical expressions, whether single-word or multiword. We then present a sequence tagging model that jointly infers lexical expressions and their supersenses. Results show that, even with a relatively small training corpus in a noisy domain, the joint task can be performed with a class-labeling F1 of 70%.
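
To make the joint formulation concrete, the following is a minimal sketch of how a single tag sequence can carry both the segmentation into lexical expressions and their supersenses. It assumes a simplified, gapless B/I/O encoding in which the supersense label is attached to the first token of each expression; the tag strings, the example labels, and the function name `decode_joint_tags` are illustrative assumptions, not the exact scheme used in the paper.

```python
from typing import List, Optional, Tuple

Expression = Tuple[List[str], Optional[str]]


def decode_joint_tags(tokens: List[str], tags: List[str]) -> List[Expression]:
    """Group tokens into lexical expressions from joint position+class tags.

    Assumed tag format (hypothetical): 'O', 'B', or 'I', optionally followed
    by '-' and a supersense label on the first token of an expression.
    'O' marks a single-word expression, 'B' opens a multiword expression,
    and 'I' continues the expression opened most recently.
    """
    expressions: List[Expression] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None

    for token, tag in zip(tokens, tags):
        position, _, label = tag.partition("-")
        if position == "I" and current_tokens:
            # Continuation: extend the open expression, keep its label.
            current_tokens.append(token)
            continue
        # 'O' or 'B' (or a stray leading 'I'): close the open expression
        # and start a new one with this token.
        if current_tokens:
            expressions.append((current_tokens, current_label))
        current_tokens = [token]
        current_label = label or None

    if current_tokens:
        expressions.append((current_tokens, current_label))
    return expressions


# Illustrative sentence; the supersense labels are examples only.
tokens = ["The", "staff", "took", "care", "of", "us"]
tags = ["O", "O-n.group", "B-v.social", "I", "I", "O"]
expressions = decode_joint_tags(tokens, tags)
```

Here the multiword expression "took care of" receives one supersense as a unit, which is the sense in which supersenses apply to full lexical expressions rather than to individual tokens.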
