SenseFitting: Sense Level Semantic Specialization of Word Embeddings for Word Sense Disambiguation

We introduce a neural network-based Word Sense Disambiguation (WSD) system for German that builds on SenseFitting, a novel method for optimizing WSD. The system outperforms knowledge-based WSD methods by up to 25% F1-score and sets a new state of the art on the German sense-annotated dataset WebCAGe. Our method represents each target sense with three feature vectors, a) a sense vector, b) a gloss vector, and c) a relational vector, and compares them with the vector centroids of sample contexts. By using widely available word embeddings and lexical resources, we compensate for the lower resource availability of German. SenseFitting builds upon the recently introduced semantic specialization procedure Attract-Repel and leverages sense-level semantic constraints from lexical-semantic networks (e.g. GermaNet) or online social dictionaries (e.g. Wiktionary) to produce high-quality sense embeddings from pre-trained word embeddings. We evaluate our sense embeddings on SimSense, a new SimLex-999-based similarity dataset that we developed for this work. Our results outperform current lemma-based specialization methods for German and are comparable to results reported for English.
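The comparison between sense features and context centroids can be illustrated with a minimal sketch. It is not the system described above: the neural network is replaced by a plain average of cosine similarities, and all names (`centroid`, `cosine`, `score_sense`, `disambiguate`), the sense labels, and the two-dimensional toy vectors are hypothetical and chosen only for illustration.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(components) / n for components in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def score_sense(context_centroid, features):
    """Assumed scoring rule: mean cosine between the context centroid and
    the (sense, gloss, relational) vectors of one candidate sense."""
    return sum(cosine(context_centroid, f) for f in features) / len(features)


def disambiguate(context_vectors, senses):
    """Return the sense id whose feature vectors best match the context."""
    c = centroid(context_vectors)
    return max(senses, key=lambda sense_id: score_sense(c, senses[sense_id]))


# Toy data (hypothetical): embeddings of the words surrounding the target.
context = [[1.0, 0.0], [0.8, 0.2]]

# Each candidate sense maps to (sense vector, gloss vector, relational vector).
senses = {
    "Bank_financial": ([1.0, 0.0], [0.9, 0.1], [0.8, 0.3]),
    "Bank_bench": ([0.0, 1.0], [0.1, 0.9], [0.2, 0.8]),
}

best = disambiguate(context, senses)
print(best)
```

In a trained system the fixed average in `score_sense` would be replaced by a learned function of the three similarities; the sketch only shows how the centroid and the three per-sense vectors fit together.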
