KDSL: a Knowledge-Driven Supervised Learning Framework for Word Sense Disambiguation

We propose KDSL, a new word sense disambiguation (WSD) framework that uses knowledge to automatically generate sense-labeled data for supervised learning. First, we automatically construct from WordNet a semantic knowledge base called DisDict, which provides refined feature words that highlight the differences among word senses, i.e., synsets. Second, we use DisDict to automatically generate new sense-labeled data from unlabeled corpora. Third, these generated data, together with manually labeled data and unlabeled data, are fed to a neural framework that conducts supervised and unsupervised learning jointly to model the semantic relations among synsets, feature words, and their contexts. The experimental results show that KDSL outperforms several representative state-of-the-art methods on various major benchmarks. Interestingly, it performs relatively well even when manually labeled data are unavailable, and thus offers a potential solution for similar tasks that lack manual annotations.
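The first two steps described above can be illustrated with a minimal sketch. It is not the DisDict construction or the labeling procedure of the paper, whose details the abstract does not give. It assumes a toy two-sense inventory in place of WordNet, treats gloss words unique to one sense as that sense's feature words, and labels a sentence only when the context overlaps with exactly one sense's feature words. The names `TOY_GLOSSES`, `build_feature_words`, and `label_sentence` are hypothetical.

```python
from collections import Counter

# Hypothetical toy sense inventory standing in for WordNet glosses.
TOY_GLOSSES = {
    "bank.n.01": "sloping land beside a body of water such as a river",
    "bank.n.02": "a financial institution that accepts deposits and lends money",
}

# Small illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "the", "of", "that", "and", "such", "as", "beside"}


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (w.strip(".,;:!?").lower() for w in text.split())
    return [w for w in tokens if w]


def build_feature_words(glosses):
    """For each sense, keep gloss words that appear in no other sense's gloss."""
    tokens = {s: set(tokenize(g)) - STOPWORDS for s, g in glosses.items()}
    counts = Counter(w for words in tokens.values() for w in words)
    return {s: {w for w in words if counts[w] == 1} for s, words in tokens.items()}


def label_sentence(sentence, target, feature_words):
    """Return the single best-matching sense, or None if absent or ambiguous."""
    words = set(tokenize(sentence))
    if target not in words:
        return None
    context = words - {target}
    scores = {s: len(context & fw) for s, fw in feature_words.items()}
    best = max(scores.values())
    winners = [s for s, score in scores.items() if score == best]
    if best == 0 or len(winners) != 1:
        return None
    return winners[0]


features = build_feature_words(TOY_GLOSSES)
unlabeled = [
    "She sat on the bank of the river",
    "The bank lends money to farmers",
    "I walked past the bank",
]
generated = [(s, label_sentence(s, "bank", features)) for s in unlabeled]
```

In this sketch, sentences that receive `None` stay unlabeled rather than being forced into a sense, which mirrors the idea of keeping both generated labeled data and unlabeled data available for joint training.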
