Word sense disambiguation based on context selection using knowledge-based word similarity

Abstract: In this paper, we introduce a novel knowledge-based word-sense disambiguation (WSD) system. The main goal of our research is to find an effective way to filter out unnecessary information by using word similarity. To this end, we adopt two methods in our WSD system. First, we propose a novel encoding method for word vector representation that considers the graphical semantic relationships in lexical knowledge bases; the resulting word vectors are used to determine word similarity in our WSD system. Second, we present an effective method, based on word similarity, for extracting from a text the contextual words used to analyze an ambiguous word. The results demonstrate that the suggested methods significantly enhance the baseline WSD performance on all corpora. In particular, the performance on nouns is similar to that of the state-of-the-art knowledge-based WSD models, and the performance on verbs surpasses that of the existing knowledge-based WSD models.
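
The context-selection idea can be illustrated with a minimal sketch. It assumes that word vectors are already available and that cosine similarity with a fixed threshold is used to decide which context words to keep; the abstract does not specify the similarity function or the selection rule, so both are assumptions here. The function names, the threshold value, and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_context(target, candidates, vectors, threshold=0.5):
    """Keep the candidate words whose similarity to the target meets the threshold.

    The thresholding rule is an assumption made for illustration only.
    """
    target_vec = vectors[target]
    selected = []
    for word in candidates:
        # Skip the ambiguous word itself and words without a vector.
        if word == target or word not in vectors:
            continue
        if cosine_similarity(target_vec, vectors[word]) >= threshold:
            selected.append(word)
    return selected


# Hypothetical toy vectors; real vectors would be derived from a lexical knowledge base.
TOY_VECTORS = {
    "bank": [0.9, 0.1, 0.0],
    "money": [0.8, 0.2, 0.1],
    "deposit": [0.7, 0.3, 0.0],
    "yesterday": [0.0, 0.1, 0.9],
}

sentence = ["yesterday", "deposit", "money", "bank"]
selected = select_context("bank", sentence, TOY_VECTORS)
print(selected)
```

With these toy vectors, "deposit" and "money" are retained as context for "bank", while "yesterday" is filtered out because its vector is nearly orthogonal to the target's.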
