One Single Deep Bidirectional LSTM Network for Word Sense Disambiguation of Text Data

Due to recent technical and scientific advances, we have a wealth of information hidden in unstructured text data such as offline/online narratives, research articles, and clinical reports. To mine these data properly, attributable to their innate ambiguity, a Word Sense Disambiguation (WSD) algorithm can avoid numbers of difficulties in Natural Language Processing (NLP) pipeline. However, considering a large number of ambiguous words in one language or technical domain, we may encounter limiting constraints for proper deployment of existing WSD models. This paper attempts to address the problem of one-classifier-per-one-word WSD algorithms by proposing a single Bidirectional Long Short-Term Memory (BLSTM) network which by considering senses and context sequences works on all ambiguous words collectively. Evaluated on SensEval-3 benchmark, we show the result of our model is comparable with top-performing WSD algorithms. We also discuss how applying additional modifications alleviates the model fault and the need for more training data.

[1]  George A. Miller,et al.  WordNet: A Lexical Database for English , 1995, HLT.

[2]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[3]  Cristian Grozea,et al.  Finding optimal parameter settings for high performance word sense disambiguation , 2004 .

[4]  Carlo Strapparava,et al.  Pattern abstraction and term similarity for Word Sense Disambiguation: IRST at Senseval-3 , 2004 .

[5]  Hwee Tou Ng,et al.  Supervised Word Sense Disambiguation with Support Vector Machines and multiple knowledge sources , 2004, SENSEVAL@ACL.

[6]  Adam Kilgarriff,et al.  The Senseval-3 English lexical sample task , 2004, SENSEVAL@ACL.

[7]  Daphne Koller,et al.  Word-Sense Disambiguation for Machine Translation , 2005, HLT.

[8]  Jürgen Schmidhuber,et al.  Framewise phoneme classification with bidirectional LSTM and other neural network architectures , 2005, Neural Networks.

[9]  George Yee,et al.  Applying word sense disambiguation to question answering system for e-learning , 2005, 19th International Conference on Advanced Information Networking and Applications (AINA'05) Volume 1 (AINA papers).

[10]  Roberto Navigli,et al.  SemEval-2007 Task 07: Coarse-Grained English All-Words Task , 2007, Fourth International Workshop on Semantic Evaluations (SemEval-2007).

[11]  Jason Weston,et al.  A unified architecture for natural language processing: deep neural networks with multitask learning , 2008, ICML '08.

[12]  Roberto Navigli,et al.  Word sense disambiguation: A survey , 2009, CSUR.

[13]  Ted Pedersen,et al.  WordNet::SenseRelate::AllWords - A Broad Coverage Word Sense Tagger that Maximizes Semantic Relatedness , 2009, NAACL.

[14]  Hwee Tou Ng,et al.  It Makes Sense: A Wide-Coverage Word Sense Disambiguation System for Free Text , 2010, ACL.

[15]  Jeffrey Dean,et al.  Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[16]  Ahmad Pesaranghader,et al.  Adapting Gloss Vector Semantic Relatedness Measure for Semantic Similarity Estimation: An Evaluation in the Biomedical Domain , 2013, JIST.

[17]  Ted Pedersen,et al.  Evaluating measures of semantic similarity and relatedness to disambiguate terms in biomedical text , 2013, J. Biomed. Informatics.

[18]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[19]  Norwati Mustapha,et al.  Word Sense Disambiguation for Biomedical Text Mining Using Definition-Based Semantic Relatedness and Similarity Measures , 2014 .

[20]  Zhiyuan Liu,et al.  A Unified Model for Word Sense Representation and Disambiguation , 2014, EMNLP.

[21]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[22]  Hwee Tou Ng,et al.  Semi-Supervised Word Sense Disambiguation Using Word Embeddings in General and Specific Domains , 2015, NAACL.

[23]  Juntae Yoon,et al.  Link-topic model for biomedical abbreviation disambiguation , 2015, J. Biomed. Informatics.

[24]  Hal Daumé,et al.  Deep Unordered Composition Rivals Syntactic Methods for Text Classification , 2015, ACL.

[25]  Mikael Kågebäck,et al.  Word Sense Disambiguation using a Bidirectional LSTM , 2016, CogALex@COLING.

[26]  Stan Matwin,et al.  simDEF: definition-based semantic similarity measure of gene ontology terms for functional similarity analysis of genes , 2015, Bioinform..