Hybrid Sense Classification Method for Large-Scale Word Sense Disambiguation

Word sense disambiguation (WSD) is the task of determining the appropriate sense of a word in a particular context. Although recent studies building on neural language models have made progress, existing research can still determine the senses of only a limited number of words in a few domains. It is therefore necessary to develop a highly scalable process that can handle a large number of senses occurring across various domains. This paper introduces a new large-scale WSD dataset that is automatically constructed from the Oxford Dictionary, which is widely used as a standard source for word meanings. We propose a new WSD model that determines the sense of a word individually, according to its part of speech in the context. In addition, we introduce a hybrid sense prediction method that classifies the less frequently used senses separately in order to achieve reasonable performance on them. Comparative experiments demonstrate that the proposed method is more reliable than the baseline approaches. We also investigate how the method adapts to a realistic environment by applying it to news articles.
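The two ideas summarized above, choosing a sense per word and part of speech, and handling the less frequent senses separately, can be illustrated with a small sketch. It is not the model proposed in this paper: the class name `HybridSenseClassifier`, the `min_count` threshold, the bag-of-words features, the cosine scoring, and the sense labels are all assumptions made for illustration, standing in for the neural components the paper describes. Frequent senses are scored against an aggregated profile, while rare senses are scored by nearest-neighbor matching against their individual examples.

```python
import math
from collections import Counter, defaultdict


def bag(text):
    """Lower-cased bag-of-words representation of a sentence."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class HybridSenseClassifier:
    """Toy hybrid classifier (hypothetical, for illustration only)."""

    def __init__(self, min_count=2):
        # Senses with fewer than min_count examples are treated as rare.
        self.min_count = min_count
        self.frequent = defaultdict(dict)  # (word, pos) -> {sense: profile}
        self.rare = defaultdict(list)      # (word, pos) -> [(sense, example)]

    def fit(self, examples):
        # examples: iterable of (word, pos, sense_id, context_sentence)
        grouped = defaultdict(lambda: defaultdict(list))
        for word, pos, sense, context in examples:
            grouped[(word, pos)][sense].append(bag(context))
        for key, senses in grouped.items():
            for sense, bags in senses.items():
                if len(bags) >= self.min_count:
                    # Frequent sense: merge all examples into one profile.
                    profile = Counter()
                    for b in bags:
                        profile.update(b)
                    self.frequent[key][sense] = profile
                else:
                    # Rare sense: keep each example for nearest-neighbor lookup.
                    self.rare[key].extend((sense, b) for b in bags)
        return self

    def predict(self, word, pos, context):
        # Candidates are restricted to senses seen with this word and POS.
        key = (word, pos)
        query = bag(context)
        scored = [
            (cosine(query, profile), sense)
            for sense, profile in self.frequent.get(key, {}).items()
        ]
        scored += [
            (cosine(query, example), sense)
            for sense, example in self.rare.get(key, [])
        ]
        if not scored:
            return None  # unseen (word, pos) pair
        return max(scored)[1]


# Made-up training examples; the sense identifiers are hypothetical.
train = [
    ("bank", "NOUN", "bank.finance", "she deposited money at the bank"),
    ("bank", "NOUN", "bank.finance", "the bank approved the loan money"),
    ("bank", "NOUN", "bank.river", "they fished from the river bank"),
    ("bank", "VERB", "bank.tilt", "the plane began to bank left"),
]
model = HybridSenseClassifier(min_count=2).fit(train)
print(model.predict("bank", "NOUN", "he sat on the river bank fishing"))
print(model.predict("bank", "VERB", "the pilot had to bank left"))
```

In this sketch the part of speech acts only as a filter on the candidate senses, and the rare senses are kept out of the aggregated profiles so that a sense with a single example is not swamped by the frequent ones.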
