Knowledge-Based Biomedical Word Sense Disambiguation with Neural Concept Embeddings

Biomedical word sense disambiguation (WSD) is an important intermediate task in many natural language processing applications such as named entity recognition, syntactic parsing, and relation extraction. In this paper, we employ knowledge-based approaches that also exploit recent advances in neural word and concept embeddings to improve over the state of the art in biomedical WSD, using the MSH WSD dataset as the test set. Our methods involve weak supervision: we do not use any hand-labeled examples for WSD to build our prediction models; however, we employ an existing, well-known named entity recognition and concept mapping program, MetaMap, to obtain our concept vectors. Over the MSH WSD dataset, our linear-time method (linear in the number of senses and the number of words in the test instance) achieves an accuracy of 92.24%, an absolute improvement of 3% over the best known results obtained via unsupervised or knowledge-based means. A more expensive approach that we developed relies on a nearest neighbor framework and achieves an accuracy of 94.34%, essentially cutting the error rate in half. Employing dense vector representations learned from unlabeled free text has recently been shown to benefit many language processing tasks, and our efforts show that biomedical WSD is no exception to this trend. For a complex and rapidly evolving domain such as biomedicine, building labeled datasets for larger sets of ambiguous terms may be impractical. Here, we show that weak supervision that leverages recent advances in representation learning can rival supervised approaches in biomedical WSD. However, external knowledge bases (here, sense inventories) play a key role in the improvements achieved.
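
To make the linear-time idea concrete, the following is a minimal sketch, not the method as implemented in the paper. The abstract does not state the scoring function, so the sketch assumes that the context of the ambiguous term is summarized as the average of its word vectors and that each candidate sense is scored by cosine similarity against its concept vector. The function names, the sense labels, and the toy 2-dimensional embeddings are all hypothetical and chosen only for illustration.

```python
import math


def mean_vector(vectors):
    """Average a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def disambiguate(context_words, candidate_senses, word_vecs, concept_vecs):
    """Pick the candidate sense whose concept vector is closest to the context.

    Assumed scoring (not taken from the paper): average the vectors of the
    known context words, then take the sense with the highest cosine score.
    Building the context is one pass over the words and scoring is one pass
    over the senses, so the cost grows linearly in both.
    """
    known = [word_vecs[w] for w in context_words if w in word_vecs]
    if not known:
        return None  # no usable context words
    context = mean_vector(known)
    return max(candidate_senses, key=lambda s: cosine(context, concept_vecs[s]))


# Toy embeddings invented for this example; real vectors would be learned
# from unlabeled text that has been mapped to concepts.
word_vecs = {
    "temperature": [0.9, 0.1],
    "winter": [0.8, 0.2],
    "virus": [0.1, 0.9],
}
concept_vecs = {
    "SENSE_COLD_TEMPERATURE": [1.0, 0.0],
    "SENSE_COMMON_COLD": [0.0, 1.0],
}

prediction = disambiguate(
    ["winter", "temperature", "the"],  # "the" has no vector and is skipped
    list(concept_vecs),
    word_vecs,
    concept_vecs,
)
print(prediction)
```

A nearest neighbor variant, as mentioned above, would instead compare the test context against many stored context vectors, which is why it costs more than this single pass over the candidate senses.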
