Recurrent neural network models for disease name recognition using domain invariant features

Hand-crafted features based on linguistic and domain-knowledge play crucial role in determining the performance of disease name recognition systems. Such methods are further limited by the scope of these features or in other words, their ability to cover the contexts or word dependencies within a sentence. In this work, we focus on reducing such dependencies and propose a domain-invariant framework for the disease name recognition task. In particular, we propose various end-to-end recurrent neural network (RNN) models for the tasks of disease name recognition and their classification into four pre-defined categories. We also utilize convolution neural network (CNN) in cascade of RNN to get character-based embedded features and employ it with word-embedded features in our model. We compare our models with the state-of-the-art results for the two tasks on NCBI disease dataset. Our results for the disease mention recognition task indicate that state-of-the-art performance can be obtained without relying on feature engineering. Further the proposed models obtained improved performance on the classification task of disease names.

[1]  Abeed Sarker,et al.  Pharmacovigilance from social media: mining adverse drug reaction mentions using sequence labeling with word embedding cluster features , 2015, J. Am. Medical Informatics Assoc..

[2]  Wei Xu,et al.  Bidirectional LSTM-CRF Models for Sequence Tagging , 2015, ArXiv.

[3]  Yoshua Bengio,et al.  A Neural Probabilistic Language Model , 2003, J. Mach. Learn. Res..

[4]  Jason Weston,et al.  Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res..

[5]  Razvan Pascanu,et al.  Advances in optimizing recurrent networks , 2012, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[6]  Yoshua Bengio,et al.  Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.

[7]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[8]  Alex Graves,et al.  Generating Sequences With Recurrent Neural Networks , 2013, ArXiv.

[9]  Zhiyong Lu,et al.  NCBI disease corpus: A resource for disease name recognition and concept normalization , 2014, J. Biomed. Informatics.

[10]  Alberto Lavelli,et al.  Disease Mention Recognition with Specific Features , 2010, BioNLP@ACL.

[11]  Razvan Pascanu,et al.  Understanding the exploding gradient problem , 2012, ArXiv.

[12]  Graciela Gonzalez,et al.  BANNER: An Executable Survey of Advances in Biomedical Named Entity Recognition , 2007, Pacific Symposium on Biocomputing.

[13]  Zhiyong Lu,et al.  An improved corpus of disease mentions in PubMed citations , 2012, BioNLP@HLT-NAACL.

[14]  Alexandre Allauzen,et al.  Non-lexical neural architecture for fine-grained POS Tagging , 2015, EMNLP.

[15]  Cícero Nogueira dos Santos,et al.  Learning Character-level Representations for Part-of-Speech Tagging , 2014, ICML.

[16]  Yoram Singer,et al.  Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res..

[17]  Burr Settles,et al.  ABNER: an open source tool for automatically tagging genes, proteins and other entity names in text , 2005 .

[18]  Jason Weston,et al.  A unified architecture for natural language processing: deep neural networks with multitask learning , 2008, ICML '08.

[19]  Yoshua Bengio,et al.  On the Properties of Neural Machine Translation: Encoder–Decoder Approaches , 2014, SSST@EMNLP.

[20]  Razvan Pascanu,et al.  Theano: new features and speed improvements , 2012, ArXiv.

[21]  Cícero Nogueira dos Santos,et al.  Boosting Named Entity Recognition with Neural Character Embeddings , 2015, NEWS@ACL.

[22]  Barbara Rosario,et al.  Classifying Semantic Relations in Bioscience Texts , 2004, ACL.

[23]  Jeffrey Dean,et al.  Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[24]  Sunil Kumar Sahu,et al.  Evaluating distributed word representations for capturing semantics of biomedical concepts , 2015, BioNLP@IJCNLP.

[25]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.