Paper information - Long short-term memory RNN for biomedical named entity recognition

Background: Biomedical named entity recognition (BNER) is a crucial initial step of information extraction in the biomedical domain. The task is typically modeled as a sequence labeling problem. Various machine learning algorithms, such as Conditional Random Fields (CRFs), have been successfully used for this task. However, these state-of-the-art BNER systems largely depend on hand-crafted features.

Results: We present a recurrent neural network (RNN) framework based on word embeddings and character representation. On top of the neural network architecture, we use a CRF layer to jointly decode labels for the whole sentence. In our approach, contextual information from both directions and long-range dependencies in the sequence, both of which are useful for this task, are modeled by the bidirectional variant of the network and by long short-term memory (LSTM) units, respectively. Although our models use word embeddings and character embeddings as the only features, the bidirectional LSTM-RNN (BLSTM-RNN) model achieves state-of-the-art performance: 86.55% F1 on the BioCreative II gene mention (GM) corpus and 73.79% F1 on the JNLPBA 2004 corpus.

Conclusions: Our neural network architecture can be successfully used for BNER without any manual feature engineering. Experimental results show that domain-specific pre-trained word embeddings and character-level representation can improve the performance of the LSTM-RNN models. On the GM corpus, we achieve performance comparable to that of other systems that use complex hand-crafted features. On the JNLPBA corpus, our model achieves the best results, outperforming the previously top-performing systems. The source code of our method is freely available under GPL at https://github.com/lvchen1989/BNER.
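To illustrate what jointly decoding labels for the whole sentence can mean in practice, the following is a minimal sketch of Viterbi decoding over per-token label scores plus label-to-label transition scores. It is not taken from the authors' released code: the function name `viterbi_decode`, the three-label B/I/O tag set, and all scores are hypothetical, and the per-token scores simply stand in for whatever a BLSTM would produce.

```python
from typing import Dict, List


def viterbi_decode(emissions: List[Dict[str, float]],
                   transitions: Dict[str, Dict[str, float]],
                   labels: List[str]) -> List[str]:
    """Return the highest-scoring label sequence for one sentence.

    emissions[t][y]      -- score of label y at token t (hypothetical BLSTM output)
    transitions[prev][y] -- score of moving from label prev to label y
    """
    if not emissions:
        return []
    # best[t][y] holds the best score of any path that ends in label y at token t.
    best = [{y: emissions[0][y] for y in labels}]
    back = []  # back[t-1][y] is the best previous label for y at token t
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transitions[p][y])
            scores[y] = best[-1][prev] + transitions[prev][y] + emissions[t][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Start from the best final label and follow the back-pointers.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy two-token example (all numbers are made up for illustration).
LABELS = ["B", "I", "O"]
TRANSITIONS = {p: {y: 0.0 for y in LABELS} for p in LABELS}
TRANSITIONS["O"]["I"] = -10.0  # an "I" tag should not directly follow "O"
EMISSIONS = [
    {"B": 0.9, "I": 0.0, "O": 1.0},  # token 1: "O" is the best label in isolation
    {"B": 0.0, "I": 2.0, "O": 0.5},  # token 2: "I" is the best label in isolation
]

print(viterbi_decode(EMISSIONS, TRANSITIONS, LABELS))  # ['B', 'I']
```

Choosing the best label for each token independently would give the inconsistent sequence O, I here. Scoring the whole sequence lets the transition penalty steer the first token to B, which is the kind of benefit a sentence-level CRF layer is meant to provide.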
