Multi-task Learning Applied to Biomedical Named Entity Recognition Task

Recent deep learning techniques have brought significant improvements to biomedical named entity recognition. However, these techniques still face challenges, one of which is the limited availability of annotated text data. A multi-task approach addresses this by training several related tasks simultaneously: the task models share some layers with each other, which lets them learn features common to the tasks. Stacked long short-term memory networks (LSTMs) are desirable in such models because they can handle large amounts of training data and learn the underlying hidden structure in the data. However, stacking LSTMs also leads to the vanishing gradient problem. To alleviate this limitation, we propose a multi-task model based on convolutional neural networks, stacked LSTMs, and conditional random fields that uses embedding information at different layers. The proposed model achieves results comparable to state-of-the-art approaches. Moreover, we performed an empirical analysis of several variations of the proposed model to assess the impact of each on its performance.
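
The two structural ideas in the abstract, layers shared across tasks and embedding information fed to more than one layer, can be illustrated with a minimal sketch. It is not the proposed model: plain tanh layers stand in for the stacked LSTMs, there is no convolutional component or conditional random field, the weights are random and untrained, and every name (`MultiTaskTagger`, the `gene` and `chemical` tasks, their label sets) is a hypothetical chosen for illustration.

```python
import math
import random


def dense(inputs, weights):
    """Apply one tanh layer: one output value per weight row."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs))) for row in weights]


def init_weights(rng, n_out, n_in):
    """Return an n_out x n_in matrix of small random weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


class MultiTaskTagger:
    """Shared stacked encoder with one output head per task (illustrative only)."""

    def __init__(self, emb_dim, hidden_dim, n_layers, task_labels, seed=0):
        rng = random.Random(seed)
        self.task_labels = task_labels
        # Shared layers: used by every task.
        self.shared = []
        in_dim = emb_dim
        for _ in range(n_layers):
            self.shared.append(init_weights(rng, hidden_dim, in_dim))
            # Each later layer receives the previous hidden state
            # concatenated with the original token embedding.
            in_dim = hidden_dim + emb_dim
        # Task-specific heads: one score row per label of that task.
        self.heads = {
            task: init_weights(rng, len(labels), hidden_dim)
            for task, labels in task_labels.items()
        }

    def encode(self, embedding):
        """Run one token embedding through the shared stack."""
        hidden = dense(embedding, self.shared[0])
        for weights in self.shared[1:]:
            hidden = dense(hidden + embedding, weights)  # list concatenation
        return hidden

    def tag(self, task, embeddings):
        """Label each token independently with the head of the given task."""
        labels = self.task_labels[task]
        predictions = []
        for embedding in embeddings:
            hidden = self.encode(embedding)
            scores = [sum(w * h for w, h in zip(row, hidden)) for row in self.heads[task]]
            predictions.append(labels[scores.index(max(scores))])
        return predictions


# Two hypothetical tasks with their own label sets.
tasks = {
    "gene": ["O", "B-Gene", "I-Gene"],
    "chemical": ["O", "B-Chem", "I-Chem"],
}
model = MultiTaskTagger(emb_dim=3, hidden_dim=4, n_layers=3, task_labels=tasks)

# A made-up two-token sentence, already mapped to embeddings.
sentence = [[0.1, 0.2, 0.3], [0.0, -0.4, 0.5]]
gene_tags = model.tag("gene", sentence)
chemical_tags = model.tag("chemical", sentence)
```

Both calls to `tag` pass through the same `shared` weights and differ only in the head, which is what allows annotated data from one task to shape the features used by the other. Concatenating the embedding into each later layer gives that layer a direct path to the input, a common way to ease gradient flow through a deep stack.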
