Leveraging Multi-task Learning for Biomedical Named Entity Recognition

Biomedical named entity recognition (BioNER) is the task of identifying and categorizing mentions of biomedical entities in text. Because biomedical entity names have specific characteristics, such as ambiguity among different concepts and multiple ways of referring to the same entity, BioNER is usually considered more challenging than standard named entity recognition. Recent deep learning techniques have not only significantly reduced the need for handcrafted feature engineering but have also yielded substantial improvements on the BioNER task. However, such systems still face challenges, one of which is the limited availability of annotated text data. Multi-task learning approaches tackle this problem by training several related tasks simultaneously: because the tasks share some layers, the model can learn features common to all of them. To explore the advantages of multi-task learning, we propose a model based on convolutional neural networks, long short-term memory networks, and conditional random fields. The proposed model achieves results comparable to state-of-the-art approaches. Moreover, we present an empirical analysis of the impact of different word input representations (word embedding, character embedding, and case embedding) on model performance.
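To make the idea of a case embedding input more concrete, the following is a minimal sketch of how a casing feature might be derived for each token before it is mapped to an embedding vector. The category names, the `case_feature` and `encode_case` helpers, and the label ordering are illustrative assumptions; the abstract does not specify the scheme actually used.

```python
# Hypothetical casing categories; the paper's actual label set is not given.
CASE_LABELS = [
    "numeric",         # token consists only of digits, e.g. "42"
    "contains_digit",  # token mixes digits with other characters, e.g. "p53"
    "all_lower",       # e.g. "kinase"
    "all_upper",       # e.g. "DNA"
    "initial_upper",   # e.g. "Insulin"
    "other",           # anything else, e.g. "mRNA" or punctuation
]
CASE_TO_ID = {label: i for i, label in enumerate(CASE_LABELS)}


def case_feature(token: str) -> str:
    """Map a token to a coarse casing category (assumed scheme)."""
    if not token:
        return "other"
    if token.isdigit():
        return "numeric"
    if any(ch.isdigit() for ch in token):
        return "contains_digit"
    if token.islower():
        return "all_lower"
    if token.isupper():
        return "all_upper"
    if token[0].isupper():
        return "initial_upper"
    return "other"


def encode_case(tokens):
    """Convert tokens to integer ids that could index a case embedding table."""
    return [CASE_TO_ID[case_feature(t)] for t in tokens]


if __name__ == "__main__":
    sentence = ["Mutations", "in", "p53", "affect", "DNA", "repair", "."]
    print([case_feature(t) for t in sentence])
    print(encode_case(sentence))
```

In a model of the kind described, such ids would typically be looked up in a small trainable embedding table and concatenated with the word and character representations of the same token.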
