An End-to-End Progressive Multi-Task Learning Framework for Medical Named Entity Recognition and Normalization

Medical named entity recognition (NER) and normalization (NEN) are fundamental to constructing knowledge graphs and building question answering (QA) systems. Existing implementations of medical NER and NEN suffer from error propagation between the two tasks: mentions mispredicted by NER directly degrade the results of NEN, so the NER module becomes the bottleneck of the whole system. In addition, learnable features shared by both tasks help improve model performance. To avoid the disadvantages of existing models and exploit a generalized representation across the two tasks, we design an end-to-end progressive multi-task learning model that jointly models medical NER and NEN. The framework contains three levels of tasks of progressively increasing difficulty. This incremental task setting reduces error propagation, because the lower-level tasks receive supervised signals, rather than errors, from the higher-level tasks and thereby improve their performance. Furthermore, context features are exploited to enrich the semantic information of the entity mentions extracted by NER, and NEN performance benefits from these enhanced mention features. Standard entities from knowledge bases are also introduced into the NER module to help it extract the corresponding entity mentions correctly. Empirical results on two publicly available medical literature datasets demonstrate the superiority of our method over nine typical methods.
