Simple Hierarchical Multi-Task Neural End-To-End Entity Linking for Biomedical Text

Recognising and linking entities is a crucial first step in many biomedical text analysis tasks, such as relation extraction and target identification. Traditionally, biomedical entity linking methods rely heavily on heuristic rules and predefined, often domain-specific features. These features try to capture the properties of entities, and the methods use complex multi-step architectures to detect and subsequently link entity mentions. We propose a significant simplification to the biomedical entity linking setup that does not rely on any heuristic methods. The system performs all the steps of the entity linking task jointly, in either a single stage or two stages. We explore the use of hierarchical multi-task learning, with mention recognition and entity typing as auxiliary tasks. We show that hierarchical multi-task models consistently outperform single-task models when the tasks trained together are homogeneous. We evaluate our models on biomedical entity linking benchmarks using the MedMentions and BC5CDR datasets. We achieve state-of-the-art results on the challenging MedMentions dataset, and comparable results on BC5CDR.
