Paper information - Automatic recognition of abdominal lymph nodes from clinical text

Lymph node status plays a pivotal role in the treatment of cancer. Extracting lymph node mentions from radiology text reports enables large-scale training of lymph node detection on MRI. In this work, we first propose a hierarchical ontology of 41 types of abdominal lymph nodes. We then introduce an end-to-end approach that combines rules with transformer-based methods to detect abdominal lymph node mentions in MRI radiology reports and classify their types. We show that MriBERT, a model obtained by fine-tuning BlueBERT on MRI reports, performs best: it outperforms the rule-based labeler (0.957 vs. 0.644 in micro weighted F1-score) as well as other BERT-based variations (0.913 to 0.928). We make the code and MriBERT publicly available at https://github.com/ncbi-nlp/bluebert, with the hope that this method can facilitate the development of medical report annotators that produce labels from scratch at scale.
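To make the reported metric concrete, the following is a minimal sketch of micro-averaged F1 over multi-label predictions, where true positives, false positives, and false negatives are pooled across all items before precision and recall are computed. It assumes the standard micro-averaging definition; the abstract does not spell out the exact weighting used, so this may differ from the authors' evaluation script. The function name `micro_f1` and the lymph node labels in the example are hypothetical and are not taken from the paper's ontology.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 for multi-label predictions.

    gold, pred: equal-length lists of label sets, one set per item
    (for example, one set per report or per mention).
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # labels predicted and present in gold
        fp += len(p - g)   # labels predicted but absent from gold
        fn += len(g - p)   # gold labels that were missed

    if tp == 0:
        return 0.0         # avoids division by zero when nothing matches

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels, for illustration only.
gold = [{"para-aortic"}, {"mesenteric", "portacaval"}, set()]
pred = [{"para-aortic"}, {"mesenteric"}, {"inguinal"}]
score = micro_f1(gold, pred)
```

Pooling counts before computing the score means that frequent lymph node types contribute more to the result than rare ones, which is the usual reason to report a micro-averaged figure alongside per-class scores.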
