Not a cute stroke: Analysis of Rule- and Neural Network-based Information Extraction Systems for Brain Radiology Reports

We present an in-depth comparison of three clinical information extraction (IE) systems designed to perform entity recognition and negation detection on brain imaging reports: EdIE-R, a bespoke rule-based system, and two neural network models, EdIE-BiLSTM and EdIE-BERT, which are multi-task learning models with a BiLSTM and a BERT encoder, respectively. We compare the systems on both an in-sample and an out-of-sample dataset containing mentions of stroke findings, and we draw on our error analysis to suggest improvements for effective annotation when building clinical NLP models for a new domain. Our analysis finds that the rule-based system outperforms the neural models on both datasets and seems to generalise to the out-of-sample dataset. The neural models, by contrast, do not generalise negation to the out-of-sample dataset, despite metrics on the in-sample dataset suggesting otherwise.
