The Utility of General Domain Transfer Learning for Medical Language Tasks

This study analyzes the efficacy of transfer learning and transformer-based models for medical natural language processing (NLP) tasks, specifically radiological text classification. We used 1,977 labeled head CT reports, drawn from a corpus of 96,303 reports, to evaluate a Bidirectional Encoder Representations from Transformers (BERT) model pretrained on either a general domain corpus or a combined general and biomedical domain corpus. Model performance was benchmarked against a logistic regression using bag-of-words vectorization and a long short-term memory (LSTM) multi-label multi-class classification model, and was compared to the published literature in medical text classification. The BERT models initialized from either set of pretrained checkpoints outperformed the logistic regression model, achieving sample-weighted average F1-scores of 0.87 for the general domain model and 0.87 for the combined general and biomedical domain model. General text transfer learning may therefore be a viable technique for generating state-of-the-art results in medical NLP tasks on radiological corpora, outperforming other deep models such as LSTMs. The efficacy of pretraining and transformer-based models could facilitate the creation of strong NLP models in the uniquely challenging data environment of medical text.
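As an illustration of the evaluation metric, the following is a minimal sketch of a weighted average F1-score for multi-label predictions. It assumes that "sample-weighted" means each label's F1 is weighted by its number of true instances (its support); the abstract does not spell out the exact definition. The function names, label names, and toy reports are hypothetical and are not taken from the study.

```python
from typing import List, Set


def per_label_f1(y_true: List[Set[str]], y_pred: List[Set[str]], label: str) -> float:
    """F1 for one label, treating it as a binary decision per report."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if label in t and label in p)
    fp = sum(1 for t, p in pairs if label not in t and label in p)
    fn = sum(1 for t, p in pairs if label in t and label not in p)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def weighted_f1(y_true: List[Set[str]], y_pred: List[Set[str]]) -> float:
    """Average of per-label F1, weighted by each label's support in y_true."""
    labels = sorted(set().union(*y_true))
    support = {lab: sum(1 for t in y_true if lab in t) for lab in labels}
    total = sum(support.values())
    if total == 0:
        return 0.0
    return sum(support[lab] * per_label_f1(y_true, y_pred, lab) for lab in labels) / total


# Hypothetical toy data: one set of finding labels per report.
y_true = [{"hemorrhage"}, {"hemorrhage", "fracture"}, {"fracture"}, set()]
y_pred = [{"hemorrhage"}, {"hemorrhage"}, {"fracture", "hemorrhage"}, set()]

score = weighted_f1(y_true, y_pred)
print(round(score, 4))
```

Under this assumed definition, frequent labels dominate the average, so a model that performs well on common findings can score highly even when rare findings are classified poorly.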
