Automated Extraction of BI-RADS Final Assessment Categories from Radiology Reports with Natural Language Processing

The objective of this study is to evaluate a natural language processing (NLP) algorithm that determines American College of Radiology Breast Imaging Reporting and Data System (BI-RADS) final assessment categories from radiology reports. This HIPAA-compliant study was granted institutional review board approval with waiver of informed consent. This cross-sectional study involved 1,165 breast imaging reports in the electronic medical record (EMR) from a tertiary care academic breast imaging center from 2009. Reports included screening mammography, diagnostic mammography, breast ultrasound, combined diagnostic mammography and breast ultrasound, and breast magnetic resonance imaging studies. Over 220 reports were included from each study type. The recall (sensitivity) and precision (positive predictive value) of a NLP algorithm to collect BI-RADS final assessment categories stated in the report final text was evaluated against a manual human review standard reference. For all breast imaging reports, the NLP algorithm demonstrated a recall of 100.0 % (95 % confidence interval (CI), 99.7, 100.0 %) and a precision of 96.6 % (95 % CI, 95.4, 97.5 %) for correct identification of BI-RADS final assessment categories. The NLP algorithm demonstrated high recall and precision for extraction of BI-RADS final assessment categories from the free text of breast imaging reports. NLP may provide an accurate, scalable data extraction mechanism from reports within EMRs to create databases to track breast imaging performance measures and facilitate optimal breast cancer population management strategies.

[1]  Bethany Percha,et al.  Automatic classification of mammography reports by BI-RADS breast tissue composition class , 2012, J. Am. Medical Informatics Assoc..

[2]  Carol Friedman,et al.  Facilitating Cancer Research using Natural Language Processing of Pathology Reports , 2004, MedInfo.

[3]  J. Austin,et al.  Use of natural language processing to translate clinical information from a database of 889,921 chest radiographic reports. , 2002, Radiology.

[4]  Małgorzata Marciniak,et al.  Rule-based information extraction from patients' clinical data , 2009, J. Biomed. Informatics.

[5]  Karen Brandt Baldwin,et al.  Evaluating Healthcare Quality Using Natural Language Processing , 2008, Journal for healthcare quality : official publication of the National Association for Healthcare Quality.

[6]  D. Vanel The American College of Radiology (ACR) Breast Imaging and Reporting Data System (BI-RADS): a step towards a universal radiological language? , 2007, European journal of radiology.

[7]  L. Liberman,et al.  Breast imaging reporting and data system (BI-RADS). , 2002, Radiologic clinics of North America.

[8]  Carol Friedman,et al.  Identification of findings suspicious for breast cancer based on natural language processing of mammogram reports , 1997, AMIA.

[9]  E. Sickles,et al.  Auditing your breast imaging practice: an evidence-based approach. , 2007, Seminars in roentgenology.

[10]  Christopher G. Chute,et al.  Automated discovery of drug treatment patterns for endocrine therapy of breast cancer within an electronic medical record , 2012, J. Am. Medical Informatics Assoc..

[11]  K. Kerlikowske,et al.  Breast Cancer Surveillance Consortium: a national mammography screening and outcomes database. , 1997, AJR. American journal of roentgenology.

[12]  James H Thrall,et al.  Application of Recently Developed Computer Algorithm for Automatic Classification of Unstructured Radiology Reports: Validation Study 1 , 2004 .

[13]  P. Shekelle,et al.  Systematic Review: Impact of Health Information Technology on Quality, Efficiency, and Costs of Medical Care , 2006, Annals of Internal Medicine.

[14]  Ramin Khorasani,et al.  Repeat abdominal imaging examinations in a tertiary care hospital. , 2012, The American journal of medicine.

[15]  Rob C. van Ommering,et al.  Automatically Correlating Clinical Findings and Body Locations in Radiology Reports Using MedLEE , 2012, Journal of Digital Imaging.

[16]  Guergana K. Savova,et al.  Discerning Tumor Status from Unstructured MRI Reports—Completeness of Information in Existing Reports and Utility of Automated Natural Language Processing , 2009, Journal of Digital Imaging.