Use of Natural Language Processing Algorithms to Identify Common Data Elements in Operative Notes for Total Hip Arthroplasty.

BACKGROUND: Manual chart review is labor-intensive and requires specialized knowledge possessed by highly trained medical professionals. Natural language processing (NLP) tools are distinctive in their ability to extract critical information from raw text in electronic health records (EHRs). As a proof of concept for the potential application of this technology, we examined the ability of NLP to correctly identify common elements described by surgeons in operative notes for total hip arthroplasty (THA).

METHODS: We evaluated primary THAs that had been performed at a single academic institution from 2000 to 2015. A training sample of operative reports was randomly selected to develop prototype NLP algorithms, and additional operative reports were randomly selected as the test sample. Three separate algorithms were created with rules aimed at capturing (1) the operative approach, (2) the fixation method, and (3) the bearing surface category. The algorithms were applied to operative notes to evaluate the language used by 29 different surgeons at our center and were applied to EHR data from outside facilities to determine external validity. Accuracy statistics were calculated with use of manual chart review as the gold standard.

RESULTS: The operative approach algorithm demonstrated an accuracy of 99.2% (95% confidence interval [CI], 97.1% to 99.9%). The fixation technique algorithm demonstrated an accuracy of 90.7% (95% CI, 86.8% to 93.8%). The bearing surface algorithm demonstrated an accuracy of 95.8% (95% CI, 92.7% to 97.8%). Additionally, the NLP algorithms applied to operative reports from other institutions yielded comparable performance, demonstrating external validity.

CONCLUSIONS: NLP-enabled algorithms are a promising alternative to the current gold standard of manual chart review for identifying common data elements from orthopaedic operative notes. The present study provides a proof of concept for use of NLP techniques in clinical research studies and registry-development endeavors to reliably extract data of interest in an expeditious and cost-effective manner.
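
As an illustration of the general technique described in the Methods (rule-based extraction scored against manual chart review), a minimal Python sketch follows. It is not the study's implementation. The keyword patterns in `APPROACH_RULES`, the function names, and the choice of a Wilson score interval are all assumptions made for this example; the abstract does not state the actual rules or the method used to compute its confidence intervals.

```python
import math
import re

# Hypothetical keyword rules for the operative approach. The real rules
# used in the study are not given in the abstract.
APPROACH_RULES = {
    "posterior": re.compile(r"\bposter(?:ior|olateral)\s+approach\b", re.I),
    "anterior": re.compile(r"\b(?:direct\s+)?anterior\s+approach\b", re.I),
    "lateral": re.compile(r"\b(?:antero)?lateral\s+approach\b", re.I),
}


def classify_approach(note_text):
    """Return the single approach label whose rule matches, else 'unknown'."""
    matches = [label for label, rule in APPROACH_RULES.items()
               if rule.search(note_text)]
    # Zero matches or conflicting matches are left for manual review.
    return matches[0] if len(matches) == 1 else "unknown"


def accuracy(predicted, gold):
    """Count agreements between algorithm output and chart-review labels."""
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct, len(gold)


def wilson_interval(correct, total, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (assumed method)."""
    p = correct / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    # Invented example notes and gold-standard labels, for illustration only.
    notes = [
        "A standard posterolateral approach was utilized.",
        "The hip was exposed through a direct anterior approach.",
        "An anterolateral approach to the hip was performed.",
        "The patient was positioned and draped in the usual fashion.",
    ]
    gold = ["posterior", "anterior", "lateral", "posterior"]
    predicted = [classify_approach(n) for n in notes]
    correct, total = accuracy(predicted, gold)
    low, high = wilson_interval(correct, total)
    print(predicted)
    print(f"accuracy {correct}/{total}, 95% CI {low:.3f} to {high:.3f}")
```

A note that matches no rule, or more than one, is labeled `unknown` rather than guessed, which mirrors the need to fall back on manual review when the surgeon's wording is not covered by the rules.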
