Multi-Center Colonoscopy Quality Measurement Utilizing Natural Language Processing

Background: An accurate system for tracking of colonoscopy quality and surveillance intervals could improve the effectiveness and cost-effectiveness of colorectal cancer (CRC) screening and surveillance. The purpose of this study was to create and test such a system across multiple institutions utilizing natural language processing (NLP).

Methods: From 42,569 colonoscopies with pathology records from 13 centers, we randomly sampled 750 paired reports. We trained (n=250) and tested (n=500) an NLP-based program with 19 measurements that encompass colonoscopy quality measures and surveillance interval determination, using blinded, paired, annotated expert manual review as the reference standard. The remaining 41,819 nonannotated documents were processed through the NLP system without manual review to assess performance consistency. The primary outcome was system accuracy across the 19 measures.

Results: A total of 176 (23.5%) documents with 252 (1.8%) discrepant content points resulted from paired annotation. Error rate within the 500 test documents was 31.2% for NLP and 25.4% for the paired annotators (P=0.001). At the content point level within the test set, the error rate was 3.5% for NLP and 1.9% for the paired annotators (P=0.04). When eight vaguely worded documents were removed, 125 of 492 (25.4%) were incorrect by NLP and 104 of 492 (21.1%) by the initial annotator (P=0.07). Rates of pathologic findings calculated from NLP were similar to those calculated by annotation for the majority of measurements. Test set accuracy was 99.6% for CRC, 95% for advanced adenoma, 94.6% for nonadvanced adenoma, 99.8% for advanced sessile serrated polyps, 99.2% for nonadvanced sessile serrated polyps, 96.8% for large hyperplastic polyps, and 96.0% for small hyperplastic polyps. Lesion location showed high accuracy (87.0–99.8%). Accuracy for number of adenomas was 92%.

Conclusions: NLP can accurately report adenoma detection rate and the components for determining guideline-adherent colonoscopy surveillance intervals across multiple sites that utilize different methods for reporting colonoscopy findings.
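The counts and percentages reported above can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch, not the study's analysis code: the helper `pct` and all variable names are hypothetical, and the content-point denominator assumes that each of the 750 sampled documents contributes one content point per measure (750 × 19), which the abstract does not state explicitly.

```python
def pct(numerator, denominator):
    """Return a percentage rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)

# Counts taken from the abstract.
TOTAL_REPORTS = 42_569   # colonoscopies with pathology records
SAMPLED = 750            # randomly sampled paired reports
TRAIN, TEST = 250, 500   # training and test documents
MEASURES = 19            # measurements per document

# Documents processed by NLP without manual review.
remaining = TOTAL_REPORTS - SAMPLED

# Share of sampled documents with annotator discrepancies.
discrepant_docs_pct = pct(176, SAMPLED)

# Assumption: content points = documents x measures (not stated in the source).
discrepant_points_pct = pct(252, SAMPLED * MEASURES)

# Document-level error rates after removing eight vaguely worded documents.
reduced_test = TEST - 8
nlp_error_pct = pct(125, reduced_test)
annotator_error_pct = pct(104, reduced_test)

print(remaining, discrepant_docs_pct, discrepant_points_pct)
print(reduced_test, nlp_error_pct, annotator_error_pct)
```

Under that assumption, the recomputed values agree with the reported 41,819 documents, 23.5%, 1.8%, 25.4%, and 21.1%.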
