Automated pancreatic cyst screening using natural language processing: a new tool in the early detection of pancreatic cancer.

INTRODUCTION As many as 3% of computed tomography (CT) scans detect pancreatic cysts. Because pancreatic cysts are incidental, ubiquitous and poorly understood, follow-up is often not performed. Pancreatic cysts may have a significant malignant potential and their identification represents a 'window of opportunity' for the early detection of pancreatic cancer. The purpose of this study was to implement an automated Natural Language Processing (NLP)-based pancreatic cyst identification system.

METHOD A multidisciplinary team was assembled. NLP-based identification algorithms were developed based on key words commonly used by physicians to describe pancreatic cysts and programmed for automated search of electronic medical records. A pilot study was conducted prospectively in a single institution.

RESULTS From March to September 2013, 566,233 reports belonging to 50,669 patients were analysed. The mean number of patients reported with a pancreatic cyst was 88/month (range 78-98). The mean sensitivity and specificity were 99.9% and 98.8%, respectively.

CONCLUSION NLP is an effective tool to automatically identify patients with pancreatic cysts based on electronic medical records (EMR). This highly accurate system can help capture patients 'at-risk' of pancreatic cancer in a registry.
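
To make the keyword-based approach and the reported accuracy measures concrete, the following is a minimal sketch, not the study's algorithm. The keyword patterns, the negation cues, the sample reports and the function names (`mentions_pancreatic_cyst`, `sensitivity_specificity`) are all assumptions introduced for illustration; the abstract does not list the actual terms or rules used.

```python
import re

# Hypothetical patterns: the study's real keyword list is not given in the abstract.
PANCREAS = re.compile(r"\bpancrea\w*", re.IGNORECASE)
CYST = re.compile(r"\b(?:(?:pseudo)?cyst\w*|ipmn)\b", re.IGNORECASE)
# Very crude negation cues (assumed); real systems use richer negation handling.
NEGATION = re.compile(r"\b(?:no|without|negative for)\b", re.IGNORECASE)


def mentions_pancreatic_cyst(report: str) -> bool:
    """Flag a report if any sentence names the pancreas and a cyst term without a negation cue."""
    for sentence in re.split(r"(?<=[.!?])\s+", report):
        if not (PANCREAS.search(sentence) and CYST.search(sentence)):
            continue
        if NEGATION.search(sentence):
            continue
        return True
    return False


def sensitivity_specificity(predicted, actual):
    """Return (sensitivity, specificity) from parallel lists of booleans."""
    pairs = list(zip(predicted, actual))
    tp = sum(p and a for p, a in pairs)          # flagged, truly positive
    fn = sum((not p) and a for p, a in pairs)    # missed positives
    tn = sum((not p) and (not a) for p, a in pairs)
    fp = sum(p and (not a) for p, a in pairs)    # flagged, truly negative
    return tp / (tp + fn), tn / (tn + fp)


# Invented example reports paired with a manually assigned reference label.
SAMPLE_REPORTS = [
    ("There is a 1.2 cm cystic lesion in the pancreatic head.", True),
    ("No pancreatic cyst is seen.", False),
    ("The liver contains a simple cyst. The pancreas is normal.", False),
    ("Pancreatic duct is normal; renal cyst noted.", False),
]

if __name__ == "__main__":
    predicted = [mentions_pancreatic_cyst(text) for text, _ in SAMPLE_REPORTS]
    actual = [label for _, label in SAMPLE_REPORTS]
    sens, spec = sensitivity_specificity(predicted, actual)
    print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

The last sample report is deliberately a false positive: a same-sentence keyword match cannot tell that the cyst is renal, which is why specificity needs to be measured against a manually reviewed reference, as the study does.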
