A novel approach for selecting combination clinical markers of pathology applied to a large retrospective cohort of surgically resected pancreatic cysts

Objective: Our objective was to develop an approach for selecting combinatorial markers of pathology from diverse clinical data types. We demonstrate this approach on the problem of pancreatic cyst classification.

Materials and Methods: We analyzed 1026 patients with surgically resected pancreatic cysts, comprising 584 intraductal papillary mucinous neoplasms, 332 serous cystadenomas, 78 mucinous cystic neoplasms, and 42 solid-pseudopapillary neoplasms. To derive optimal markers for cyst classification from the preoperative clinical and radiological data, we developed a statistical approach for combining any number of categorical, dichotomous, or continuous-valued clinical parameters into individual predictors of pathology. The approach is unbiased and statistically rigorous. Millions of feature combinations were tested using 10-fold cross-validation, and the most informative features were validated in an independent cohort of 130 patients with surgically resected pancreatic cysts.

Results: We identified combinatorial clinical markers that classified serous cystadenomas with 95% sensitivity and 83% specificity; solid-pseudopapillary neoplasms with 89% sensitivity and 86% specificity; mucinous cystic neoplasms with 91% sensitivity and 83% specificity; and intraductal papillary mucinous neoplasms with 94% sensitivity and 90% specificity. No individual feature was as accurate as the combination markers. When these combinatorial markers were applied to the independent cohort of 130 pancreatic cysts, they achieved high and well-balanced accuracies. Overall sensitivity and specificity for identifying patients requiring surgical resection were 84% and 81%, respectively.

Conclusions: Our approach identified combinatorial markers for pancreatic cyst classification that had improved performance relative to the individual features they comprise. In principle, this approach can be applied to any clinical dataset comprising dichotomous, categorical, and continuous-valued parameters.
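To make the idea of a combinatorial marker concrete, the following is a minimal Python sketch, not the authors' statistical method. It assumes a simple AND rule over dichotomized features and ranks combinations by the mean of sensitivity and specificity. The feature names (`age_ge_60`, `duct_communication`), the age threshold, and the eight-patient dataset are all hypothetical and synthetic. The 10-fold cross-validation described in the abstract is omitted for brevity.

```python
from itertools import combinations


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for boolean labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


def dichotomize(values, threshold):
    """Turn a continuous-valued parameter into a boolean feature."""
    return [v >= threshold for v in values]


def combine_all(*features):
    """Assumed combination rule: positive only if every feature is positive."""
    return [all(row) for row in zip(*features)]


def best_combination(features, y_true, max_size=2):
    """Exhaustively score feature subsets up to max_size.

    features: dict mapping feature name -> list of booleans.
    Returns (names, sensitivity, specificity) for the subset with the
    highest mean of sensitivity and specificity.
    """
    best = None
    for k in range(1, max_size + 1):
        for names in combinations(sorted(features), k):
            pred = combine_all(*(features[n] for n in names))
            sens, spec = sensitivity_specificity(y_true, pred)
            score = (sens + spec) / 2
            if best is None or score > best[0]:
                best = (score, names, sens, spec)
    return best[1], best[2], best[3]


# Synthetic toy data (hypothetical, for illustration only).
labels = [True, True, True, False, False, False, False, False]
ages = [72, 65, 61, 68, 70, 34, 45, 50]
features = {
    "age_ge_60": dichotomize(ages, 60),
    "duct_communication": [True, True, True, False, False, True, True, False],
}

single = sensitivity_specificity(labels, features["age_ge_60"])
names, sens, spec = best_combination(features, labels)
print("single feature:", single)
print("best combination:", names, sens, spec)
```

In this toy dataset each feature alone is sensitive but not very specific, while requiring both removes the false positives. That mirrors, in miniature, the abstract's observation that combination markers outperformed the individual features they comprise.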
