Prognostication and Risk Factors for Cystic Fibrosis via Automated Machine Learning

Accurate survival prediction for cystic fibrosis (CF) patients is instrumental in establishing the optimal timing for referring patients with terminal respiratory failure for lung transplantation (LT). Current practice considers referring patients for LT evaluation once the forced expiratory volume in one second (FEV1) drops below 30% of its predicted value. While FEV1 is indeed a strong predictor of CF-related mortality, we hypothesized that the survival behavior of CF patients is considerably more heterogeneous than this single criterion suggests. To this end, we developed an algorithmic framework, which we call AutoPrognosis, that uses machine learning to automate the construction of clinical prognostic models, and we used it to build a prognostic model for CF from a contemporary cohort covering 99% of the CF population in the UK. AutoPrognosis uses Bayesian optimization to automatically configure ensembles of machine learning pipelines, each of which comprises imputation, feature processing, classification, and calibration algorithms. Because the process is automated, clinical researchers can use it to build prognostic models without in-depth knowledge of machine learning. Our experiments showed that the model learned by AutoPrognosis is more accurate than existing guidelines and other competing models.
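
The four-stage pipeline structure described above can be pictured as a search over one choice per stage. The following is a minimal sketch under stated assumptions, not the AutoPrognosis implementation: the algorithm names listed for each stage are hypothetical placeholders (the abstract names the stages, not the algorithms), plain random search stands in for Bayesian optimization, `toy_score` is a made-up stand-in for a real validation metric, and the sketch selects a single pipeline rather than an ensemble.

```python
import itertools
import random

# Hypothetical options per stage. The abstract names the four stages only;
# the algorithm names below are illustrative assumptions.
STAGES = {
    "imputation": ["mean", "median", "missforest"],
    "feature_processing": ["none", "pca", "feature_selection"],
    "classification": ["logistic_regression", "random_forest", "gradient_boosting"],
    "calibration": ["none", "sigmoid", "isotonic"],
}


def all_pipelines(stages=STAGES):
    """Yield every pipeline configuration: one algorithm choice per stage."""
    names = list(stages)
    for combo in itertools.product(*(stages[name] for name in names)):
        yield dict(zip(names, combo))


def random_search(score_fn, n_trials=20, seed=0, stages=STAGES):
    """Sample configurations at random and keep the highest-scoring one.

    Random search is a simple stand-in here; it is NOT Bayesian optimization,
    which would use earlier scores to decide which configuration to try next.
    """
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = {name: rng.choice(options) for name, options in stages.items()}
        score = score_fn(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_score(config):
    """Placeholder objective (assumption): a deterministic number per config.

    In practice this would train the configured pipeline and return a
    cross-validated performance metric.
    """
    return sum(len(choice) for choice in config.values()) / 100.0
```

Calling `random_search(toy_score, n_trials=10, seed=1)` returns a configuration dictionary and its score. Replacing `toy_score` with a function that trains and validates the configured pipeline would turn the sketch into a basic, non-Bayesian configuration search.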
