Machine Learning Outperforms ACC/AHA CVD Risk Calculator in MESA

Background Studies have demonstrated that the current US guidelines based on the American College of Cardiology/American Heart Association (ACC/AHA) Pooled Cohort Equations Risk Calculator may underestimate the risk of atherosclerotic cardiovascular disease (CVD) in certain high‐risk individuals, thereby missing opportunities for intensive therapy and for preventing CVD events. Similarly, the guidelines may overestimate risk in low‐risk populations, resulting in unnecessary statin therapy. We used Machine Learning (ML) to tackle this problem.

Methods and Results We developed an ML Risk Calculator based on Support Vector Machines (SVMs) using a 13‐year follow‐up data set from MESA (the Multi‐Ethnic Study of Atherosclerosis) of 6459 participants who were free of atherosclerotic CVD at baseline. We provided identical input to both risk calculators and compared their performance. We then used the FLEMENGHO study (the Flemish Study of Environment, Genes and Health Outcomes) to validate the model in an external cohort. The ACC/AHA Risk Calculator, based on a 7.5% 10‐year risk threshold, recommended statin therapy to 46.0% of participants. Despite this high proportion, 23.8% of the 480 “Hard CVD” events occurred in those not recommended a statin, resulting in sensitivity 0.76, specificity 0.56, and AUC 0.71. In contrast, the ML Risk Calculator recommended statin therapy to only 11.4%, and only 14.4% of “Hard CVD” events occurred in those not recommended a statin, resulting in sensitivity 0.86, specificity 0.95, and AUC 0.92. Similar results were found for prediction of “All CVD” events.

Conclusions The ML Risk Calculator outperformed the ACC/AHA Risk Calculator by recommending less drug therapy while missing fewer events. Additional studies are underway to validate the ML model in other cohorts and to explore its ability in short‐term CVD risk prediction.
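The reported sensitivity and specificity can be cross‐checked against the other summary figures in the abstract. The sketch below is only an illustrative consistency check, not the authors' analysis code. It assumes that the 480 “Hard CVD” events all fall within the 6459 participants and that the statin percentages refer to that same cohort. The function name `rates_from_summary` and its parameters are hypothetical.

```python
def rates_from_summary(n_total, n_events, frac_recommended, frac_events_missed):
    """Approximate sensitivity and specificity from summary figures.

    Assumes (not stated explicitly in the abstract) that all events and
    all statin recommendations refer to the same n_total participants.
    """
    n_recommended = frac_recommended * n_total            # told to take a statin
    true_pos = (1.0 - frac_events_missed) * n_events      # events among those recommended
    false_pos = n_recommended - true_pos                  # recommended, but no event
    n_non_events = n_total - n_events
    sensitivity = true_pos / n_events
    specificity = 1.0 - false_pos / n_non_events
    return sensitivity, specificity


N_TOTAL, N_EVENTS = 6459, 480  # cohort size and "Hard CVD" events from the abstract

acc_aha = rates_from_summary(N_TOTAL, N_EVENTS, 0.460, 0.238)
ml = rates_from_summary(N_TOTAL, N_EVENTS, 0.114, 0.144)

for name, (sens, spec) in (("ACC/AHA", acc_aha), ("ML", ml)):
    print(f"{name}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Under these assumptions the computed values round to the figures reported above (0.76 and 0.56 for ACC/AHA, 0.86 and 0.95 for ML). AUC cannot be recovered this way, because it depends on the full distribution of risk scores rather than on a single threshold.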
