Understanding risk factors in cardiac rehabilitation patients with random forests and decision trees

Cardiac rehabilitation is a well-recognised non-pharmacological intervention recommended for the prevention of cardiovascular disease. Numerous studies have collected large amounts of data to examine its effects in patient groups. In this paper, datasets collected over a 10-year period by one Australian hospital are analysed using decision trees to derive prediction rules for the outcome of phase II cardiac rehabilitation. The analysis predicts the outcome of the cardiac rehabilitation program in terms of three groups of cardiovascular risk factors: physiological, performance and psychosocial. Random forests are used for feature selection to make the models compact and interpretable. Balanced sampling is used to deal with the heavily imbalanced class distribution. Experimental results show that the outcome of phase II cardiac rehabilitation can be predicted from initial readings: cholesterol level and hypertension for the physiological risk factors, the level achieved in the six-minute walk test for the performance risk factors, and the Hospital Anxiety and Depression Scale (HADS) anxiety and depression scores for the psychosocial risk factors. This will allow high-risk patient groups to be identified and personalised cardiac rehabilitation programs to be developed for those patients, increasing their chances of success and minimising their risk of failure.
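The abstract does not say which balanced sampling scheme was used. As a minimal sketch, assuming random undersampling of the larger class, the idea can be illustrated as follows. The function name `balanced_undersample`, the toy records and the outcome labels are hypothetical and are not taken from the study data.

```python
import random
from collections import defaultdict


def balanced_undersample(rows, label_of, seed=0):
    """Return a class-balanced subset by randomly undersampling larger classes.

    Assumption: undersampling is one possible form of balanced sampling;
    the paper's abstract does not specify the scheme actually used.
    """
    rng = random.Random(seed)

    # Group the records by their class label.
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)

    # Every class is cut down to the size of the smallest class.
    n_min = min(len(group) for group in by_class.values())

    sample = []
    for label in sorted(by_class):
        sample.extend(rng.sample(by_class[label], n_min))

    # Shuffle so the classes are not stored in contiguous blocks.
    rng.shuffle(sample)
    return sample


# Hypothetical toy records: (six_minute_walk_metres, outcome).
records = [(300 + i, "success") for i in range(90)]
records += [(200 + i, "failure") for i in range(10)]

balanced = balanced_undersample(records, label_of=lambda r: r[1])
counts = {
    label: sum(1 for r in balanced if r[1] == label)
    for label in ("success", "failure")
}
print(counts)
```

A decision tree trained on such a balanced subset sees both outcomes equally often, so it is less likely to predict the majority class for every patient. The trade-off is that records from the larger class are discarded.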
