10-year CVD risk prediction and minimization via inverse classification

Cardiovascular disease (CVD) remains the leading cause of death worldwide. Over the past decades, many preventive strategies have been recommended to reduce CVD risk. However, current CVD risk prediction schemes do not produce personalized, optimized recommendations. The goal of this study was to better identify individuals at high risk of a CVD event and to recommend an optimal set of risk factor changes that could reduce the risk of long-term CVD events. We identified 100 demographic, laboratory, lifestyle, and medication variables for 12,907 individuals who participated in the Atherosclerosis Risk in Communities (ARIC) study and had no CVD events at baseline. We examined the prognostic performance of each feature in isolation and ranked the features by mutual information. We then combined the features to build predictive models, using k-nearest neighbor prediction to estimate the 10-year CVD risk for each individual. Our feature ranking agreed with the traditional risk factors identified by a domain expert, and our approach identified high-risk cases as well as traditional methods did. Finally, we applied inverse classification to find personalized optimal changes that reduce 10-year CVD risk, and created for each individual a personalized package of five optimal changes. This approach can be applied to risk prediction and personalized recommendations for other chronic diseases, and may help both health care providers and patients make personalized health care decisions.
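The three steps named in the abstract (ranking features by mutual information, estimating risk with k-nearest neighbors, and searching for risk-lowering changes) can be illustrated with a minimal sketch. This is not the study's implementation: the toy data, the feature names (`smoker`, `high_bp`, `age_band`), the choice of k, and the greedy one-feature-at-a-time search are all assumptions made for illustration, and the greedy search is only a simple stand-in for the inverse classification optimization.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def knn_risk(query, X, y, k=3):
    """Risk estimate: fraction of the k nearest training rows with an event."""
    order = sorted(range(len(X)), key=lambda i: math.dist(query, X[i]))
    return sum(y[i] for i in order[:k]) / k

def greedy_changes(query, X, y, candidates, max_changes=5, k=3):
    """Greedily apply the single-feature change that lowers the kNN risk most.

    `candidates` maps a changeable feature index to the values it may take.
    Stops after `max_changes` changes or when no change lowers the risk.
    """
    current = list(query)
    risk = knn_risk(current, X, y, k)
    changes, remaining = [], dict(candidates)
    while remaining and len(changes) < max_changes:
        best = None
        for j, values in remaining.items():
            for v in values:
                trial = current[:j] + [v] + current[j + 1:]
                r = knn_risk(trial, X, y, k)
                if r < risk and (best is None or r < best[0]):
                    best = (r, j, v)
        if best is None:
            break
        risk, j, v = best
        current[j] = v
        changes.append((j, v))
        del remaining[j]
    return changes, risk

# Hypothetical toy cohort: columns are [smoker, high_bp, age_band].
X = [[1, 1, 2], [1, 1, 1], [1, 0, 2], [0, 1, 2], [1, 1, 0],
     [0, 0, 0], [0, 0, 1], [0, 0, 2], [0, 1, 0], [1, 0, 0]]
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]  # 1 = CVD event within follow-up

# Step 1: rank features by mutual information with the outcome.
names = ["smoker", "high_bp", "age_band"]
scores = {name: mutual_information([row[j] for row in X], y)
          for j, name in enumerate(names)}

# Steps 2 and 3: estimate risk, then search over modifiable features only
# (age_band is treated as unchangeable, so it is not a candidate).
patient = [1, 1, 2]
before = knn_risk(patient, X, y)
changes, after = greedy_changes(patient, X, y, candidates={0: [0], 1: [0]})
print(scores, before, changes, after)
```

Restricting `candidates` to modifiable features reflects the idea that recommendations should only involve changes an individual can actually make; a realistic version would also need feature scaling, per-change costs, and a held-out evaluation.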
