Use of Hundreds of Electrocardiographic Biomarkers for Prediction of Mortality in Postmenopausal Women: The Women's Health Initiative

Background: The simultaneous contribution of hundreds of electrocardiographic (ECG) biomarkers to the prediction of long-term mortality in postmenopausal women with clinically normal resting ECGs is unknown.

Methods and Results: We analyzed ECGs and all-cause mortality in 33 144 women enrolled in the Women's Health Initiative trials who had no cardiovascular disease or cancer at baseline and whose ECGs were normal by Minnesota and Novacode criteria. Four hundred and seventy-seven ECG biomarkers, encompassing global and individual ECG findings, were measured with computer algorithms. During a median follow-up of 8.1 years (range for survivors, 0.5 to 11.2 years), 1229 women died. For analyses, the cohort was randomly split into derivation (n=22 096; deaths, 819) and validation (n=11 048; deaths, 410) subsets. ECG biomarkers and demographic and clinical characteristics were analyzed simultaneously using both traditional Cox regression and random survival forest, a novel algorithmic machine-learning approach. Regression modeling failed to converge. Random survival forest variable selection yielded 20 variables that were independently predictive of long-term mortality, 14 of which were ECG biomarkers related to autonomic tone, atrial conduction, and ventricular depolarization and repolarization.

Conclusions: Using a novel random forest variable selection methodology, we identified 14 ECG biomarkers from among hundreds that were associated with long-term prognosis. These biomarkers were related to autonomic tone, atrial conduction, ventricular depolarization, and ventricular repolarization. Quantitative ECG biomarkers have prognostic importance and may be markers of subclinical disease in apparently healthy postmenopausal women.
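The random derivation/validation split described above can be illustrated with a minimal sketch. The abstract reports only the subset sizes; those sizes correspond to a two-thirds/one-third partition of 33 144 participants, which is what the sketch assumes. The function name `split_cohort`, the integer IDs, and the seed are hypothetical and are not taken from the study.

```python
import random


def split_cohort(participant_ids, num=2, den=3, seed=0):
    """Randomly partition IDs into derivation and validation subsets.

    num/den is the derivation fraction (2/3 is assumed here because it
    reproduces the subset sizes reported in the abstract).
    """
    ids = list(participant_ids)
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    rng.shuffle(ids)
    n_derivation = len(ids) * num // den  # integer arithmetic avoids float rounding
    return ids[:n_derivation], ids[n_derivation:]


# Placeholder IDs 0..33143 stand in for the 33 144 participants.
derivation, validation = split_cohort(range(33144))
print(len(derivation), len(validation))  # 22096 11048
```

Every participant lands in exactly one subset, so a model fitted on the derivation subset can be evaluated on women it has never seen.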
