Performance of Current Risk Stratification Models for Predicting Mortality in Patients with Heart Failure: A Systematic Review and Meta-Analysis.

BACKGROUND Several risk scores have been designed to predict mortality in patients with heart failure (HF). AIM To assess the performance of risk scores validated for mortality prediction in patients with acute HF (AHF) and chronic HF. METHODS MEDLINE and Scopus were searched from January 2015 to January 2021 for studies that internally or externally validated risk models for predicting all-cause mortality in patients with AHF and chronic HF. Discrimination was analyzed using C-statistics, which were pooled using a generic inverse-variance random-effects model. RESULTS Nineteen studies (n = 494,156 patients; AHF: 24,762; chronic HF mid-term mortality: 62,000; chronic HF long-term mortality: 452,097) and 11 risk scores were included. Overall, discrimination of risk scores was good across the three subgroups: AHF mortality (C-statistic: 0.76 [0.68-0.83]), chronic HF mid-term mortality (1 year; C-statistic: 0.74 [0.68-0.79]) and chronic HF long-term mortality (≥2 years; C-statistic: 0.71 [0.69-0.73]). MEESSI-AHF (C-statistic: 0.81 [0.80-0.83]) and MARKER-HF (C-statistic: 0.85 [0.80-0.89]) had excellent discrimination for AHF and chronic HF mid-term mortality, respectively, whereas MECKI had good discrimination (C-statistic: 0.78 [0.73-0.83]) for chronic HF long-term mortality relative to other models. Overall, risk scores predicting short-term mortality in patients with AHF showed no evidence of poor calibration (Hosmer-Lemeshow p > 0.05). However, risk models predicting mid-term and long-term mortality in patients with chronic HF varied in calibration performance. CONCLUSIONS The majority of recently validated risk scores showed good discrimination for mortality in patients with HF. MEESSI-AHF demonstrated excellent discrimination in patients with AHF, and MARKER-HF and MECKI displayed excellent and good discrimination, respectively, in patients with chronic HF. However, modest reporting of calibration and a lack of head-to-head comparisons in the same populations warrant future studies.
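
As an illustration of the pooling step described in METHODS, the following is a minimal sketch of generic inverse-variance random-effects pooling of C-statistics. The abstract does not state which between-study variance estimator was used; the DerSimonian-Laird estimator here is an assumption, as is pooling on the raw (untransformed) C-statistic scale and recovering standard errors from 95% confidence intervals. The function names and the input values in the usage example are hypothetical and are not data from the review.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Approximate a standard error from a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * z)


def pool_random_effects(estimates, std_errors, z=1.96):
    """Pool estimates with inverse-variance weights and a random-effects model.

    Assumption: between-study variance (tau^2) is estimated with the
    DerSimonian-Laird method; the source does not name an estimator.
    Returns (pooled estimate, CI lower bound, CI upper bound, tau^2).
    """
    k = len(estimates)
    # Fixed-effect inverse-variance weights.
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w

    # Cochran's Q and the DerSimonian-Laird tau^2 (truncated at zero).
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum_w_star
    se_pooled = math.sqrt(1.0 / sum_w_star)
    return pooled, pooled - z * se_pooled, pooled + z * se_pooled, tau2


if __name__ == "__main__":
    # Hypothetical C-statistics with 95% CIs from three validation studies.
    studies = [(0.72, 0.68, 0.76), (0.80, 0.77, 0.83), (0.75, 0.70, 0.80)]
    c_stats = [c for c, _, _ in studies]
    ses = [se_from_ci(lo, hi) for _, lo, hi in studies]
    pooled, lo, hi, tau2 = pool_random_effects(c_stats, ses)
    print(f"Pooled C-statistic: {pooled:.2f} [{lo:.2f}-{hi:.2f}], tau^2 = {tau2:.4f}")
```

C-statistics are bounded between 0 and 1, so pooling is sometimes done on a logit scale and back-transformed; the raw-scale version above is kept only for simplicity.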