Latent class regression improves the predictive acuity and clinical utility of survival prognostication amongst chronic heart failure patients

This study compared the predictive acuity of latent class regression (LCR) modelling with that of (i) standard generalised linear modelling (GLM) and (ii) GLMs that include membership of subgroups/classes, identified through prior latent class analysis (LCA), as alternative or additional candidate predictors. Using real-world demographic and clinical data from 1,802 heart failure patients enrolled in the UK-HEART2 cohort, the study found that univariable GLMs using LCA-generated subgroup/class membership as the sole candidate predictor of survival were inferior to standard multivariable GLMs using the same four covariates as those used in the LCA. Including LCA subgroup/class membership alongside these four covariates as candidate predictors in a multivariable GLM did not improve predictive acuity. In contrast, LCR modelling improved predictive acuity by 10-14% and provided a range of alternative models. From these, predictive acuity can be balanced against entropy to select models that are optimally suited to allocating clinical resources efficiently according to the differential risk of the outcome (in this instance, survival). These findings provide proof of principle that LCR modelling can improve on the predictive acuity of GLMs and enhance the clinical utility of their predictions. These improvements warrant further exploration, including of alternative techniques (such as machine learning algorithms) that can also generate latent class structure while determining outcome predictions, particularly for use with large, routinely collected clinical datasets and with binary, count and continuous variables.
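The trade-off between predictive acuity and entropy described above can be illustrated with a minimal sketch. It rests on two assumptions that the abstract does not state: that entropy is summarised by the normalised (relative) entropy of each patient's posterior class membership probabilities, and that predictive acuity is summarised by an area under the ROC curve for a binary outcome. The function names `relative_entropy` and `auc` are hypothetical, and the inputs in the usage note are made-up toy values, not study data.

```python
import math
from typing import Sequence


def relative_entropy(posteriors: Sequence[Sequence[float]]) -> float:
    """Normalised entropy of posterior class probabilities, in [0, 1].

    Each row holds one patient's class membership probabilities.
    1.0 means every patient is assigned to a class with certainty;
    0.0 means membership is spread evenly across all classes.
    (Assumed definition; the abstract does not give a formula.)
    """
    n = len(posteriors)
    k = len(posteriors[0])
    if k < 2:
        raise ValueError("at least two classes are required")
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:  # treat 0 * log(0) as 0
                total -= p * math.log(p)
    return 1.0 - total / (n * math.log(k))


def auc(scores: Sequence[float], events: Sequence[int]) -> float:
    """Rank-based area under the ROC curve; ties count as one half.

    Assumed stand-in for 'predictive acuity' with a binary outcome.
    """
    pos = [s for s, e in zip(scores, events) if e]
    neg = [s for s, e in zip(scores, events) if not e]
    if not pos or not neg:
        raise ValueError("both outcome groups must be present")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))
```

For example, `relative_entropy([[0.9, 0.1], [0.2, 0.8]])` scores how cleanly a two-class model separates two patients, and `auc(predicted_risk, died)` scores how well the same model ranks patients by outcome. Computing both for each candidate model gives the pair of values that would be weighed against each other when choosing among models.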
