Predicting increasing glycaemic burden (GB) using logistic regression models derived from routinely collected clinical data

Once glycaemic control deteriorates, it is hard to regain, and patients are at risk of complications. We developed predictive models to identify patients whose treatment was likely to fail. A total of 5027 patients with diabetes were followed over six years. GB was calculated using 92 000 HbA1c measurements. Logistic regression models with 10-fold cross-validation were derived with data from 2500 randomly selected patients and tested on the remaining 2527. Performance was assessed using the area under the receiver operating characteristic curve (AUROC). Complete follow-up data were available for 91.8% of patients. A model to predict likelihood of death was derived (AUROC 0.84, SE 0.01), and its output was included as an input to the GB models. Models for positive GB were derived for thresholds equivalent to average HbA1c of 7.0%, 8.0% and 9.0%. All three models performed well on test data, with AUROCs of 0.83 (SE 0.01), 0.89 (SE 0.008) and 0.90 (SE 0.01), respectively. The model to predict very poor control (HbA1c >9.0%) required the fewest inputs and identified patients with deteriorating control with a sensitivity of 84.3% and a specificity of 84.2%. Likely treatment failures can be identified using routine clinical data. Use of statistical models could improve targeting of intensive treatment and reduce the burden associated with treatment failure.
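The performance measures reported above (AUROC, its standard error, sensitivity and specificity) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the toy data are hypothetical. The abstract does not say how the standard errors were obtained, so the Hanley-McNeil approximation used here is an assumption.

```python
from math import sqrt


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auroc_se(auc, n_pos, n_neg):
    """Hanley-McNeil approximation to the standard error of an AUROC
    (assumed here; the abstract does not name its SE method)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return sqrt(variance)


def sensitivity_specificity(labels, scores, threshold):
    """Classify a case as positive when its score >= threshold, then
    return (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Made-up example: 1 = positive GB, scores = predicted probabilities.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1]

toy_auc = auroc(toy_labels, toy_scores)            # 0.875
toy_se = auroc_se(toy_auc, n_pos=4, n_neg=4)
toy_sens, toy_spec = sensitivity_specificity(toy_labels, toy_scores, 0.5)  # (0.75, 0.75)
```

In practice the scores would be the predicted probabilities from a logistic regression model fitted on the derivation set and applied to the held-out test set. The cut-off of 0.5 is arbitrary; a different threshold trades sensitivity against specificity.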