BACKGROUND
Various statistical methods are commonly used to assess the accuracy of near-continuous glucose sensors. The performance and reliability of these methods have not been well described.
METHODS
We used computer simulation to describe the behavior of several statistical measures, including error grid analysis, receiver operating characteristic (ROC) analysis, correlation, and repeated-measures analysis, under varying conditions. Actual data from an inpatient accuracy study conducted by the Diabetes Research in Children Network (DirecNet) were also used to demonstrate the limitations of these measures.
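As a rough illustration of this kind of simulation, and not the authors' actual code, the sketch below generates sensor readings with the same error distribution over two different reference glucose ranges and computes the Pearson correlation for each. The function names, the uniform glucose distribution, the ranges, and the 20 mg/dL error standard deviation are all assumptions chosen for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_correlation(low, high, error_sd=20.0, n=2000, seed=0):
    """Correlation between reference and simulated sensor glucose (mg/dL).

    Assumption: reference values are uniform on [low, high] and the sensor
    adds independent Gaussian error with the same SD in every scenario,
    so the simulated accuracy is identical across scenarios.
    """
    rng = random.Random(seed)
    reference = [rng.uniform(low, high) for _ in range(n)]
    sensor = [g + rng.gauss(0.0, error_sd) for g in reference]
    return pearson(reference, sensor)


# Same sensor error, different spread of reference values.
narrow = simulate_correlation(80, 140)   # mostly normoglycemic range
wide = simulate_correlation(40, 400)     # wide glycemic excursions
print(f"narrow range r = {narrow:.2f}, wide range r = {wide:.2f}")
```

Under these assumptions the expected correlation is sqrt(var_ref / (var_ref + var_err)), so it rises with the spread of the reference values even though the sensor error is unchanged.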
RESULTS
When sensors were made artificially inaccurate by randomly shuffling the pairings of sensor and reference values, the shuffled pairs still fell in Zone A or B 78% of the time for the Clarke grid and 79% of the time for the modified grid. For these shuffled pairs, the area under the receiver operating characteristic curve averaged 64% for hypoglycemia and 68% for hyperglycemia. Continuous error grid analysis designated 75% of shuffled pairs as "Accurate Readings" or "Benign Errors." Correlation analysis gave inconsistent results for sensors simulated to have identical accuracies, with correlation coefficients ranging from 0.50 to 0.96. Simplistic repeated-measures analyses that accounted for subject effects but ignored temporal correlation patterns substantially inflated the probability of falsely obtaining a statistically significant result. In simulations where the null hypothesis was true, 23% of observed P values were <0.05 and 12% of observed P values were <0.01.
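The inflation of false-positive rates when temporal correlation is ignored can be reproduced in a much simpler setting than the one studied here. The sketch below is not the authors' repeated-measures simulation: it assumes a single series of sensor-minus-reference differences following a first-order autoregressive, AR(1), process with true mean zero, and applies a one-sample t test that treats the observations as independent. The AR(1) coefficient, series length, and all function names are illustrative assumptions.

```python
import math
import random

# Approximate two-sided 5% critical value for Student's t with 49 df.
T_CRIT = 2.01


def ar1_series(n, phi, rng):
    """Stationary AR(1) series with mean 0 and unit marginal variance."""
    innovation_sd = math.sqrt(1.0 - phi * phi)
    x = rng.gauss(0.0, 1.0)
    series = [x]
    for _ in range(n - 1):
        x = phi * x + rng.gauss(0.0, innovation_sd)
        series.append(x)
    return series


def naive_t_statistic(values):
    """One-sample t statistic that wrongly assumes independent observations."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean / math.sqrt(variance / n)


def false_positive_rate(phi, n=50, trials=2000, seed=1):
    """Fraction of null simulations in which the naive test rejects at 5%."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        if abs(naive_t_statistic(ar1_series(n, phi, rng))) > T_CRIT:
            rejections += 1
    return rejections / trials


rate_independent = false_positive_rate(phi=0.0)  # no temporal correlation
rate_correlated = false_positive_rate(phi=0.5)   # moderate autocorrelation
print(f"independent: {rate_independent:.3f}, correlated: {rate_correlated:.3f}")
```

With independent observations the rejection rate stays near the nominal 5%, whereas positive autocorrelation makes the naive standard error too small and pushes the rate well above it.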
CONCLUSION
Commonly used statistical methods can give overly optimistic and/or inconsistent notions of sensor accuracy if results are not placed in proper context. Novel techniques are needed to assess the accuracy of near-continuous glucose sensors.