A Data-Driven Approach to Classifying Daily Continuous Glucose Monitoring (CGM) Time Series

According to the World Health Organization, about 422 million people worldwide have type 1 or type 2 diabetes (T1D, T2D), with the latter accounting for 90–95% of cases. Safe and effective treatment of patients with diabetes requires accurate and frequent monitoring of their blood sugar levels. Continuous glucose monitoring (CGM) is a monitoring technology developed to address this need; its use among U.S. T1D patients rose from 6% in 2011 to 38% in 2018, and adoption continues to grow worldwide among both T1D and T2D patients. This paper presents a data-driven approach to determine <inline-formula><tex-math notation="LaTeX">$\Omega$</tex-math></inline-formula>, a finite set of representative daily profiles (motifs) such that <italic>almost any</italic> daily CGM profile generated by a patient can be matched to one of the motifs in <inline-formula><tex-math notation="LaTeX">$\Omega$</tex-math></inline-formula>. A training data set (9741 profiles) was used to identify 8 candidate sets of motifs, and a validation data set (14 175 profiles) was used to select the final set <inline-formula><tex-math notation="LaTeX">$\Omega$</tex-math></inline-formula>. The robustness of <inline-formula><tex-math notation="LaTeX">$\Omega$</tex-math></inline-formula> was established by using it to classify a separate testing data set of 42 595 daily CGM profiles: 99.0% of them were successfully matched against a representative daily profile in <inline-formula><tex-math notation="LaTeX">$\Omega$</tex-math></inline-formula>. All data sets contained daily CGM profiles from six studies involving T1D and T2D patients using a variety of treatment modes, including daily insulin injections, insulin pumps, and artificial pancreas (AP) systems. The classified profiles can be used in predictive modeling, decision support, and automated control systems (e.g., AP).
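
To make the notion of matching a daily profile against a set of motifs concrete, the following minimal sketch assigns each profile to its closest motif and reports the fraction of profiles that could be matched. It is an illustration only, not the method of this paper: the abstract does not specify the similarity measure or acceptance rule, so the root-mean-square error (RMSE) distance, the `max_rmse` threshold, the 288 samples per day (5-minute sampling), the function names, and the toy motifs are all assumptions introduced here.

```python
import math

def rmse(a, b):
    """Root-mean-square error between two equally sampled daily profiles (mg/dL)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def match_profile(profile, motifs, max_rmse):
    """Return the index of the closest motif, or None if none is within max_rmse.

    The RMSE distance and the threshold are illustrative assumptions,
    not the matching criterion used in the paper.
    """
    distances = [rmse(profile, m) for m in motifs]
    best = min(range(len(motifs)), key=distances.__getitem__)
    return best if distances[best] <= max_rmse else None

def coverage(profiles, motifs, max_rmse):
    """Fraction of profiles that can be matched to some motif."""
    matched = sum(match_profile(p, motifs, max_rmse) is not None for p in profiles)
    return matched / len(profiles)

# Toy data (hypothetical): 288 samples per day, i.e. one reading every 5 minutes.
N = 288
hours = [24 * i / N for i in range(N)]
motifs = [
    [120.0] * N,                                                     # flat day
    [120.0 + 60.0 * math.sin(2 * math.pi * h / 24) for h in hours],  # one broad excursion
]
profiles = [
    [g + 5.0 for g in motifs[0]],  # close to motif 0
    [g - 5.0 for g in motifs[1]],  # close to motif 1
    [300.0] * N,                   # far from both motifs
]

labels = [match_profile(p, motifs, max_rmse=20.0) for p in profiles]
cov = coverage(profiles, motifs, max_rmse=20.0)
print(labels, round(cov, 3))  # [0, 1, None] 0.667
```

In this toy setting, `coverage` plays the role of the reported classification rate: the third profile lies farther than `max_rmse` from every motif and therefore remains unmatched, analogous to the small share of testing profiles that could not be classified.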