The Utility of Shapelets for Analyzing Physical Activity of COPD Patients and Non-COPD Controls

Physical activity is an attractive endpoint for novel therapies in Chronic Obstructive Pulmonary Disease (COPD). However, a deep understanding of physical activity patterns in COPD and how they relate to disease severity is lacking. In this research, we study the physical activity patterns of 184 individuals with and without COPD from a single center in the COPDGene cohort. These subjects participated in a 3-week observational study in which they wore wrist-worn accelerometers that collected physical activity data. Our exploratory data analysis finds that using the whole range of activity data is insufficient for patient clustering. As an alternative, we use shapelets, which are small, local sub-sequences, to better capture the behaviors of patients in different groups. We develop a length-bound heuristic algorithm for choosing the subset of shapelets that yields the best clustering result. The study shows the potential of shapelets for helping providers assess the status of COPD patients.
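
To make the notion of a shapelet concrete, the following is a minimal sketch of how a series is commonly compared against a shapelet: the distance is the smallest length-normalized Euclidean distance between the z-normalized shapelet and any equally long, z-normalized window of the series. The abstract does not state which distance or normalization the study uses, so this definition, along with the function names `znorm` and `shapelet_distance`, is an assumption made for illustration only.

```python
import numpy as np


def znorm(x, eps=1e-8):
    """Z-normalize a 1-D sequence; a (near-)constant sequence maps to zeros."""
    x = np.asarray(x, dtype=float)
    std = x.std()
    if std < eps:
        return np.zeros_like(x)
    return (x - x.mean()) / std


def shapelet_distance(series, shapelet):
    """Smallest length-normalized Euclidean distance between the shapelet
    and any window of the same length in the series (assumed definition)."""
    series = np.asarray(series, dtype=float)
    s = znorm(shapelet)
    m = len(s)
    if m > len(series):
        raise ValueError("shapelet must not be longer than the series")
    best = np.inf
    # Slide a window of length m over the series and keep the best match.
    for start in range(len(series) - m + 1):
        window = znorm(series[start:start + m])
        d = np.sqrt(np.sum((window - s) ** 2) / m)
        best = min(best, d)
    return best


# Toy activity trace with a short burst, and a shapelet with the same shape.
trace = [0, 0, 1, 3, 1, 0, 0]
burst = [1, 3, 1]
print(shapelet_distance(trace, burst))
```

Under this assumed definition, each subject's series can be summarized by its distances to a small set of candidate shapelets, and those distance vectors can then be passed to a clustering method. How candidates are generated and how the length bound restricts them is specific to the study and is not reproduced here.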
