Co-Morbidity Exploration on Wearables Activity Data Using Unsupervised Pre-training and Multi-Task Learning

Physical activity and sleep play a major role in the prevention and management of many chronic conditions, but understanding their impact on those conditions is not trivial. Currently, such analyses rely on data from electronic health records (EHRs), sleep lab studies, and activity/sleep logs. The rapid increase in the popularity of wearable health devices provides a significant new data source, making it possible to track a user's lifestyle in real time through web interfaces, available to the consumer and potentially to their healthcare provider. However, at present there is a gap between lifestyle data (e.g., sleep, physical activity) and the clinical outcomes normally captured in EHRs. This gap is a critical barrier to using this new source of signal for healthcare decision making. Applying deep learning to wearables data provides a new opportunity to overcome this barrier. To address the unavailability of clinical data for a major fraction of subjects and the unrepresentative subject populations, we propose a novel unsupervised (task-agnostic) time-series representation learning technique called act2vec. act2vec learns useful features by taking into account the co-occurrence of activity levels along with the periodicity of human activity patterns. The learned representations are then exploited to boost the performance of disorder-specific supervised learning models. Furthermore, since many disorders are often related to each other, a phenomenon referred to as co-morbidity, we use a multi-task learning framework to exploit the shared structure of disorder-inducing lifestyle choices partially captured in the wearables data. Empirical evaluation using actigraphy data from 4,124 subjects shows that our proposed method performs and generalizes substantially better than conventional symbolic time-series representation methods and task-specific deep learning models.
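To make the idea of learning from co-occurring activity levels and periodicity more concrete, the following is a minimal sketch and not the act2vec implementation itself. It assumes, purely for illustration, that raw activity counts are binned into discrete levels and that training pairs are built from a local window plus one periodic offset, in the spirit of skip-gram style embedding methods. The helper names `discretize` and `context_pairs`, the bin edges, and the toy period of 5 samples are all hypothetical.

```python
from collections import Counter


def discretize(counts, bin_edges):
    """Map raw activity counts to integer activity levels (hypothetical binning)."""
    return [sum(1 for edge in bin_edges if c >= edge) for c in counts]


def context_pairs(levels, window=1, period=None):
    """Build (center, context) pairs from local neighbours and, optionally,
    from samples exactly one period away (an assumed stand-in for periodicity)."""
    pairs = []
    n = len(levels)
    for i, center in enumerate(levels):
        # Local co-occurrence: neighbours within the window.
        for j in range(max(0, i - window), min(n, i + window + 1)):
            if j != i:
                pairs.append((center, levels[j]))
        # Periodic co-occurrence: the same phase in the previous/next cycle.
        if period is not None:
            for j in (i - period, i + period):
                if 0 <= j < n:
                    pairs.append((center, levels[j]))
    return pairs


# Toy series with two repeating "cycles" of five samples each.
counts = [0, 5, 120, 900, 40, 0, 3, 150, 870, 30]
levels = discretize(counts, bin_edges=[10, 100, 500])
pairs = context_pairs(levels, window=1, period=5)
cooccurrence = Counter(pairs)
```

Pairs such as these could then be fed to any embedding learner; the co-occurrence counter is shown only to make the structure of the training signal visible. How the original method actually encodes activity levels and periodicity is not specified in this abstract.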
