Multi-Source Ensemble Learning for the Remote Prediction of Parkinson's Disease in the Presence of Source-Wise Missing Data

As the collection of mobile health data becomes pervasive, missing data can make large portions of datasets inaccessible for analysis. Missing data has proven particularly problematic for remotely diagnosing and monitoring Parkinson's disease (PD) using smartphones. This contribution presents multi-source ensemble learning, a methodology that combines dataset deconstruction with ensemble learning. It enables participants with incomplete data (i.e., where not all sensor data is available) to be included in the training of machine learning models, achieving a 100% participant retention rate. We demonstrate the proposed method on a cohort of 1513 participants, 91.2% of whom contributed incomplete data in tapping, gait, voice, and/or memory tests. The use of multi-source ensemble learning, alongside convolutional neural networks (CNNs) that capitalize on the amount of available data, increases PD classification accuracy from 73.1% to 82.0% compared with traditional techniques. The increase in accuracy is found to be caused partly by the use of multi-channel CNNs and partly by developing models using the large cohort of participants. Furthermore, through bootstrap sampling we reveal that feature selection is better performed on a large cohort of participants with incomplete data than on a small number of participants with complete data. The proposed method is applicable to a wide range of wearable/remote monitoring datasets that suffer from missing data, and it contributes to improving the ability to remotely monitor PD by revealing novel methods of accounting for symptom heterogeneity.
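
The general idea of deconstructing a dataset by source and recombining per-source models can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract describes CNNs as the per-source models, whereas this sketch swaps in a simple nearest-centroid classifier so that it stays self-contained. The function names (`fit_source_model`, `predict_source_proba`, `ensemble_predict`), the use of NaN rows to mark a missing source, and the averaging rule for combining sources are all assumptions made for illustration.

```python
import numpy as np


def fit_source_model(X, y):
    """Fit a nearest-centroid model using only participants who have this source.

    Assumption: a missing source is encoded as a row containing NaN.
    """
    present = ~np.isnan(X).any(axis=1)
    X, y = X[present], y[present]
    return {"c0": X[y == 0].mean(axis=0), "c1": X[y == 1].mean(axis=0)}


def predict_source_proba(model, X):
    """Return a score in [0, 1] for class 1 per row; NaN where the source is missing."""
    present = ~np.isnan(X).any(axis=1)
    proba = np.full(X.shape[0], np.nan)
    d0 = np.linalg.norm(X[present] - model["c0"], axis=1)
    d1 = np.linalg.norm(X[present] - model["c1"], axis=1)
    # Closer to the class-1 centroid gives a score nearer 1.
    proba[present] = d0 / (d0 + d1 + 1e-12)
    return proba


def ensemble_predict(models, sources):
    """Average per-source scores over whichever sources each participant has."""
    probas = np.column_stack(
        [predict_source_proba(models[name], X) for name, X in sources.items()]
    )
    available = ~np.isnan(probas)
    counts = available.sum(axis=1)
    if (counts == 0).any():
        raise ValueError("every participant needs at least one available source")
    return np.where(available, probas, 0.0).sum(axis=1) / counts


# Toy, hand-made data (not from the study): two sources, six participants.
y = np.array([0, 0, 0, 1, 1, 1])
nan = np.nan
sources = {
    "tapping": np.array([[0.0, 0.0], [0.2, 0.1], [nan, nan],
                         [1.0, 1.0], [0.9, 1.1], [nan, nan]]),
    "voice": np.array([[nan], [0.1], [0.0], [nan], [1.0], [0.9]]),
}

# One model per source, each trained on the participants who contributed it.
models = {name: fit_source_model(X, y) for name, X in sources.items()}
scores = ensemble_predict(models, sources)
labels = (scores >= 0.5).astype(int)
```

In this sketch no participant is discarded: those with only one source are still used to train that source's model and still receive a prediction, which mirrors the retention property described above. How the per-source outputs are actually fused in the paper is not stated in the abstract, so the simple average here should be read as a placeholder.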
