A data mining methodology for predicting early stage Parkinson's disease using non-invasive, high-dimensional gait sensor data

Parkinson's disease (PD) is the second most common neurological disorder after Alzheimer's disease. The key clinical features of PD are motor-related and are typically assessed by healthcare providers through qualitative visual inspection of a patient's movement, gait, and posture. More advanced diagnostic techniques, such as computed tomography scans that measure brain function, can be cost-prohibitive and may expose patients to radiation and other harmful effects. To mitigate these challenges and open a pathway to remote patient-physician assessment, the authors of this work propose a data mining-driven methodology that uses low-cost, non-invasive sensors to model and predict the presence (or lack thereof) of PD movement abnormalities and to model clinical subtypes. The study presented here evaluates the ability of non-invasive hardware and data mining algorithms to discriminate between PD cases and controls. A 10-fold cross-validation approach is used to compare several data mining algorithms and determine which one provides the most consistent results as the subject gait data are varied. The predictive accuracy of the resulting data mining model is then quantified by testing it against unseen data captured from a test pool of subjects. The proposed methodology demonstrates the feasibility of using non-invasive, low-cost hardware and data mining models to monitor the progression of gait features outside the traditional healthcare facility, which may ultimately lead to earlier diagnosis of emerging neurological diseases.
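The evaluation protocol described in the abstract (10-fold cross-validation to compare candidate algorithms, followed by a test on unseen subjects) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic stand-ins for gait features, the two classifiers (a nearest-centroid model and a majority-class baseline) are placeholders for whichever algorithms are compared in the paper, and every function and variable name is hypothetical.

```python
import random
from statistics import mean, pstdev


def make_synthetic_gait(n_per_class, n_features, shift, rng):
    """Assumed stand-in data: label 1 ("PD") features are shifted from label 0 ("control")."""
    data = []
    for label in (0, 1):
        for _ in range(n_per_class):
            features = [rng.gauss(shift * label, 1.0) for _ in range(n_features)]
            data.append((features, label))
    rng.shuffle(data)
    return data


def fit_centroid(train):
    """Placeholder classifier: assign each sample to the class with the nearest mean vector."""
    centroids = {}
    for label in {y for _, y in train}:
        rows = [x for x, y in train if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def predict(x):
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )

    return predict


def fit_majority(train):
    """Baseline classifier: always predict the most frequent training label."""
    labels = [y for _, y in train]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority


def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)


def k_fold_scores(fit, data, k=10):
    """Train on k-1 folds, score on the remaining fold, and repeat for every fold."""
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        scores.append(accuracy(fit(train), folds[i]))
    return scores


rng = random.Random(0)
data = make_synthetic_gait(n_per_class=60, n_features=5, shift=1.5, rng=rng)
dev, held_out = data[:100], data[100:]  # held_out plays the role of the unseen test pool

candidates = {"nearest_centroid": fit_centroid, "majority_baseline": fit_majority}
cv_scores = {name: k_fold_scores(fit, dev, k=10) for name, fit in candidates.items()}

# Mean accuracy ranks the candidates; the spread across folds indicates consistency.
summary = {name: (mean(s), pstdev(s)) for name, s in cv_scores.items()}
best = max(summary, key=lambda name: summary[name][0])

# Refit the selected model on all development data, then score it on the unseen data.
test_accuracy = accuracy(candidates[best](dev), held_out)

for name, (avg, spread) in summary.items():
    print(f"{name}: mean={avg:.3f} std={spread:.3f}")
print(f"selected={best} held-out accuracy={test_accuracy:.3f}")
```

Keeping the held-out subjects out of the cross-validation loop means the final accuracy is measured on data that played no part in choosing the model. In this sketch the selection uses mean accuracy only; the per-fold standard deviation is reported as a rough proxy for the consistency criterion mentioned in the abstract.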
