Automatic Speech Analysis in Patients with Parkinson's Disease using Feature Dimension Reduction

Dysphonia is a common speech disorder in Parkinson's disease. Speech analysis has already been applied to patients with Parkinson's disease, and class prediction is an essential task in automatic speech analysis. Speech data contain many redundant and ambiguous attributes, which introduce considerable noise. Modern data analysis often handles such high-dimensional data with statistical dimension reduction techniques. In this work, we consider Common Factor Analysis (CFA) and Principal Component Analysis (PCA) as dimensionality reduction and data smoothing tools for multiclass target expression data. Building on the proposed CFA- and PCA-based modeling, we investigate the class prediction power of logistic regression (LR) and a decision tree (C5.0) on numeric data, with the aim of developing an advanced classification model, using the publicly available Parkinson's disease dataset Lee Silverman Voice Treatment (LSVT). Using only 9 dysphonia features, classification accuracy was 99.20% for CFA-LR and 99.11% for PCA-C5.0. In sum, combining dimension reduction with data smoothing has significant potential to reduce the number of features, increase classification accuracy, and automatically distinguish Parkinson's disease patients from healthy speakers.
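The general idea of reducing redundant features before classification can be illustrated with a minimal sketch. It is not the authors' pipeline: it pairs PCA with logistic regression purely for illustration (the abstract reports CFA-LR and PCA-C5.0), it uses synthetic data rather than the LSVT dataset, and every function name, parameter, and data setting below is an assumption made for the example. PCA is computed by power iteration with deflation, and the classifier is fitted by plain gradient descent, using only the Python standard library.

```python
import math
import random

def make_synthetic(n=200, n_features=6, seed=0):
    """Synthetic stand-in for redundant features (NOT the LSVT data)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        latent = rng.gauss(2.0 if label else -2.0, 1.0)  # class-related factor
        nuisance = rng.gauss(0.0, 1.0)                   # class-unrelated factor
        # Each feature mixes the two factors plus noise, so features are redundant.
        X.append([latent * (0.5 + 0.1 * j) + nuisance * 0.5 * (-1) ** j
                  + rng.gauss(0.0, 0.3) for j in range(n_features)])
        y.append(label)
    return X, y

def pca_fit(X, k, iters=200, seed=1):
    """Estimate the mean and top-k principal directions (power iteration + deflation)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(d)]
    C = [[sum((r[a] - mean[a]) * (r[b] - mean[b]) for r in X) / (n - 1)
          for b in range(d)] for a in range(d)]
    components = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(iters):
            w = [sum(C[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(v[a] * sum(C[a][b] * v[b] for b in range(d)) for a in range(d))
        # Remove the direction just found so the next pass finds the following one.
        C = [[C[a][b] - lam * v[a] * v[b] for b in range(d)] for a in range(d)]
        components.append(v)
    return mean, components

def project(X, mean, components):
    """Map each centered row onto the principal directions."""
    return [[sum((row[j] - mean[j]) * comp[j] for j in range(len(mean)))
             for comp in components] for row in X]

def sigmoid(t):
    t = max(-30.0, min(30.0, t))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-t))

def fit_logistic(Z, y, lr=0.1, epochs=300):
    """Binary logistic regression fitted by batch gradient descent."""
    n, d = len(Z), len(Z[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for z, t in zip(Z, y):
            err = sigmoid(sum(wi * zi for wi, zi in zip(w, z)) + b) - t
            for j in range(d):
                gw[j] += err * z[j]
            gb += err
        w = [wi - lr * g / n for wi, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def run_demo(k=2):
    """Fit PCA and the classifier on a training split, score on a held-out split."""
    X, y = make_synthetic()
    X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]
    mean, comps = pca_fit(X_train, k)
    Z_train, Z_test = project(X_train, mean, comps), project(X_test, mean, comps)
    w, b = fit_logistic(Z_train, y_train)
    preds = [1 if sigmoid(sum(wi * zi for wi, zi in zip(w, z)) + b) >= 0.5 else 0
             for z in Z_test]
    accuracy = sum(p == t for p, t in zip(preds, y_test)) / len(y_test)
    return accuracy, len(Z_test[0])

if __name__ == "__main__":
    acc, dim = run_demo(k=2)
    print(f"held-out accuracy with {dim} components: {acc:.2f}")
```

The projection is fitted on the training split only and then applied to the held-out rows, so the reported accuracy is not inflated by information from the test data. Results on this toy data say nothing about the accuracies reported for LSVT.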
