Paper information - A comparative analysis of speech signal processing algorithms for Parkinson's disease classification and the use of the tunable Q-factor wavelet transform

Abstract: In recent years, there has been increasing interest in the development of telediagnosis and telemonitoring systems for Parkinson’s disease (PD) based on measuring the motor system disorders caused by the disease. As approximately 90% of PD patients exhibit some form of vocal disorder in the earlier stages of the disease, recent PD telediagnosis studies focus on detecting vocal impairments from sustained vowel phonations or running speech of the subjects. In these studies, various speech signal processing algorithms have been used to extract clinically useful information for PD assessment, and the calculated features have been fed to learning algorithms to construct reliable decision support systems. In this study, we apply, to the best of our knowledge for the first time, the tunable Q-factor wavelet transform (TQWT), which has higher frequency resolution than the classical discrete wavelet transform, to the voice signals of PD patients for feature extraction. We compare the effectiveness of TQWT with the state-of-the-art feature extraction methods used in the diagnosis of PD from vocal disorders. For this purpose, we collected the voice recordings of 252 subjects and extracted multiple feature subsets from these recordings. The feature subsets are fed to multiple classifiers, and the predictions of the classifiers are combined with ensemble learning approaches. The results show that TQWT performs better than or comparably to the state-of-the-art speech signal processing techniques used in PD classification. We also find that the Mel-frequency cepstral coefficients and the tunable-Q wavelet coefficients, which give the highest accuracies, contain complementary information for the PD classification problem, resulting in an improved system when they are combined using a filter feature selection technique.
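
The abstract says that the predictions of multiple classifiers are combined with ensemble learning approaches, but it does not say which combination rule is used. As a minimal sketch, assuming simple majority voting over hard class labels (one common ensemble rule, not necessarily the one used in the paper), the combination step might look like the following. The function name `majority_vote`, the classifier names, and the label convention (1 = PD, 0 = healthy) are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-classifier label predictions by simple majority voting.

    predictions[i][j] is the label that classifier i assigns to subject j.
    Ties go to the label that appears first in classifier order.
    """
    if not predictions:
        raise ValueError("need predictions from at least one classifier")
    n_subjects = len(predictions[0])
    if any(len(p) != n_subjects for p in predictions):
        raise ValueError("all classifiers must predict the same subjects")

    combined = []
    # zip(*predictions) yields, for each subject, the votes of all classifiers
    for votes in zip(*predictions):
        counts = {}
        for label in votes:
            counts[label] = counts.get(label, 0) + 1
        # dicts keep insertion order and max() returns the first maximum,
        # so ties resolve to the label voted by the earliest classifier
        combined.append(max(counts, key=counts.get))
    return combined


# Hypothetical hard predictions for four subjects (1 = PD, 0 = healthy),
# e.g. from classifiers trained on different feature subsets.
clf_a = [1, 0, 1, 1]
clf_b = [1, 1, 0, 1]
clf_c = [0, 0, 1, 1]

print(majority_vote([clf_a, clf_b, clf_c]))
```

Majority voting needs only the hard labels from each classifier, which keeps the sketch independent of any particular learning library. Weighted voting or stacking would additionally need per-classifier weights or a second-level learner.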
