Analysis of speaker recognition methodologies and the influence of kinetic changes to automatically detect Parkinson's Disease

Abstract The diagnosis of Parkinson's Disease is a challenging task that could be supported by new tools for objectively evaluating deviations in patients' motor capabilities. In this respect, the dysarthric nature of patients' speech has been exploited in several works to detect the presence of the disease, but none has studied in depth the use of state-of-the-art speaker recognition techniques for this task. In this paper, two classification schemes (GMM-UBM and i-Vectors-GPLDA) are employed separately with several parameterization techniques, namely PLP, MFCC and LPC. Additionally, the influence of kinetic changes, described by the derivatives of the features, is analysed. With the proposed methodology, the optimal configuration reaches an accuracy of 87% with an AUC of 0.93. These results are comparable to those obtained in other works employing speech for Parkinson's Disease detection and confirm that the selected speaker recognition techniques are a solid baseline against which future works can be compared. Results suggest that Rasta-PLP is the most reliable parameterization for the proposed task among all the tested features, while the two classification schemes perform similarly. Results also confirm that kinetic changes provide a substantial performance improvement in automatic Parkinson's Disease detection systems and should be considered in future work.
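As an illustration of how kinetic changes can be described by derivatives, the following minimal sketch appends first- and second-order regression derivatives (often called delta and delta-delta coefficients) to a matrix of static features such as MFCC or PLP frames. The function names `delta` and `add_dynamics`, the window half-width of 2 frames, and the edge padding are assumptions made for this example; the abstract does not state the settings actually used.

```python
import numpy as np


def delta(features: np.ndarray, width: int = 2) -> np.ndarray:
    """Regression-based time derivative of a (n_frames, n_coeffs) matrix.

    Uses d_t = sum_n n * (c_{t+n} - c_{t-n}) / (2 * sum_n n^2), n = 1..width.
    The half-width `width` is an assumed value, not taken from the paper.
    """
    n_frames = features.shape[0]
    # Repeat the first and last frames so every frame has a full window.
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    out = np.zeros(features.shape, dtype=float)
    for n in range(1, width + 1):
        ahead = padded[width + n : width + n + n_frames]
        behind = padded[width - n : width - n + n_frames]
        out += n * (ahead - behind)
    return out / denom


def add_dynamics(static: np.ndarray, width: int = 2) -> np.ndarray:
    """Stack static features with their first and second derivatives."""
    d1 = delta(static, width)
    d2 = delta(d1, width)
    return np.hstack([static, d1, d2])


if __name__ == "__main__":
    # Hypothetical input: 100 frames of 13 static coefficients.
    rng = np.random.default_rng(0)
    static = rng.normal(size=(100, 13))
    print(add_dynamics(static).shape)
```

The stacked matrix has three times as many columns as the static input, and it is this kind of extended feature vector that a classifier such as a GMM-UBM would then model.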
