X-Vectors: New Quantitative Biomarkers for Early Parkinson's Disease Detection From Speech

Many studies have used voice analysis to detect Parkinson's disease (PD), but few have focused on the early stages of the disease or on the effect of gender. In this article, we adapted x-vectors, a recent speaker recognition technique, to detect PD at an early stage from voice. X-vectors are embeddings extracted from Deep Neural Networks (DNNs); they provide robust speaker representations and improve speaker recognition when large amounts of training data are available. Our goal was to assess whether, for early PD detection, this technique would outperform the more standard MFCC-GMM classifier (Mel-Frequency Cepstral Coefficients with Gaussian Mixture Models) and, if so, under which conditions. We recorded 221 French speakers (recently diagnosed PD subjects and healthy controls) with a high-quality microphone and over the telephone network. Men and women were analyzed separately to obtain more precise models and to assess a possible gender effect. We tested several experimental and methodological factors for their impact on classification performance: audio segment duration, data augmentation, the type of dataset used to train the neural network, the kind of speech task, and the back-end analysis. The x-vector technique gave better classification performance than MFCC-GMM on the text-independent tasks and seemed particularly suited to the early detection of PD in women (7–15% improvement). This result was observed for both recording types (high-quality microphone and telephone).
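To make the MFCC-GMM baseline concrete, the sketch below shows one common way such a classifier can score a recording: each class (PD and healthy control) has its own diagonal-covariance GMM over feature frames, and the decision uses the difference of the average per-frame log-likelihoods. This is an illustration under stated assumptions, not the implementation used in the article: the function names, the two-dimensional toy "frames" (real MFCC vectors have more coefficients), and all GMM parameters are hypothetical, and the models are given by hand rather than trained.

```python
import math

def diag_gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    total = 0.0
    for x in frames:
        comp_logs = []
        for w, mu, var in zip(weights, means, variances):
            # log of (mixture weight * diagonal Gaussian density)
            log_p = math.log(w)
            for xi, mi, vi in zip(x, mu, var):
                log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
            comp_logs.append(log_p)
        # log-sum-exp over components for numerical stability
        m = max(comp_logs)
        total += m + math.log(sum(math.exp(c - m) for c in comp_logs))
    return total / len(frames)

def llr_score(frames, gmm_pd, gmm_hc):
    """Log-likelihood ratio: positive favours the PD model, negative the control model."""
    return diag_gmm_avg_loglik(frames, *gmm_pd) - diag_gmm_avg_loglik(frames, *gmm_hc)

# Hypothetical toy models: (weights, means, variances), two components each.
gmm_pd = ([0.5, 0.5], [[1.0, 1.0], [2.0, 2.0]], [[1.0, 1.0], [1.0, 1.0]])
gmm_hc = ([0.5, 0.5], [[-1.0, -1.0], [-2.0, -2.0]], [[1.0, 1.0], [1.0, 1.0]])

# Hypothetical feature frames standing in for MFCC vectors of two recordings.
frames_near_pd = [[1.2, 0.9], [1.8, 2.1]]
frames_near_hc = [[-1.1, -0.8], [-2.2, -1.9]]

score_a = llr_score(frames_near_pd, gmm_pd, gmm_hc)
score_b = llr_score(frames_near_hc, gmm_pd, gmm_hc)
```

In this toy setup a recording would be assigned to the PD class when its score exceeds a threshold (here 0). Averaging over frames keeps scores comparable across recordings of different durations.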
