X-Vectors: New Quantitative Biomarkers for Early Parkinson's Disease Detection From Speech

Many articles have used voice analysis to detect Parkinson's disease (PD), but few have focused on the early stages of the disease or on the effect of gender. In this article, we adapted the latest speaker recognition system, called x-vectors, to detect early-stage PD from voice analysis. X-vectors are embeddings extracted from a deep neural network; they provide robust speaker representations and improve speaker recognition when large amounts of training data are used. Our goal was to assess whether, in the context of early PD detection, this technique would outperform the more standard MFCC-GMM classifier (Mel-Frequency Cepstral Coefficients with a Gaussian Mixture Model) and, if so, under which conditions. We recorded 221 French speakers (including recently diagnosed PD subjects and healthy controls) with a high-quality microphone and with their own telephones. Men and women were analyzed separately in order to obtain more precise models and to assess a possible gender effect. We tested several experimental and methodological aspects to analyze their impact on classification performance: audio segment duration, data augmentation, the type of dataset used to train the neural network, the kind of speech task, and the back-end analysis. The x-vector technique provided better classification performance than MFCC-GMM for text-independent tasks, and seemed particularly suited to the early detection of PD in women (7 to 15% improvement). This result was observed for both recording types (high-quality microphone and telephone).
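
To make the MFCC-GMM baseline concrete, the sketch below shows one common way such a classifier is built: one Gaussian mixture model is fitted per class on frame-level features, and a recording is scored by the difference in average log-likelihood between the two models. This is an illustration under stated assumptions, not the configuration used in the study. The features are synthetic stand-ins for MFCC frames, and the helper names (`train_gmm`, `llr_score`, `synthetic_frames`), the number of mixture components, and the 13-coefficient dimension are all hypothetical choices.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def train_gmm(frames, n_components=4, seed=0):
    """Fit a diagonal-covariance GMM on frame-level features (n_frames x n_coeffs)."""
    gmm = GaussianMixture(
        n_components=n_components,
        covariance_type="diag",
        random_state=seed,
    )
    return gmm.fit(frames)


def llr_score(frames, gmm_pd, gmm_hc):
    """Average per-frame log-likelihood ratio: positive values favor the PD model."""
    # GaussianMixture.score returns the mean log-likelihood over the frames.
    return gmm_pd.score(frames) - gmm_hc.score(frames)


def synthetic_frames(rng, mean_shift, n_frames=500, n_coeffs=13):
    """Hypothetical stand-in for MFCC frames: Gaussian noise around a class mean."""
    return rng.normal(loc=mean_shift, scale=1.0, size=(n_frames, n_coeffs))


rng = np.random.default_rng(0)

# Assumption: the two classes differ only by a shift in the feature mean.
gmm_pd = train_gmm(synthetic_frames(rng, mean_shift=1.0))
gmm_hc = train_gmm(synthetic_frames(rng, mean_shift=0.0))

# Score held-out "recordings" from each class.
score_pd = llr_score(synthetic_frames(rng, 1.0, n_frames=200), gmm_pd, gmm_hc)
score_hc = llr_score(synthetic_frames(rng, 0.0, n_frames=200), gmm_pd, gmm_hc)

print(f"LLR for PD-like recording: {score_pd:.2f}")
print(f"LLR for control-like recording: {score_hc:.2f}")
```

In practice the frames would come from an MFCC front end applied to real recordings, and the decision threshold on the log-likelihood ratio would be tuned on held-out data rather than fixed at zero.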
