Novel Unsupervised Feature Extraction Protocol using Autoencoders for Connected Speech: Application in Parkinson's Disease Classification

Speech processing has generated substantial research interest for telemonitoring and classification applications in healthcare because speech is easy to acquire and established research protocols are available. This growing interest has led to significant progress in processing Parkinsonian speech for monitoring and classification. A considerable portion of the studies in this area focuses on developing automatic telemonitoring protocols with passive data collection using wearable or mobile devices. Most of these studies train classifiers on handcrafted features extracted from sustained vowel phonations. Although some researchers suggest that connected (running) speech is better suited to this application, fewer studies focus on it, predominantly because of its processing complexity. This study uses connected speech with pitch-synchronous segmentation and convolutional Autoencoders for feature extraction from regular and advanced spectrograms. Spectrograms created using both pitch-synchronous and block-processing segmentation are evaluated. The methodology also aims to bypass data availability issues by using the standardized TIMIT dataset for training the Autoencoders. With Logistic Regression and Linear SVM classifiers, we achieved 85% classification accuracy using the features from the Autoencoders. A mean accuracy of 84% was obtained under leave-one-subject-out (LOSO) classification, indicating reliable performance on entirely new data.
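
The leave-one-subject-out (LOSO) evaluation mentioned above can be sketched as follows. This is a minimal illustration, not the study's implementation: the helper names (`loso_splits`, `fit_centroids`, `predict`, `loso_accuracy`) and the toy 2-D features are hypothetical, and a nearest-centroid classifier stands in for the Logistic Regression and Linear SVM classifiers so that the sketch needs only the Python standard library. The point it shows is that every segment from the held-out subject is excluded from training, so each fold is scored on a speaker the classifier has never seen.

```python
from collections import defaultdict


def loso_splits(subject_ids):
    """Yield (held_out_subject, train_idx, test_idx), one fold per subject."""
    for held_out in sorted(set(subject_ids)):
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train_idx, test_idx


def fit_centroids(features, labels):
    """Stand-in classifier (assumption): mean feature vector per class."""
    sums = {}
    counts = defaultdict(int)
    for x, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [a + b for a, b in zip(sums[y], x)]
        counts[y] += 1
    return {y: [v / counts[y] for v in sums[y]] for y in sums}


def predict(centroids, x):
    """Return the class whose centroid is closest to x (squared Euclidean)."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: sqdist(centroids[y]))


def loso_accuracy(features, labels, subject_ids):
    """Return (mean accuracy over held-out subjects, per-subject accuracy)."""
    per_subject = {}
    for subject, train_idx, test_idx in loso_splits(subject_ids):
        centroids = fit_centroids(
            [features[i] for i in train_idx],
            [labels[i] for i in train_idx],
        )
        correct = sum(predict(centroids, features[i]) == labels[i] for i in test_idx)
        per_subject[subject] = correct / len(test_idx)
    mean_acc = sum(per_subject.values()) / len(per_subject)
    return mean_acc, per_subject


# Toy data (hypothetical): two segments per subject, 2-D "autoencoder features".
features = [
    [0.1, 0.0], [0.0, 0.2],   # subject s1, control
    [0.2, 0.1], [0.1, 0.1],   # subject s2, control
    [0.9, 1.0], [1.0, 0.8],   # subject s3, Parkinson's
    [1.1, 0.9], [0.8, 1.0],   # subject s4, Parkinson's
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
subjects = ["s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4"]

mean_acc, per_subject = loso_accuracy(features, labels, subjects)
print(mean_acc, per_subject)
```

Grouping the split by subject rather than by segment matters here: a random split over segments would place segments from the same speaker in both the training and test sets, which tends to inflate accuracy relative to performance on new speakers.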
