Automated Dysarthria Severity Classification Using Deep Learning Frameworks

Dysarthria is a neuro-motor speech disorder that renders speech unintelligible in proportion to its severity. Assessing the severity level of dysarthria is a diagnostic step for evaluating a patient's improvement, and it can also aid automatic dysarthric speech recognition systems. This paper presents a detailed study of dysarthria severity classification using three deep learning architectures: deep neural network (DNN), convolutional neural network (CNN), and long short-term memory network (LSTM). Mel frequency cepstral coefficients (MFCCs) and their derivatives are used as features. The performance of these models is compared with a baseline support vector machine (SVM) classifier using the UA-Speech corpus and the TORGO database. The highest classification accuracies of 96.18% and 93.24% are reported for TORGO and UA-Speech, respectively. Detailed analysis of the performance of these models shows that a proper choice of deep learning architecture can ensure better performance than the conventionally used SVM classifier.
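
The abstract does not say how the MFCC derivatives are computed. As an illustration only, the sketch below uses the common regression-based delta formula with edge padding, and stacks static, delta, and delta-delta coefficients into one vector per frame. The function names `delta` and `stack_features`, the window half-width of 2, and the padding choice are all assumptions made for this example, not details taken from the paper.

```python
def delta(frames, width=2):
    """Regression-based delta features (assumed formula, not from the paper).

    frames: list of per-frame coefficient lists, all of equal length.
    width:  half-width of the regression window (assumed value).
    """
    if not frames:
        return []
    num_frames = len(frames)
    # Normalizer of the regression formula: 2 * sum of n^2 for n = 1..width.
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(num_frames):
        row = []
        for k in range(len(frames[0])):
            acc = 0.0
            for n in range(1, width + 1):
                # Clamp indices so edge frames reuse the first/last frame.
                ahead = frames[min(t + n, num_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_features(static):
    """Concatenate static, delta, and delta-delta coefficients per frame."""
    d1 = delta(static)
    d2 = delta(d1)
    return [s + a + b for s, a, b in zip(static, d1, d2)]
```

With 13 static MFCCs per frame, for example, `stack_features` would yield 39 values per frame, which could then be passed to any of the classifiers named above.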
