Automatic assessment of dysarthria severity level using audio descriptors

Dysarthria is a motor speech impairment, often characterized by slow, slurred speech that is largely unintelligible to human listeners. Assessing the severity level of dysarthria indicates how far the underlying condition has progressed, and is essential both for planning therapy and for improving automatic dysarthric speech recognition. In this paper, we propose a non-linguistic approach to automatic severity-level assessment based on audio descriptors: a set of features traditionally used to characterize the timbre of musical instruments, modified here to suit this purpose. In addition to the timbre descriptors, features based on multitaper spectral estimation were computed and used for classification. An Artificial Neural Network (ANN) was trained to classify speech into severity levels on the Universal Access (UA-Speech) dysarthric speech corpus and the TORGO database. Average classification accuracies of 96.44% and 98.7% were obtained on the UA-Speech corpus and the TORGO database, respectively.
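The two feature families named above can be illustrated with a minimal sketch: Thomson's multitaper method averages periodograms of the signal weighted by orthogonal Slepian (DPSS) tapers to reduce spectral variance, and a timbre-style audio descriptor such as the spectral centroid can then be read off the resulting spectrum. This is a generic illustration using SciPy, not the paper's actual feature pipeline; the function names, taper count `K`, and time-bandwidth product `NW` are illustrative choices.

```python
import numpy as np
from scipy.signal.windows import dpss


def multitaper_psd(x, fs, NW=4, K=7):
    """Multitaper PSD estimate (Thomson's method, sketch).

    Averages the periodograms of K DPSS-tapered copies of the
    signal, trading a little resolution for much lower variance
    than a single-window periodogram.
    """
    N = len(x)
    tapers = dpss(N, NW, K)                      # shape (K, N) Slepian tapers
    eigenspectra = np.abs(np.fft.rfft(tapers * x, axis=1)) ** 2
    psd = eigenspectra.mean(axis=0) / fs          # simple (unweighted) average
    freqs = np.fft.rfftfreq(N, d=1.0 / fs)
    return freqs, psd


def spectral_centroid(freqs, psd):
    """Timbre-style descriptor: power-weighted mean frequency."""
    return np.sum(freqs * psd) / np.sum(psd)


# Toy usage: a 100 Hz tone in light noise, sampled at 1 kHz.
fs = 1000
rng = np.random.default_rng(0)
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 100 * t) + 0.1 * rng.standard_normal(fs)

freqs, psd = multitaper_psd(x, fs)
peak_hz = freqs[np.argmax(psd)]       # expect a peak near 100 Hz
centroid_hz = spectral_centroid(freqs, psd)
```

In a classification setting, descriptors like the centroid (along with spread, skewness, flux, etc.) computed per frame would be pooled per utterance and fed to the ANN.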