An Approach for Objective Assessment of Stuttered Speech Using MFCC Features

Syllable repetition is one of the important parameters in the objective assessment of stuttered speech. Existing methods, which use artificial neural networks (ANN) and hidden Markov models (HMM), require high levels of agreement as a prerequisite before training and testing to separate fluent from nonfluent speech. We propose an automatic method for detecting syllable repetition in read speech for the objective assessment of stuttered disfluencies. The method has four stages: segmentation, feature extraction, score matching, and decision logic. Segmentation is manually assisted, which is tedious but straightforward. Feature extraction uses the well-known Mel-frequency cepstral coefficients (MFCC). Score matching is done by Dynamic Time Warping (DTW) between the syllables. The decision logic is implemented with a Support Vector Machine (SVM) and compared with our previous work, which uses the Perceptron method. The proposed objective approach has an advantage over manual (subjective) assessment in that it provides the consistent measurement required for assessment. The assessments by human judges of the read speech of 15 adults who stutter are described. 80% of the data are used for training and 20% for testing. The average result was found to be 93.45%, which is better than our previous work [80.78%] using HMM.
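
The score-matching stage can be illustrated with a minimal sketch of DTW between two syllables. It assumes each syllable is already represented as a frames-by-coefficients array of MFCC vectors and uses a Euclidean local cost with the basic three-neighbor step pattern; the function name `dtw_distance` and these choices are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def dtw_distance(a, b):
    """Accumulated DTW cost between two feature sequences.

    a, b: 2-D arrays of shape (frames, coefficients), e.g. per-frame MFCC
    vectors of two syllables (assumed layout, not specified by the paper).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)

    # Local cost: Euclidean distance between every pair of frames.
    local = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)

    # acc[i, j] = cheapest cost of aligning a[:i] with b[:j].
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = local[i - 1, j - 1] + min(
                acc[i - 1, j],      # advance in a only
                acc[i, j - 1],      # advance in b only
                acc[i - 1, j - 1],  # advance in both
            )
    return acc[n, m]
```

A low score between adjacent syllables suggests a possible repetition. In the approach described above, such scores would be passed to the decision stage (the SVM) rather than compared against a fixed threshold.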
