Support vector machine-based stuttering dysfluency classification using GMM supervectors

It is generally acknowledged that the recognition and classification of dysfluencies is an important criterion in the objective and accurate assessment of stuttered speech. For this reason, there is growing interest in applying Automatic Speech Recognition (ASR) technology to automate dysfluency recognition. Several studies have addressed the classification of dysfluencies by means of acoustic analysis, parametric and non-parametric feature extraction, and statistical methods. This work introduces and evaluates a Support Vector Machine (SVM) based dysfluency recognition system that uses Gaussian Mixture Model (GMM) supervectors as input. The experimental evaluation shows that an SVM operating on GMM supervectors is effective for dysfluency classification. We obtained substantial improvements in performance by using cepstral features together with their delta features.
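To make the term "GMM supervector" concrete, the following is a minimal sketch and not the authors' implementation. It assumes a diagonal-covariance background GMM whose means are adapted to one speech segment by relevance MAP, with the adapted means then stacked into a single vector. The function names, the relevance factor of 16, and the toy data are illustrative assumptions; the abstract does not state the configuration actually used.

```python
import numpy as np

def log_gauss_diag(frames, means, variances):
    """Log density of every frame under every diagonal-covariance component.

    frames: (T, D); means, variances: (K, D). Returns an array of shape (T, K).
    """
    dim = frames.shape[1]
    diff = frames[:, None, :] - means[None, :, :]
    mahal = np.sum(diff ** 2 / variances[None, :, :], axis=2)
    log_det = np.sum(np.log(variances), axis=1)
    return -0.5 * (dim * np.log(2.0 * np.pi) + log_det[None, :] + mahal)

def gmm_supervector(frames, weights, means, variances, relevance=16.0):
    """Adapt the background means to one segment and stack them.

    relevance is an assumed value, not one taken from the paper.
    Returns a vector of length K * D.
    """
    # Posterior probability of each component for each frame.
    log_post = np.log(weights)[None, :] + log_gauss_diag(frames, means, variances)
    log_post -= np.logaddexp.reduce(log_post, axis=1, keepdims=True)
    post = np.exp(log_post)

    # Zeroth- and first-order sufficient statistics.
    n_k = post.sum(axis=0)
    e_k = (post.T @ frames) / np.maximum(n_k, 1e-10)[:, None]

    # Relevance-MAP interpolation between segment statistics and prior means.
    alpha = (n_k / (n_k + relevance))[:, None]
    adapted = alpha * e_k + (1.0 - alpha) * means
    return adapted.reshape(-1)

# Toy example: 2 components, 2-dimensional features (stand-ins for cepstra).
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [10.0, 10.0]])
variances = np.ones((2, 2))
frames = np.ones((100, 2))  # every frame lies close to the first component
supervector = gmm_supervector(frames, weights, means, variances)
```

In this toy case only the first component's mean moves toward the data, while the second stays at its prior value. One supervector per labelled segment could then be passed to any SVM implementation as a fixed-length feature vector; that step, and any kernel-specific scaling of the supervector, is omitted here.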
