Segment-Removal Based Stuttered Speech Remediation

Speech can be remediated by identifying segments that detract from its substance and can be deleted without degrading its quality, and ideally while improving it. Such remediation is especially important when the speech is disfluent, as in the case of stuttered speech. In this paper, we describe a stuttered speech remediation approach based on identifying those segments of speech which, when removed, would enhance speech understandability in terms of both content and flow. The approach first identifies and extracts speech segments that have weak significance due to their low relative intensity, then classifies which of those segments should be removed. To automate this improvement of stuttered speech, we trained several classifiers on a large set of inherent and derived features extracted from the audio segments. These classifiers serve as a second-layer filtering stage that separates the audio segments that need to be eliminated from those that do not. The resulting speech is then compared to the manually labeled “gold standard” optimal speech. The quality comparisons between the enhanced speech and its manually labeled counterpart were favorable, and the corresponding tabulated results are presented below. To further enhance the quality of the classifiers, we adopted a voting technique that encompassed an extended set of models from 14 algorithms, and we present the classifier performance measures for different voting threshold values. This voting approach allowed us to improve the specificity of the classification by reducing false positives at the expense of additional false negatives, thus improving the practical effectiveness of the system.
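The trade-off between false positives and false negatives under a voting threshold can be illustrated with a minimal sketch. It is not the implementation used in the paper: the function names (`vote_to_remove`, `confusion_counts`, `specificity`), the five hypothetical models, the per-segment votes, and the gold labels are all assumptions made up for illustration. A segment is marked for removal only when the share of models voting "remove" reaches the threshold.

```python
from typing import List, Sequence, Tuple


def vote_to_remove(votes: Sequence[Sequence[int]], threshold: float) -> List[bool]:
    """Combine per-model votes into one remove/keep decision per segment.

    votes[m][s] is 1 if (hypothetical) model m says segment s should be
    removed, else 0. A segment is removed when the fraction of models
    voting 1 is at least `threshold`.
    """
    n_models = len(votes)
    if n_models == 0:
        return []
    n_segments = len(votes[0])
    decisions = []
    for s in range(n_segments):
        share = sum(model_votes[s] for model_votes in votes) / n_models
        decisions.append(share >= threshold)
    return decisions


def confusion_counts(predicted: Sequence[bool],
                     gold: Sequence[bool]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), treating 'remove' as the positive class."""
    tp = fp = tn = fn = 0
    for p, g in zip(predicted, gold):
        if p and g:
            tp += 1
        elif p and not g:
            fp += 1
        elif not p and not g:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def specificity(fp: int, tn: int) -> float:
    """Fraction of segments that should be kept and were in fact kept."""
    return tn / (tn + fp) if (tn + fp) else 0.0


# Illustrative data only: 5 models voting on 6 candidate segments.
votes = [
    [1, 1, 0, 1, 0, 1],
    [1, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 1],
    [1, 0, 1, 1, 0, 0],
    [1, 0, 0, 1, 0, 1],
]
# Hypothetical manual labels: True means the segment should be removed.
gold = [True, False, False, True, False, True]

for threshold in (0.3, 0.5, 0.9):
    predicted = vote_to_remove(votes, threshold)
    tp, fp, tn, fn = confusion_counts(predicted, gold)
    print(f"threshold={threshold}: fp={fp} fn={fn} "
          f"specificity={specificity(fp, tn):.2f}")
```

On this toy data, raising the threshold removes the single false positive first and then starts to introduce false negatives, which mirrors the specificity-oriented trade-off described above. A stricter threshold suits this task if wrongly deleting meaningful speech is considered more costly than leaving a disfluent segment in place.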
