An Automatic Prolongation Detection Approach in Continuous Speech With Robustness Against Speaking Rate Variations

In recent years, many methods have been introduced to support the diagnosis of stuttering by automatically detecting prolongations in the speech of people who stutter. However, less attention has been paid to treatment processes in which clients learn to speak more slowly. The aim of this study was to develop a method that helps speech-language pathologists (SLPs) during both diagnosis and treatment sessions. To this end, speech signals were first parameterized as perceptual linear predictive (PLP) features. To detect prolonged segments, the similarity between successive frames of the speech signal was calculated using correlation-based similarity measures. A segment was labeled as a prolongation when the duration of a run of highly similar successive frames exceeded a threshold determined by the speaking rate. The proposed method was evaluated on the UCLASS database and a self-recorded Persian speech database, and the results were compared with those of three high-performance studies in automatic prolongation detection. The best prolongation detection accuracies were 99% and 97.1% for the UCLASS and Persian databases, respectively. The proposed method also showed promising robustness against artificial variation of the speaking rate from 70% to 130% of the normal speaking rate.
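The detection rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the per-frame feature vectors (for example PLP coefficients) have already been computed, uses the Pearson correlation as the similarity measure, and scales the duration threshold inversely with the speaking rate. The function names, the similarity threshold of 0.9, the 10 ms frame shift, and the 250 ms base duration are all hypothetical values chosen for illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length feature vectors."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # treat constant vectors as uncorrelated
    return cov / math.sqrt(var_a * var_b)


def detect_prolongations(frames, rate_ratio=1.0, frame_shift_s=0.01,
                         sim_threshold=0.9, base_min_duration_s=0.25):
    """Return (first_frame, last_frame) index pairs of candidate prolongations.

    frames     : list of per-frame feature vectors (assumed precomputed).
    rate_ratio : speaking rate relative to normal (0.7 = slower, 1.3 = faster).
    All default values are illustrative assumptions, not values from the paper.
    """
    # Assumption: slower speech needs a longer run before it counts.
    min_duration_s = base_min_duration_s / rate_ratio
    segments = []
    run_start = None
    for i in range(1, len(frames) + 1):
        # Compare frame i-1 with frame i; the final iteration closes any open run.
        similar = (i < len(frames)
                   and pearson(frames[i - 1], frames[i]) >= sim_threshold)
        if similar:
            if run_start is None:
                run_start = i - 1
        elif run_start is not None:
            run_end = i - 1  # last frame of the run, inclusive
            duration_s = (run_end - run_start + 1) * frame_shift_s
            if duration_s >= min_duration_s:
                segments.append((run_start, run_end))
            run_start = None
    return segments


# Synthetic example: 6 changing frames, 30 identical frames, 6 changing frames.
varied = [[1, 0, 0], [0, 1, 0], [0, 0, 1]] * 2
frames = varied + [[1, 2, 3]] * 30 + varied
print(detect_prolongations(frames, rate_ratio=1.0))
print(detect_prolongations(frames, rate_ratio=0.7))
```

In this synthetic example the 30-frame steady run lasts 0.30 s, so it is flagged at the normal rate (threshold 0.25 s) but not at 70% of the normal rate, where the assumed threshold rises to about 0.36 s.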
