Objective evaluation of speech dysfluencies using wavelet packet transform with sample entropy

Speech dysfluencies, including stuttering, are breaks or interruptions of normal speech, such as repetitions, prolongations and interjections of syllables, sounds, words or phrases, as well as involuntary silent pauses or blocks in communication. Stuttering assessment through manual classification of speech dysfluencies is subjective, inconsistent, time-consuming and prone to error. This paper proposes an objective evaluation of speech dysfluencies based on the wavelet packet transform with sample entropy features. Dysfluent speech signals are decomposed to six levels using the wavelet packet transform. Sample entropy (SampEn) values are extracted at every level of decomposition and used as features to characterize the speech dysfluencies (stuttered events). Three classifiers, namely k-nearest neighbor (kNN), a linear discriminant analysis (LDA) based classifier and a support vector machine (SVM), are used to investigate the performance of the sample entropy features for the classification of speech dysfluencies. A 10-fold cross-validation method is used to test the reliability of the classifier results. The effect of different wavelet families on the classification performance is also investigated. Experimental results show that the proposed features and classification algorithms give a promising classification accuracy of 96.67% with a standard deviation of 0.37, and that the proposed method can help speech-language pathologists classify speech dysfluencies.
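The feature extraction step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the Haar wavelet, the embedding dimension m = 2, the tolerance r = 0.2 times the standard deviation, and the function names `haar_step`, `wavelet_packet_levels`, `sample_entropy` and `wpt_sampen_features` are all assumptions chosen for illustration, since the abstract does not state these details. The sketch decomposes a signal into a full wavelet packet tree and computes one SampEn value per node at every level.

```python
import math


def haar_step(x):
    """Split a sequence into Haar approximation and detail coefficients."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_levels(signal, max_level):
    """Full wavelet packet tree: levels[k] holds 2**k coefficient lists.

    Unlike the plain wavelet transform, both the approximation and the
    detail branch are split again at every level.
    """
    levels = [[list(signal)]]
    for _ in range(max_level):
        next_level = []
        for node in levels[-1]:
            approx, detail = haar_step(node)
            next_level.extend([approx, detail])
        levels.append(next_level)
    return levels


def sample_entropy(x, m=2, r_factor=0.2):
    """SampEn = -ln(A / B), with Chebyshev distance and tolerance r.

    B counts template pairs of length m within r, A counts pairs of
    length m + 1; self-matches are excluded. r = r_factor * std(x) is a
    common convention and an assumption here.
    """
    n = len(x)
    if n <= m + 1:
        return float("nan")
    mean = sum(x) / n
    r = r_factor * math.sqrt(sum((v - mean) ** 2 for v in x) / n)

    def count_matches(length):
        matches = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if max(abs(x[i + k] - x[j + k]) for k in range(length)) <= r:
                    matches += 1
        return matches

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # no matches: entropy is undefined / unbounded
    return -math.log(a / b)


def wpt_sampen_features(signal, max_level, m=2, r_factor=0.2):
    """One SampEn value per wavelet packet node, for levels 1..max_level."""
    levels = wavelet_packet_levels(signal, max_level)
    return [sample_entropy(node, m, r_factor)
            for level in levels[1:] for node in level]


# Toy usage on a synthetic tone; three levels give 2 + 4 + 8 = 14 features.
toy_signal = [math.sin(0.3 * i) for i in range(256)]
toy_features = wpt_sampen_features(toy_signal, max_level=3)
print(len(toy_features))
```

With six levels, as in the paper, the same scheme would yield 2 + 4 + ... + 64 = 126 values per segment, assuming every node at every level contributes one feature. The quadratic pair count in `sample_entropy` is adequate for short frames; a vectorized or library implementation would be preferable for long recordings.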
