Automatic Recognition of Pathological Phoneme Production

Objective: Proper diagnosis and therapy of pathological phoneme pronunciation play an important role in modern logopedics. To make diagnosis and therapy more efficient, this paper addresses the automatic recognition of pathological phoneme pronunciation, focusing on the therapy of phoneme substitution disorders.

Patients and Methods: The speech samples came from speech-impaired Polish children and, in part, from persons imitating speech disorders. The disorders to be recognized were substitutions within the pairs {s, ʂ}, {ɕ, ʂ}, {ʦ, tʂ+}, {ʨ, tʂ+}, {ʣ, dʐ+}, and {ʥ, dʐ+}, embedded in Polish carrier words. To detect substitutions in these words, the recently proposed human factor cepstral coefficients (HFCC) were implemented as the feature vector, and their efficiency was compared with that of standard mel-frequency cepstral coefficients (MFCC). Two classifiers were used: dynamic time warping (DTW), operating either on whole words or on embedded phoneme patterns, and hidden Markov models (HMM), based on either whole-word models or phoneme models.

Results: The results present a comparative analysis of the DTW and HMM methods.

Conclusions: HFCC features proved superior to MFCC features. Results obtained by the DTW methods, mainly by the modified phoneme-based DTW classifier, were slightly better than those obtained by the HMM classifier. Results for the detection of substitutions in the pairs {s, ʂ}, {ʦ, tʂ+}, and {ʣ, dʐ+} are very promising, and the methods developed for these cases can be integrated into computer systems for speech therapy. For substitutions in the pairs {ɕ, ʂ}, {ʨ, tʂ+}, and {ʥ, dʐ+}, further research is necessary.
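To make the whole-word DTW classification mentioned above more concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes each utterance is already a sequence of feature vectors (here one-dimensional toy values standing in for HFCC or MFCC frames), uses Euclidean frame distance with the textbook symmetric step pattern, and picks the label of the nearest template. The function names, template labels, and data are hypothetical, and the modified phoneme-based DTW variant is not reproduced.

```python
import math


def frame_distance(x, y):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def dtw_distance(seq_a, seq_b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Uses the basic step pattern (insertion, deletion, match) with no
    slope constraint or path-length normalisation (a simplification).
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch seq_b frame
                                 cost[i][j - 1],      # stretch seq_a frame
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def classify(sample, templates):
    """Return the label of the template with the smallest DTW cost."""
    return min(templates, key=lambda label: dtw_distance(sample, templates[label]))


# Toy data: hypothetical templates for a correct and a substituted pronunciation.
templates = {
    "correct": [[0.0], [1.0], [2.0]],
    "substituted": [[0.0], [3.0], [6.0]],
}
sample = [[0.0], [1.0], [1.0], [2.0]]  # slower rendition of the "correct" pattern
print(classify(sample, templates))
```

Because DTW may repeat frames, the slower sample aligns with the "correct" template at zero cost, which is why time warping suits utterances spoken at different rates.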
