Automatic intonation classification for speech training systems

A prosodic hidden Markov model (HMM) based modality recognizer has been developed that, after supra-segmental acoustic pre-processing, performs clause and sentence boundary detection and modality (sentence type) recognition. This recognizer is adapted to automatically evaluate the intonation of utterances produced in a speech training system for hearing-impaired persons or foreign language learners. The system is evaluated on utterances from normally speaking persons and tested with persons whose speech is impaired due to hearing problems. To allow a deeper analysis, the automatic intonation classification is compared with the results of subjective listening tests.
