Phonetic Question Generation Using Misrecognition

Most automatic speech recognition systems are currently based on tied-state triphones. These tied states are usually determined by a decision tree. Decision trees can automatically cluster triphone states into classes according to the available data, allowing each class to be trained efficiently. To achieve higher accuracy, this clustering is constrained by manually generated phonetic questions. Moreover, the tree built from these phonetic questions can be used to synthesize unseen triphones. The quality of a decision tree therefore depends on the quality of its phonetic questions. Unfortunately, creating phonetic questions manually requires considerable time and resources. To overcome this problem, this paper presents an alternative method that generates phonetic questions automatically from misrecognition items. The resulting questions are tested on the standard TIMIT phone recognition task.
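The abstract does not spell out how misrecognition items are turned into questions, so the following is only a minimal sketch of one plausible reading: phones that a recognizer frequently confuses with one another are grouped, and each group becomes a candidate left-context and right-context question. The confusion counts, the threshold, the connected-components grouping, the `Auto` question names, and the HTK-style `QS` output format are all assumptions made for illustration, not details taken from the paper.

```python
from collections import defaultdict

# Hypothetical toy data: (reference phone, recognized phone) -> count.
TOY_CONFUSIONS = {
    ("m", "n"): 12, ("n", "m"): 9, ("n", "ng"): 7,
    ("s", "z"): 15, ("z", "s"): 11,
    ("p", "s"): 1,
}


def confusion_groups(confusions, threshold):
    """Group phones linked by a confusion count >= threshold.

    Assumption: grouping is done by connected components of the
    thresholded confusion graph; the paper may use another rule.
    """
    neighbours = defaultdict(set)
    for (ref, hyp), count in confusions.items():
        if ref != hyp and count >= threshold:
            neighbours[ref].add(hyp)
            neighbours[hyp].add(ref)

    seen = set()
    groups = []
    for phone in sorted(neighbours):
        if phone in seen:
            continue
        # Depth-first walk collecting every phone reachable from `phone`.
        component = set()
        stack = [phone]
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbours[current] - component)
        seen |= component
        groups.append(sorted(component))
    return groups


def to_questions(groups):
    """Format each group as left- and right-context question lines.

    Assumption: HTK-style "QS" lines are the target format.
    """
    lines = []
    for index, group in enumerate(groups, start=1):
        left = ",".join(f"{p}-*" for p in group)
        right = ",".join(f"*+{p}" for p in group)
        lines.append(f'QS "L_Auto{index}" {{ {left} }}')
        lines.append(f'QS "R_Auto{index}" {{ {right} }}')
    return lines


if __name__ == "__main__":
    for line in to_questions(confusion_groups(TOY_CONFUSIONS, threshold=5)):
        print(line)
```

In this sketch the threshold controls how coarse the questions are: a lower value merges more phones into each group, while a higher value keeps only the strongest confusions.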
