Paper information - Experimenting with a global decision tree for state clustering in automatic speech recognition systems

In modern automatic speech recognition systems, it is standard practice to cluster several logical hidden Markov model states into one physical, clustered state. Typically, the clustering is done such that logical states from different phones or different states cannot share the same clustered state. In this paper, we present a collection of experiments that lift this restriction. The results show that, for Aurora 2 and Aurora 3, much smaller models perform at least as well as the standard baseline. On a TIMIT phone recognition task, we analyze the tying structures introduced, and discuss the implications for building better acoustic models.

Alex Acero | Jasha Droppo

[1] Roland Kuhn, et al. Improving decision trees for acoustic modeling, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[2] Douglas B. Paul. Extensions to phone-state decision-tree clustering: single tree and tagged clustering, 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[3] David Pearce, et al. The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions, 2000, INTERSPEECH.

[4] Jonathan G. Fiscus, et al. DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM (TIMIT), NIST, 1993.

[5] S. J. Young, et al. Tree-based state tying for high accuracy acoustic modelling, 1994.

[6] Tanja Schultz, et al. Enhanced tree clustering with single pronunciation dictionary for conversational speech recognition, 2003, INTERSPEECH.

[7] Hsiao-Wuen Hon, et al. Speaker-independent phone recognition using hidden Markov models, 1989, IEEE Trans. Acoust. Speech Signal Process.