Decision tree distribution tying based on a dimensional split technique

In this paper, a new clustering technique called the Dimensional Split Phonetic Decision Tree (DS-PDT) is proposed. In DS-PDT, state distributions are split dimensionally when a phonetic question is applied. The technique extends decision-tree-based acoustic modeling: it automatically derives an appropriate context-dependent sharing structure for each dimension while maintaining the correlations among the dimensions. In speaker-independent continuous speech recognition experiments, DS-PDT achieved about 8% error reduction over phonetic decision tree clustering.
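As a rough illustration of why a phonetic question can be evaluated per dimension, the sketch below assumes single-Gaussian states with diagonal covariance, for which the usual log-likelihood gain of a split is a sum of per-dimension terms. It is not the DS-PDT algorithm itself: the function names, the `(occupancy, means, variances)` state layout, and the thresholded per-dimension decision are all assumptions made for illustration, and the diagonal simplification ignores the inter-dimension correlations that DS-PDT is stated to maintain.

```python
import math

def pooled_stats(states):
    """Pool per-dimension statistics over a list of states.

    Each state is a hypothetical tuple (occupancy, means, variances)
    describing one diagonal-covariance Gaussian.
    """
    n = sum(s[0] for s in states)
    dims = len(states[0][1])
    mean = [sum(s[0] * s[1][d] for s in states) / n for d in range(dims)]
    # Pooled variance = occupancy-weighted second moment minus squared pooled mean.
    var = [
        sum(s[0] * (s[2][d] + s[1][d] ** 2) for s in states) / n - mean[d] ** 2
        for d in range(dims)
    ]
    return n, mean, var

def per_dimension_gain(yes_states, no_states):
    """Log-likelihood gain of a yes/no split, reported separately per dimension."""
    n_y, _, var_y = pooled_stats(yes_states)
    n_n, _, var_n = pooled_stats(no_states)
    n_p, _, var_p = pooled_stats(yes_states + no_states)
    return [
        0.5 * (n_p * math.log(var_p[d])
               - n_y * math.log(var_y[d])
               - n_n * math.log(var_n[d]))
        for d in range(len(var_p))
    ]

def dimensions_to_split(yes_states, no_states, threshold):
    """Assumed decision rule: split only dimensions whose gain exceeds threshold."""
    gains = per_dimension_gain(yes_states, no_states)
    return [d for d, g in enumerate(gains) if g > threshold]

# Two toy states: the question separates them in dimension 0 but not in dimension 1.
state_a = (100.0, [0.0, 1.0], [1.0, 1.0])
state_b = (100.0, [4.0, 1.0], [1.0, 1.0])
split_dims = dimensions_to_split([state_a], [state_b], threshold=10.0)
```

In this toy case only dimension 0 gains from the question, so a per-dimension rule would split it and keep dimension 1 tied across the two contexts.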
