Paper information - Quasi-continuous local codebook features for multilingual acoustic phonetic modelling

In this article we present a method for defining the question set used to induce acoustic phonetic decision trees. The method is data-driven and yields an ordered feature space, in contrast to the usual categorical one built from phonetic attribute values. Visualization of the feature space verifies that the derived characteristics are meaningful. We apply the features to a multilingual speech recognition task and show that they achieve results comparable to the standard method, which uses question sets devised by human experts.

Asunción Moreno | Frank Diehl

[1] Albino Nogueiras, et al. The demiphone: an efficient subword unit for continuous speech recognition, 1997, EUROSPEECH.

[2] Albino Nogueiras Rodríguez, et al. The demiphone: an efficient subword unit for Continuous Speech Recognition, 1997.

[3] Andrej Zgank, et al. Data driven generation of broad classes for decision tree construction in acoustic modeling, 2003, INTERSPEECH.

[4] Hermann Ney, et al. Automatic question generation for decision tree based state tying, 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).

[5] Teuvo Kohonen, et al. Self-Organizing Maps, 2010.

[6] Asunción Moreno, et al. Local Codebook Features for Mono- and Multilingual Acoustic Phonetic Modelling, 2004.

[7] William J. Byrne, et al. Towards language independent acoustic modeling, 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).

[8] Albino Nogueiras, et al. The demiphone: An efficient contextual subword unit for continuous speech recognition, 2000, Speech Commun.

[9] Asunción Moreno, et al. Acoustic phonetic modeling using local codebook features, 2004, INTERSPEECH.

[10] Ciprian Chelba, et al. Mutual information phone clustering for decision tree induction, 2002, INTERSPEECH.