High performance connected digit recognition using codebook exponents

The authors describe the latest developments by the speech research group at CRIM in speaker-independent connected digit recognition, using hidden Markov models (HMMs) trained with maximum mutual information estimation (MMIE). This work continues that previously described by the authors (see Proc. 1991 IEEE Int. Conf. on Acoust. Speech and Sign. Process., pp.533-536). The main differences are: (1) use of the 20-kHz TI/NIST corpus available on CD-ROM (instead of the 10-kHz distribution tape), (2) use of word models (instead of sub-word units), (3) addition of second derivative parameters, and (4) a more elaborate training procedure for codebook exponents. All experiments described were performed on the complete adult portion of the corpus. The baseline system, using discrete HMMs and MMIE, has a 0.67% word error rate and a 2.03% string error rate. The authors describe techniques that allowed them to greatly improve the recognition rate.
