Piecewise HMM discriminative training

This paper addresses the problem of training HMMs on long files of uninterrupted speech with limited and constant memory requirements. Classical training algorithms usually require training utterances of limited duration, due to the memory needed to store the generated trellis. Our solution, a sliding window Forward-Backward algorithm, makes it possible to exploit databases that are transcribed but not partitioned into sentences. This approach has been tested on the TI/NIST connected digits database and on long sequences of Italian digits. Our experimental results show that, for a lookahead value L of about 1-2 s, it is possible to obtain reestimation counts affected by errors smaller than 10^-7, producing nearly identical reestimated models. Another application of our sliding window Forward-Backward algorithm is MMIE training, which we have tested on the TI/NIST connected digits database, using the recognition tree as the general model rather than N-best hypotheses or word lattices.
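The windowing idea lends itself to a compact illustration. Below is a minimal Python sketch, not the paper's implementation: the forward variable is propagated exactly, while the backward variable is restarted from a uniform vector L frames ahead of the current frame, so memory no longer grows with utterance length. The function name, the uniform initialization at the window edge, and the per-frame backward recomputation are illustrative assumptions.

```python
import numpy as np

def sliding_window_posteriors(obs_lik, A, pi, L):
    """Yield per-frame state posteriors with memory independent of T.

    obs_lik : (T, N) state-conditional observation likelihoods b_j(o_t)
    A       : (N, N) transition matrix
    pi      : (N,) initial state distribution
    L       : lookahead window length in frames (hypothetical sketch)
    """
    T, N = obs_lik.shape
    alpha = pi * obs_lik[0]
    alpha /= alpha.sum()
    for t in range(T):
        # Truncated backward pass over [t, min(t+L, T-1)], started from a
        # uniform beta at the window's right edge; the approximation error
        # shrinks as L grows.
        beta = np.ones(N)
        for s in range(min(t + L, T - 1), t, -1):
            beta = A @ (obs_lik[s] * beta)
            beta /= beta.sum()  # rescale to avoid underflow
        gamma = alpha * beta
        yield t, gamma / gamma.sum()
        if t + 1 < T:
            # One exact forward step; rescaling keeps alpha a distribution.
            alpha = (alpha @ A) * obs_lik[t + 1]
            alpha /= alpha.sum()
```

Accumulating reestimation counts from the yielded posteriors keeps memory bounded regardless of utterance length; a practical implementation would presumably advance the window in blocks rather than recomputing the backward recursion at every frame, as this naive sketch does.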
