Speech recognition experiments using multi-span statistical language models

A multi-span framework was proposed to integrate the various constraints, both local and global, that are present in the language. In this approach, local constraints are captured via n-gram language modeling, while global constraints are taken into account through latent semantic analysis. The perplexity of the resulting multi-span language models has been shown to compare favorably with that of the corresponding n-gram models. This paper reports on actual speech recognition experiments and shows that word error rate is also substantially reduced. On a subset of the Wall Street Journal speaker-independent, 20,000-word vocabulary, continuous speech task, the multi-span framework reduced average word error rate by up to 17%.
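The two metrics named above, perplexity and word error rate, can be illustrated with a minimal sketch based on their standard textbook definitions. It is not the scoring procedure used in the paper, which the abstract does not describe. The function names `word_error_rate`, `perplexity`, and `relative_reduction` are hypothetical, and since the abstract does not say whether the 17% figure is relative or absolute, the last helper only shows how a relative reduction would be computed.

```python
import math


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length.

    Counts substitutions, deletions, and insertions with unit cost
    (standard Levenshtein distance over whitespace-separated words).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the distance between the reference prefix seen so far
    # (before the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # matching i reference words against an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def perplexity(word_probs):
    """Perplexity from the model probability assigned to each test word."""
    if not word_probs:
        raise ValueError("need at least one probability")
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))


def relative_reduction(baseline, improved):
    """Fractional improvement of `improved` over `baseline` (assumed relative)."""
    return (baseline - improved) / baseline
```

Lower is better for both metrics: a model that assigns higher probability to the test words has lower perplexity, and a recognizer whose output needs fewer word edits to match the reference transcript has lower word error rate.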
