Fast likelihood computation methods for continuous mixture densities in large vocabulary speech recognition

This paper studies algorithms for reducing the computational effort of the mixture density calculations in HMM-based speech recognition systems. These likelihood calculations take about 70-85% of the total recognition time in the RWTH system for large vocabulary continuous speech recognition. To reduce the computational cost of the likelihood calculations, we investigate several space partitioning methods. A detailed comparison of these techniques is given on the North American Business Corpus (NAB'94) for a 20 000-word task. As a result, the so-called projection search algorithm in combination with the VQ method reduces the cost of likelihood computation by a factor of about 8 with no significant loss in word recognition accuracy.
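To make the cost trade-off concrete, the following is a minimal sketch of the general idea behind VQ-based preselection of mixture components. It is not the paper's implementation. It assumes diagonal-covariance Gaussians, and the function names, the codebook, and the per-cell shortlists are hypothetical, chosen for illustration only. The baseline evaluates every component of the mixture; the approximation evaluates only the components shortlisted for the VQ cell that contains the observation vector.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def log_sum_exp(scores):
    """Numerically stable log of a sum of exponentials."""
    top = max(scores)
    return top + math.log(sum(math.exp(s - top) for s in scores))


def mixture_loglik_full(x, weights, means, variances):
    """Baseline: evaluate every mixture component."""
    return log_sum_exp([
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ])


def nearest_codeword(x, codebook):
    """Index of the codeword closest to x (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((xi - ci) ** 2 for xi, ci in zip(x, codebook[k])),
    )


def mixture_loglik_preselected(x, weights, means, variances, codebook, shortlists):
    """Approximation: evaluate only the components shortlisted for x's VQ cell.

    Assumes every shortlist is non-empty (an illustrative simplification).
    """
    cell = nearest_codeword(x, codebook)
    return log_sum_exp([
        math.log(weights[i]) + log_gauss_diag(x, means[i], variances[i])
        for i in shortlists[cell]
    ])


# Toy example: four components in two well-separated clusters.
weights = [0.25, 0.25, 0.25, 0.25]
means = [[0.0, 0.0], [1.0, 0.0], [10.0, 10.0], [9.0, 10.0]]
variances = [[1.0, 1.0]] * 4
codebook = [[0.0, 0.0], [10.0, 10.0]]   # hypothetical VQ codebook
shortlists = [[0, 1], [2, 3]]           # hypothetical components per cell

x = [0.5, 0.0]
full = mixture_loglik_full(x, weights, means, variances)
approx = mixture_loglik_preselected(x, weights, means, variances, codebook, shortlists)
```

In this toy setup the shortlist halves the number of Gaussian evaluations while the two scores agree to within floating-point precision, because the skipped components lie far from the observation. With real acoustic models the saving and the approximation error both depend on how the codebook and shortlists are built.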
