Model complexity control and compression using discriminative growth functions

State-of-the-art large vocabulary speech recognition systems are highly complex. Many techniques affect both system complexity and recognition performance. The need to determine the appropriate complexity without having to build each possible system has led to the development of automatic complexity control criteria. In this paper, further experiments are carried out using a recently proposed criterion based on marginalizing a maximum mutual information (MMI) growth function. The use of this criterion is detailed for determining the appropriate dimensionality in a multiple HLDA system and the number of components per state. A scheme for also using this criterion for model compression is described. Experimental results on a spontaneous telephone speech recognition task are reported. Initial system compression experiments are inconclusive. However, comparing a standard state-of-the-art system with one generated using complexity control shows a reduction in word error rate.
