Gradient steepness metrics using extended Baum-Welch transformations for universal pattern recognition tasks

In many pattern recognition tasks, given some input data and a family of models, the "best" model is defined as the one which maximizes the likelihood of the data given the model. Extended Baum-Welch (EBW) transformations are most commonly used as a discriminative technique for estimating parameters of Gaussian mixtures. In this paper, we use the EBW transformations to derive a novel gradient steepness measurement to find which model best explains the data. We use this gradient measurement to derive a variety of EBW metrics to explain model fit to the data. We apply these EBW metrics to audio segmentation via Hidden Markov Models (HMMs) and show that our gradient steepness measurement is robust across different EBW metrics and model complexities.
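The likelihood-based notion of a "best" model that the abstract starts from can be illustrated with a minimal sketch. It covers only the baseline maximum-likelihood selection, not the EBW gradient steepness measurement itself, whose formulas are not given here. The one-dimensional Gaussian models, the class names `speech` and `music`, and the helper names `gaussian_log_likelihood` and `best_model` are all hypothetical choices made for illustration.

```python
import math


def gaussian_log_likelihood(data, mean, var):
    """Total log-likelihood of 1-D samples under a Gaussian N(mean, var)."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)
        for x in data
    )


def best_model(data, models):
    """Score every candidate model and return (winning name, all scores)."""
    scores = {
        name: gaussian_log_likelihood(data, mean, var)
        for name, (mean, var) in models.items()
    }
    # The "best" model is the one that maximizes the data likelihood.
    return max(scores, key=scores.get), scores


# Hypothetical candidate models: name -> (mean, variance).
models = {"speech": (0.0, 1.0), "music": (3.0, 1.0)}
# Hypothetical observations that lie close to the "music" mean.
data = [2.8, 3.1, 3.4, 2.9]

winner, scores = best_model(data, models)
```

In this toy setting the likelihood score alone decides the winner; the EBW metrics described in the abstract are proposed as an alternative way of scoring how well each model explains the data.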
