Gaussian mixture modeling with volume preserving nonlinear feature space transforms

This paper introduces a new class of nonlinear feature space transformations for Gaussian mixture models. Transformations in this class admit computationally efficient training algorithms. In speech recognition experiments, quadratic feature space transforms yield modest improvements in recognition performance. The quadratic feature space transforms are also shown to be beneficial in an adaptation setting.
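
To make the term "volume preserving" concrete, the following is a minimal sketch of one simple way to build a quadratic map whose Jacobian determinant is 1 everywhere. It is an illustrative assumption, not necessarily the parameterization used in the paper: each output coordinate equals the matching input coordinate plus a quadratic function of the earlier coordinates only. The names `quadratic_shear`, `coeffs`, and `numerical_jacobian` are hypothetical. Because the Jacobian of such a map is lower triangular with a unit diagonal, its determinant is 1, so the log-likelihood of the transformed features needs no Jacobian correction term.

```python
import random

def quadratic_shear(x, coeffs):
    """Hypothetical volume-preserving quadratic map (illustrative only).

    y[0] = x[0]
    y[i] = x[i] + sum over j <= k < i of coeffs[i][j][k] * x[j] * x[k]
    """
    y = list(x)
    for i in range(1, len(x)):
        y[i] += sum(coeffs[i][j][k] * x[j] * x[k]
                    for j in range(i) for k in range(j, i))
    return y

def numerical_jacobian(f, x, eps=1e-5):
    """Central-difference Jacobian: J[i][j] approximates d f_i / d x_j."""
    d = len(x)
    J = [[0.0] * d for _ in range(d)]
    for j in range(d):
        up = list(x)
        up[j] += eps
        dn = list(x)
        dn[j] -= eps
        f_up, f_dn = f(up), f(dn)
        for i in range(d):
            J[i][j] = (f_up[i] - f_dn[i]) / (2 * eps)
    return J

random.seed(0)
d = 4
# Arbitrary coefficients; any values keep the map volume preserving.
coeffs = [[[random.gauss(0, 1) for _ in range(d)]
           for _ in range(d)] for _ in range(d)]
x = [random.gauss(0, 1) for _ in range(d)]

J = numerical_jacobian(lambda v: quadratic_shear(v, coeffs), x)

# Unit diagonal and zero upper triangle imply det(J) = 1.
diag_ok = all(abs(J[i][i] - 1.0) < 1e-8 for i in range(d))
upper_ok = all(abs(J[i][j]) < 1e-8
               for i in range(d) for j in range(i + 1, d))
print(diag_ok, upper_ok)
```

The triangular structure is a design choice made here for simplicity: it guarantees a unit determinant for any coefficient values, so the coefficients could in principle be trained freely without a constraint.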
