Paper information - Acoustic modeling with mixtures of subspace constrained exponential models

Gaussian distributions are usually parameterized with their natural parameters: the mean µ and the covariance Σ. They can also be re-parameterized as exponential models with canonical parameters P = Σ^{-1} and ψ = Pµ. In this paper we consider modeling acoustics with mixtures of Gaussians parameterized with canonical parameters, where the parameters are constrained to lie in a shared affine subspace. This class of models includes Gaussian models with various constraints on their parameters: diagonal covariances, MLLT models, and the recently proposed EMLLT and SPAM models. We describe how to perform maximum likelihood estimation of the subspace and of the parameters within a fixed subspace. In speech recognition experiments, we show that this model improves upon all of the above classes of models with roughly the same number of parameters and with little computational overhead. In particular, we get a 30-40% relative improvement over LDA+MLLT models when using roughly the same number of parameters.
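The re-parameterization described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the 2-D helper functions, the names `to_canonical`, `logpdf_canonical`, and `affine_params`, and all numeric values are assumptions made for illustration. The sketch converts (µ, Σ) to the canonical parameters (ψ, P), evaluates the Gaussian log-density in exponential-model form, and builds one Gaussian's canonical parameters as a base point plus a weighted sum of shared basis elements, which is one plausible reading of "a shared affine subspace".

```python
import math

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def matvec(m, v):
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def to_canonical(mu, sigma):
    """Map (mu, Sigma) to canonical parameters psi = P mu, P = Sigma^{-1}."""
    P = inv2(sigma)
    return matvec(P, mu), P

def logpdf_moment(x, mu, sigma):
    """2-D Gaussian log-density in the usual mean/covariance form."""
    diff = [a - b for a, b in zip(x, mu)]
    quad = dot(diff, matvec(inv2(sigma), diff))
    return -0.5 * quad - 0.5 * math.log(det2(sigma)) - math.log(2 * math.pi)

def logpdf_canonical(x, psi, P):
    """Same density written as an exponential model in (psi, P):
    -1/2 x'Px + psi'x - 1/2 psi'P^{-1}psi + 1/2 log det P - (d/2) log 2pi, d = 2.
    """
    mu = matvec(inv2(P), psi)  # P^{-1} psi
    return (-0.5 * dot(x, matvec(P, x)) + dot(psi, x)
            - 0.5 * dot(psi, mu) + 0.5 * math.log(det2(P))
            - math.log(2 * math.pi))

def affine_params(base, basis, weights):
    """Hypothetical subspace constraint: (psi, P) = base + sum_k w_k * basis_k.
    The base and basis are shared by all Gaussians; only the weights differ.
    """
    psi, P = list(base[0]), [row[:] for row in base[1]]
    for w, (psi_k, P_k) in zip(weights, basis):
        psi = [a + w * b for a, b in zip(psi, psi_k)]
        P = [[a + w * b for a, b in zip(r, rk)] for r, rk in zip(P, P_k)]
    return psi, P

# Illustrative values only.
mu, sigma, x = [1.0, -0.5], [[2.0, 0.3], [0.3, 1.0]], [0.2, 0.4]
psi, P = to_canonical(mu, sigma)
gap = abs(logpdf_moment(x, mu, sigma) - logpdf_canonical(x, psi, P))

base = ([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
basis = [([1.0, 0.0], [[1.0, 0.0], [0.0, 0.0]]),
         ([0.0, 1.0], [[0.0, 0.5], [0.5, 0.0]])]
psi_g, P_g = affine_params(base, basis, [0.5, 0.2])
```

In this sketch `gap` is zero up to rounding error, since both functions evaluate the same density. A real system would also need to keep every constrained P positive definite; the weights above were simply picked so that `P_g` is.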

Scott Axelrod | Ramesh A. Gopinath | Karthik Visweswariah
