The Geometry of the Channel Space in GMM-Based Speaker Recognition

We describe an extension of the joint factor analysis model of speaker and channel variability in which channel supervectors are modeled by a mixture of low-rank Gaussians rather than by a single unimodal Gaussian. This extended model includes both data-driven feature mapping and the standard joint factor analysis model as limiting cases, which lets us explore the range of possibilities between these two extremes. Our experimental results indicate that unimodal models of relatively high rank perform better than mixture models of lower rank, and they confirm the appropriateness of the unimodal assumption in the standard joint factor analysis model.
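The mixture-of-low-rank-Gaussians idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each mixture component k has a weight, a mean offset m_k, and a low-rank loading matrix U_k, so that a channel supervector is drawn as c = m_k + U_k x with x standard normal. The function name `sample_channel_supervector`, the parameter names, and the toy dimensions are all hypothetical. Under this assumed parameterization, a single component with zero mean gives a unimodal low-rank Gaussian, while components of rank zero give a discrete set of channel offsets, which is one plausible reading of the feature-mapping limit.

```python
import random

def sample_channel_supervector(weights, means, loadings, rng):
    """Draw (k, c) with c = m_k + U_k x, k ~ Categorical(weights), x ~ N(0, I).

    Assumed layout (illustrative only): means[k] is a list of D floats and
    loadings[k] is a list of D rows, each holding R_k floats (R_k may be 0).
    """
    # Pick a mixture component according to the mixture weights.
    k = rng.choices(range(len(weights)), weights=weights)[0]
    U = loadings[k]
    rank = len(U[0]) if U else 0
    # Low-dimensional channel factors for the chosen component.
    x = [rng.gauss(0.0, 1.0) for _ in range(rank)]
    # Supervector = component mean plus the low-rank perturbation U_k x.
    c = [m + sum(u * xi for u, xi in zip(row, x)) for m, row in zip(means[k], U)]
    return k, c

rng = random.Random(0)

# Unimodal case: one zero-mean component of rank 2 in a toy 3-dimensional space.
U_single = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
k_uni, c_uni = sample_channel_supervector([1.0], [[0.0, 0.0, 0.0]], [U_single], rng)

# Rank-zero case: two components with no loadings, so c is one of two fixed offsets.
offsets = [[-1.0, 0.0, 1.0], [2.0, 2.0, 2.0]]
empty = [[], [], []]
k_disc, c_disc = sample_channel_supervector([0.5, 0.5], offsets, [empty, empty], rng)
```

In the unimodal case every draw lies in the two-dimensional subspace spanned by the columns of `U_single`; in the rank-zero case every draw coincides with one of the fixed offsets. Intermediate configurations (several components, each of modest rank) fall between these two extremes.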
