Variational Bayesian GMM for speech recognition

In this paper, we explore the potential of Variational Bayesian (VB) learning for speech recognition problems. VB methods handle model selection more rigorously and generalize MAP learning. VB training of Gaussian Mixture Models is less affected by overfitting and singular solutions than EM-ML training. We compare two types of Variational Bayesian Gaussian Mixture Models (VBGMM) with classical EM-ML GMMs on a phoneme recognition task using the TIMIT database. VB learning performs better than EM-ML learning and is less sensitive to the initial model guess.
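The contrast between EM-ML and VB training of a GMM can be illustrated with off-the-shelf tools. The sketch below is not the paper's system; it assumes scikit-learn's `GaussianMixture` (EM-ML) and `BayesianGaussianMixture` (VB) on synthetic 1-D data, showing how the VB Dirichlet prior on the mixing weights performs implicit model selection by driving superfluous components toward zero weight.

```python
import numpy as np
from sklearn.mixture import GaussianMixture, BayesianGaussianMixture

rng = np.random.default_rng(0)
# Synthetic "feature" data: two well-separated Gaussians, 500 points each.
X = np.concatenate([rng.normal(-2.0, 0.5, 500),
                    rng.normal(2.0, 0.5, 500)]).reshape(-1, 1)

# Classical EM-ML GMM: all 10 components receive mass, risking
# overfitting and near-singular components.
em = GaussianMixture(n_components=10, random_state=0).fit(X)

# VB GMM: a small weight_concentration_prior lets the model prune
# unneeded components (implicit model selection), so the result is
# less dependent on the initial guess of the component count.
vb = BayesianGaussianMixture(n_components=10,
                             weight_concentration_prior=1e-2,
                             random_state=0).fit(X)

print("EM components with weight > 1%:", int(np.sum(em.weights_ > 0.01)))
print("VB components with weight > 1%:", int(np.sum(vb.weights_ > 0.01)))
```

Typically the VB fit concentrates the mass on far fewer components than the 10 requested, while the EM-ML fit spreads mass across most of them.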