Unsupervised Online Adaptation of Segmental Switching Linear Gaussian Hidden Markov Models for Robust Speech Recognition

In our previous work, a segmental switching linear Gaussian hidden Markov model (SSLGHMM) was proposed to model "noisy" speech utterances for robust speech recognition. Both maximum likelihood (ML) and minimum classification error (MCE) training procedures were developed for estimating the model parameters, and their effectiveness was confirmed by evaluation experiments on the Aurora2 and Aurora3 databases. In this paper, we present an ML approach to unsupervised online adaptation (OLA) of SSLGHMM parameters to achieve further performance improvement. An important implementation issue, namely how to initialize the switching linear Gaussian model parameters, is also studied. Evaluation results on the Finnish Aurora3 database show that, compared with a baseline system based on ML-trained SSLGHMMs, unsupervised OLA yields relative word error rate reductions of 4.3%, 9.1%, and 17.8% for the well-matched, medium-mismatched, and high-mismatched conditions, respectively.