Maximum conditional likelihood linear regression and maximum a posteriori for hidden conditional random fields speaker adaptation

This paper shows how to improve hidden conditional random fields (HCRFs) for phone classification by applying speaker adaptation techniques. These include maximum a posteriori (MAP) adaptation and a new technique we introduce, maximum conditional likelihood linear regression (MCLLR), a discriminative variant of the widely used MLLR algorithm. In previous work, we and others have shown that HCRFs outperform even discriminatively trained HMMs. In this paper, we show that HCRFs adapted via MCLLR or MAP also outperform similarly adapted HMMs. We also compare the performance of MCLLR and MAP adaptation with different amounts of adaptation data. MCLLR performs better when the amount of adaptation data is relatively small, while MAP outperforms MCLLR when more adaptation data is available.
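The contrast between the two adaptation families can be illustrated with a minimal sketch. It does not implement the paper's HCRF formulation or the MCLLR objective; it assumes the textbook Gaussian-mean case, where MAP interpolates each mean between its prior value and the adaptation-data average, and a linear-regression method applies one shared affine transform to all means. The function names and the prior weight `tau` are hypothetical, chosen only for illustration.

```python
import numpy as np

def map_adapt_mean(prior_mean, frames, tau=10.0):
    """Textbook MAP update of one Gaussian mean (illustrative assumption).

    The result is a weighted average of the prior mean and the sample mean;
    tau is a hypothetical prior weight, not a value from the paper.
    """
    n = len(frames)
    if n == 0:
        # No adaptation data for this parameter: it stays at the prior.
        return prior_mean.copy()
    sample_mean = frames.mean(axis=0)
    return (tau * prior_mean + n * sample_mean) / (tau + n)

def apply_affine_transform(means, A, b):
    """Linear-regression style update: one shared transform moves every mean.

    means has shape (K, d); A is (d, d); b is (d,). How A and b are estimated
    (likelihood or conditional likelihood) is outside this sketch.
    """
    return means @ A.T + b

prior = np.zeros(2)
few = np.tile([2.0, 2.0], (2, 1))     # 2 adaptation frames
many = np.tile([2.0, 2.0], (990, 1))  # 990 adaptation frames

print(map_adapt_mean(prior, few))   # remains near the prior
print(map_adapt_mean(prior, many))  # moves close to the data mean

# A shared transform also shifts means that received no adaptation data.
all_means = np.array([[0.0, 0.0], [1.0, 1.0]])
print(apply_affine_transform(all_means, np.eye(2), np.array([1.0, 1.0])))
```

Under these assumptions, the per-parameter MAP update barely moves with two frames but approaches the data estimate with many, whereas the shared transform updates every mean from few parameters. This is consistent with the trend reported above, though the sketch itself is not evidence for it.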
