Regularization, adaptation, and non-independent features improve hidden conditional random fields for phone classification

We show a number of improvements in the use of Hidden Conditional Random Fields (HCRFs) for phone classification on the TIMIT and Switchboard corpora. We first show that the use of regularization effectively prevents overfitting, improving over other methods such as early stopping. We then show that HCRFs are able to make use of non-independent features in phone classification, at least with small numbers of mixture components, while HMMs degrade due to their strong independence assumptions. Finally, we successfully apply Maximum a Posteriori adaptation to HCRFs, decreasing the phone classification error rate in the Switchboard corpus by around 1% -5% given only small amounts of adaptation data.

[1]  Andrew McCallum,et al.  Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.

[2]  Harold J. Kushner,et al.  Stochastic Approximation Algorithms and Applications , 1997, Applications of Mathematics.

[3]  Trevor Darrell,et al.  Hidden-state Conditional Random Fields , 2006 .

[4]  Alex Acero,et al.  Hidden conditional random fields for phone classification , 2005, INTERSPEECH.

[5]  James R. Glass,et al.  Heterogeneous acoustic measurements for phonetic classification 1 , 1997, EUROSPEECH.

[6]  Andrew McCallum,et al.  An Introduction to Conditional Random Fields for Relational Learning , 2007 .

[7]  James R. Glass,et al.  HETEROGENEOUS ACOUSTIC MEASUREMENTS FOR PHONETIC CLASSIFICATION , 1997 .

[8]  Alex Acero,et al.  Adaptation of Maximum Entropy Capitalizer: Little Data Can Help a Lo , 2006, Comput. Speech Lang..

[9]  D K Smith,et al.  Numerical Optimization , 2001, J. Oper. Res. Soc..

[10]  Steven Greenberg,et al.  INSIGHTS INTO SPOKEN LANGUAGE GLEANED FROM PHONETIC TRANSCRIPTION OF THE SWITCHBOARD CORPUS , 1996 .

[11]  Ben Taskar,et al.  An Introduction to Conditional Random Fields for Relational Learning , 2007 .

[12]  Hsiao-Wuen Hon,et al.  Speaker-independent phone recognition using hidden Markov models , 1989, IEEE Trans. Acoust. Speech Signal Process..

[13]  Andrew McCallum,et al.  A Conditional Random Field for Discriminatively-trained Finite-state String Edit Distance , 2005, UAI.

[14]  Trevor Darrell,et al.  Hidden Conditional Random Fields , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15]  A. Nadas,et al.  A decision theorectic formulation of a training problem in speech recognition and a comparison of training by unconditional versus conditional maximum likelihood , 1983 .

[16]  Lawrence K. Saul,et al.  Large Margin Gaussian Mixture Modeling for Phonetic Classification and Recognition , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[17]  Chin-Hui Lee,et al.  Bayesian Learning of Gaussian Mixture Densities for Hidden Markov Models , 1991, HLT.