Domain Adaptation of Maximum Entropy Language Models

We investigate a recently proposed Bayesian adaptation method for building style-adapted maximum entropy language models for speech recognition, given a large corpus of written language data and a small corpus of speech transcripts. Experiments show that the method consistently outperforms linear interpolation which is typically used in such cases.

[1]  Stanley F. Chen,et al.  Performance Prediction for Exponential Language Models , 2009, NAACL.

[2]  Stanley F. Chen,et al.  Shrinking Exponential Language Models , 2009, NAACL.

[3]  Hermann Ney,et al.  Improved clustering techniques for class-based statistical language modelling , 1993, EUROSPEECH.

[4]  Christopher D. Manning,et al.  Hierarchical Bayesian Domain Adaptation , 2009, NAACL.

[5]  Jun Wu,et al.  Building a topic-dependent maximum entropy model for very large corpora , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[6]  Hal Daumé,et al.  Frustratingly Easy Domain Adaptation , 2007, ACL.

[7]  Paul Deléglise,et al.  The LIUM speech transcription system: a CMU Sphinx III-based system for French broadcast news , 2005, INTERSPEECH.

[8]  Ronald Rosenfeld,et al.  A maximum entropy approach to adaptive statistical language modelling , 1996, Comput. Speech Lang..

[9]  Alex Acero,et al.  Adaptation of Maximum Entropy Capitalizer: Little Data Can Help a Lo , 2006, Comput. Speech Lang..

[10]  Ronald Rosenfeld,et al.  A survey of smoothing techniques for ME models , 2000, IEEE Trans. Speech Audio Process..

[11]  Joshua Goodman,et al.  Classes for fast maximum entropy training , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).