Randomized Language Models via Perfect Hash Functions

We propose a succinct randomized language model that employs a perfect hash function to encode fingerprints of n-grams together with their associated probabilities, backoff weights, or other parameters. The scheme can represent any standard n-gram model and is easily combined with existing model reduction techniques such as entropy pruning. We demonstrate the space savings of the scheme via machine translation experiments within a distributed language modeling framework.
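The core idea, a perfect hash that maps each n-gram to its own slot holding only a short fingerprint and a parameter value, can be illustrated with a small Python sketch. This is not the construction used in the paper: it swaps in a simple brute-force "hash and displace" perfect hash, stores unquantized floats, and all names (`FingerprintLM`, `fp_bits`, `get`) are hypothetical and chosen only for illustration.

```python
import hashlib
from typing import Dict, List, Optional, Tuple


def _h(key: str, seed: int) -> int:
    """Stable 64-bit hash of `key` under an integer seed."""
    data = f"{seed}:{key}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


class FingerprintLM:
    """Toy n-gram table: the n-gram strings themselves are never stored."""

    def __init__(self, params: Dict[str, float], fp_bits: int = 12) -> None:
        keys = list(params)
        n = len(keys)
        self.size = n + n // 4 + 1           # a little slack so seeds are found quickly
        self.fp_mask = (1 << fp_bits) - 1
        self.num_buckets = max(1, n // 2)
        buckets: List[List[str]] = [[] for _ in range(self.num_buckets)]
        for k in keys:
            buckets[_h(k, 0) % self.num_buckets].append(k)

        self.seeds = [0] * self.num_buckets
        self.slots: List[Optional[Tuple[int, float]]] = [None] * self.size
        # Place the largest buckets first; each bucket searches for a seed
        # that sends all of its keys to distinct, still-empty slots.
        for b in sorted(range(self.num_buckets), key=lambda i: -len(buckets[i])):
            if not buckets[b]:
                continue
            seed = 1
            while True:
                pos = [_h(k, seed) % self.size for k in buckets[b]]
                if len(set(pos)) == len(pos) and all(self.slots[p] is None for p in pos):
                    break
                seed += 1
            self.seeds[b] = seed
            for k, p in zip(buckets[b], pos):
                self.slots[p] = (self._fingerprint(k), params[k])

    def _fingerprint(self, key: str) -> int:
        return _h(key, -1) & self.fp_mask

    def get(self, ngram: str) -> Optional[float]:
        """Return the stored value, or None if the fingerprint does not match."""
        seed = self.seeds[_h(ngram, 0) % self.num_buckets]
        slot = self.slots[_h(ngram, seed) % self.size]
        if slot is None or slot[0] != self._fingerprint(ngram):
            return None
        return slot[1]


# Usage: log-probabilities keyed by n-gram (values are made up for illustration).
lm = FingerprintLM({"the cat": -1.2, "cat sat": -2.5, "sat on": -0.7, "on the": -0.4})
stored_value = lm.get("cat sat")
```

Every stored n-gram is recovered exactly, because the perfect hash gives each one a private slot. An n-gram that was never stored lands in some arbitrary slot and is rejected unless its fingerprint happens to match, which in this sketch occurs with probability of roughly 2^-fp_bits. That trade-off between fingerprint width and false-positive rate is what makes the model "randomized".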
