Localized Smoothing for Multinomial Language Models
We explore a formal approach to the zero-frequency problem that arises when probabilistic models are applied to language. This report introduces the zero-frequency problem in the context of probabilistic language models, describes several popular solutions, and introduces localized smoothing, a potentially better alternative. We formulate localized smoothing as a two-step maximization process, outline the estimation details for both steps, and present experiments that show the technique has potential for improving performance.
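As a rough illustration of the zero-frequency problem mentioned above, the sketch below estimates a multinomial (unigram) model by maximum likelihood and then applies additive (Laplace) smoothing. Additive smoothing is used here only as a familiar baseline and is an assumption of this example; it is not the localized smoothing technique the report proposes. The toy vocabulary, the training text, and the function names are all hypothetical.

```python
from collections import Counter


def mle(counts, vocab):
    """Maximum-likelihood multinomial estimate: relative frequencies."""
    total = sum(counts.values())
    return {w: counts[w] / total for w in vocab}


def additive(counts, vocab, alpha=1.0):
    """Additive (Laplace) smoothing: add alpha pseudo-counts to every word.

    This is a generic baseline, not the report's localized smoothing.
    """
    total = sum(counts.values())
    denom = total + alpha * len(vocab)
    return {w: (counts[w] + alpha) / denom for w in vocab}


def sequence_probability(model, words):
    """Probability of a word sequence under a unigram model."""
    p = 1.0
    for w in words:
        p *= model[w]
    return p


# Toy data (hypothetical): "dog" is in the vocabulary but never observed.
vocab = ["the", "cat", "sat", "dog"]
counts = Counter("the cat sat the cat".split())

mle_model = mle(counts, vocab)
smoothed_model = additive(counts, vocab)

# Under the unsmoothed model, one unseen word zeroes out the whole sequence.
print(sequence_probability(mle_model, ["the", "dog"]))
# Under the smoothed model, the same sequence keeps a small nonzero probability.
print(sequence_probability(smoothed_model, ["the", "dog"]))
```

The unsmoothed estimate assigns zero probability to any text containing an unseen word, which is the failure that smoothing methods are designed to avoid.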