Nearest Neighbor Smoothing of Language Models in IR
We hypothesize that using one or more nearest neighbors of a document will give better estimates of term probabilities by effectively increasing the sample size. We will treat Problems 1 and 2 as the same problem, basing our work on a model similar to that of Lavrenko. We will incorporate an average of the probabilities from the k nearest neighbors, combining it by linear interpolation with the estimate of the probability of a term given the document. The interpolation will also include an estimate based on collection-wide statistics.
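The three-way interpolation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the weights `lam_doc` and `lam_nn`, and the use of maximum-likelihood estimates for each component are all assumptions, and the k nearest neighbors are taken as given because the abstract does not say how they are selected.

```python
from collections import Counter


def mle(term, counts):
    """Maximum-likelihood estimate P(term | text) from a Counter of term counts."""
    total = sum(counts.values())
    return counts[term] / total if total else 0.0


def knn_smoothed_prob(term, doc, neighbors, collection, lam_doc=0.5, lam_nn=0.3):
    """Interpolate document, neighbor-average, and collection estimates.

    lam_doc and lam_nn are illustrative weights (assumed, not from the source);
    the remaining mass 1 - lam_doc - lam_nn goes to the collection model.
    """
    lam_coll = 1.0 - lam_doc - lam_nn
    if min(lam_doc, lam_nn, lam_coll) < 0:
        raise ValueError("weights must be non-negative and sum to 1")
    if not neighbors:
        raise ValueError("at least one neighbor is required")

    p_doc = mle(term, doc)
    # Unweighted average of the term's probability over the k nearest neighbors.
    p_nn = sum(mle(term, n) for n in neighbors) / len(neighbors)
    p_coll = mle(term, collection)
    return lam_doc * p_doc + lam_nn * p_nn + lam_coll * p_coll


# Toy data: one document, two hypothetical neighbors, and their union as the collection.
doc = Counter("the cat sat".split())
neighbors = [Counter("the cat ran".split()), Counter("a dog sat down".split())]
collection = doc + neighbors[0] + neighbors[1]

p_dog = knn_smoothed_prob("dog", doc, neighbors, collection)
```

In this toy setup the term "dog" never occurs in the document, yet it receives a non-zero probability from both the neighbor average and the collection estimate, which is the effect the larger effective sample is meant to produce.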
[1] Victor Lavrenko. Localized Smoothing for Multinomial Language Models, 2000.
[2] W. Bruce Croft, et al. A language modeling approach to information retrieval, 1998, SIGIR '98.