Paper Information - Nearest Neighbor Smoothing of Language Models in IR

Nearest Neighbor Smoothing of Language Models in IR

We hypothesize that using one or more nearest neighbors of a document will give better estimates of the term probabilities by effectively increasing the sample size. We will treat Problems 1 and 2 as the same problem, basing our work on a model similar to that of Lavrenko. We will incorporate an average of the probabilities from the k nearest neighbors, combining this average by linear interpolation with the estimate of the probability of a term given the document. The interpolation will also include an estimate based on collection-wide statistics.
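The three-way interpolation described above can be illustrated with a minimal sketch. Everything beyond the overall shape of the mixture is an assumption made for illustration: the abstract does not say how neighbors are found (cosine similarity over raw term counts is used here), how the component probabilities are estimated (maximum likelihood here), or what the interpolation weights are (the hypothetical parameters `lam_doc`, `lam_nn`, and `lam_coll`, which must sum to 1).

```python
import math
from collections import Counter


def ml_estimate(counts, term):
    """Maximum-likelihood estimate of P(term) from a bag of term counts."""
    total = sum(counts.values())
    return counts[term] / total if total else 0.0


def cosine(a, b):
    """Cosine similarity between two term-count vectors (assumed metric)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_neighbors(doc_id, docs, k):
    """Return the ids of the k documents most similar to doc_id."""
    others = [d for d in docs if d != doc_id]
    others.sort(key=lambda d: cosine(docs[doc_id], docs[d]), reverse=True)
    return others[:k]


def smoothed_prob(term, doc_id, docs, k=2,
                  lam_doc=0.5, lam_nn=0.3, lam_coll=0.2):
    """Interpolate document, neighbor-average, and collection estimates.

    The weights are hypothetical; the source does not specify them.
    """
    assert abs(lam_doc + lam_nn + lam_coll - 1.0) < 1e-9

    # Estimate from the document itself.
    p_doc = ml_estimate(docs[doc_id], term)

    # Average of the estimates from the k nearest neighbors.
    neighbors = nearest_neighbors(doc_id, docs, k)
    p_nn = (sum(ml_estimate(docs[n], term) for n in neighbors) / len(neighbors)
            if neighbors else 0.0)

    # Collection-wide estimate over all documents pooled together.
    collection = Counter()
    for counts in docs.values():
        collection.update(counts)
    p_coll = ml_estimate(collection, term)

    return lam_doc * p_doc + lam_nn * p_nn + lam_coll * p_coll


# Toy collection, invented purely for illustration.
docs = {
    "d1": Counter("language model retrieval".split()),
    "d2": Counter("language model smoothing".split()),
    "d3": Counter("nearest neighbor smoothing".split()),
    "d4": Counter("football match report".split()),
}
```

In this toy collection, "smoothing" never occurs in `d1`, yet `smoothed_prob("smoothing", "d1", docs, k=1)` is nonzero because both the nearest neighbor `d2` and the collection contain the term. Since each component is a proper distribution and the weights sum to 1, the smoothed estimates over the vocabulary also sum to 1.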

Paul Ogilvie

[1] Victor Lavrenko. Localized Smoothing for Multinomial Language Models, 2000.

[2] W. Bruce Croft, et al. A language modeling approach to information retrieval, 1998, SIGIR '98.