Full Text Retrieval based on Probabilistic Equations with Coefficients fitted by Logistic Regression

The experiments described here are part of a research program whose objective is to develop a full text retrieval methodology that is statistically sound and powerful, yet reasonably simple. The methodology is based on a probabilistic model whose parameters are fitted empirically to a learning set of relevance judgements by logistic regression. The method was applied to the TIPSTER data, with optimally relativized frequencies of occurrence of match stems as the regression variables. In a routing retrieval experiment, these were supplemented by other variables corresponding to sums of log odds associated with particular match stems.
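
The general idea of fitting retrieval coefficients by logistic regression can be illustrated with a minimal sketch. It is not the authors' implementation: the relativization formula count / (length + k), the constant k, the two regression variables, the toy learning set, and every function name below are assumptions made purely for illustration. The additional log-odds variables of the routing experiment are not covered.

```python
import math

def relativized_frequency(count, length, k=1.0):
    # Hypothetical relativization: raw count damped by text length plus a constant k.
    # The form and the value of k are assumptions, not taken from the abstract.
    return count / (length + k)

def features(query_stems, doc_stems):
    # Two illustrative regression variables, each summed over the match stems
    # (stems occurring in both the query and the document).
    matches = set(query_stems) & set(doc_stems)
    q = sum(relativized_frequency(query_stems.count(s), len(query_stems)) for s in matches)
    d = sum(relativized_frequency(doc_stems.count(s), len(doc_stems)) for s in matches)
    return [q, d]

def log_odds(weights, x):
    # Linear log-odds of relevance; weights[0] is the intercept.
    return weights[0] + sum(w * v for w, v in zip(weights[1:], x))

def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    # Plain batch gradient descent on the logistic log-loss.
    weights = [0.0] * (len(rows[0]) + 1)
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            p = 1.0 / (1.0 + math.exp(-log_odds(weights, x)))
            err = p - y
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights

# Toy learning set (invented): one query, four documents, binary relevance judgements.
QUERY = ["oil", "price"]
DOCS = [
    ["oil", "price", "oil", "rise"],
    ["price", "oil", "market"],
    ["weather", "rain", "oil", "cloud", "wind", "storm"],
    ["football", "match", "score"],
]
LABELS = [1, 1, 0, 0]

rows = [features(QUERY, doc) for doc in DOCS]
weights = fit_logistic(rows, LABELS)

# Rank documents by estimated log-odds of relevance, highest first.
scores = [log_odds(weights, x) for x in rows]
ranking = sorted(range(len(DOCS)), key=lambda i: scores[i], reverse=True)
```

Ranking by log-odds gives the same order as ranking by estimated probability of relevance, since the logistic function is monotonic. In practice the coefficients would be fitted once on a large learning set and then applied to unseen query-document pairs.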