TIRS: a topological information retrieval system satisfying the requirements of the Waller-Kraft wish list

Most document retrieval systems based on probabilistic models of feature distributions assume random selection of documents for retrieval. The assumptions of these models are met when documents are randomly selected from the database or when retrieving all available documents. A more suitable model for retrieval of a single document assumes that the best document available is to be retrieved first. Models of document retrieval systems assuming random selection and best-first selection are developed and compared under binary independence and two Poisson independence feature distribution models. Under the best-first model, feature discrimination varies with the number of documents in each relevance class in the database. A weight similar to the Inverse Document Frequency weight and consistent with the best-first model is suggested which does not depend on knowledge of the characteristics of relevant documents.