Probabilistic Learning for Information Filtering

In this paper we describe and evaluate a learning model for information filtering that adapts the generalised probabilistic model of Information Retrieval. The model is based on the concept of "uncertainty sampling", a technique that allows for relevance feedback on both relevant and non-relevant documents. The proposed learning model is the core of a prototype information filtering system called ProFile.
