Probabilistic Indexing and Categorisation Tool, Intermediate Prototype

WP4 addresses the automatic categorisation of web documents, based on a description-oriented approach to document indexing. This deliverable reports the progress made since Deliverable 4.1 and describes an Intermediate Prototype that implements parts of the architecture specified in Deliverable 4.1.
