A system for automatic personalized tracking of scientific literature on the Web

We introduce a system, developed as part of the CiteSeer digital library project, for automatically tracking scientific literature that is relevant to a user's research interests. Unlike previous systems that rely on simple keyword matching, CiteSeer can track and recommend topically relevant papers even when keyword-based query profiles fail. This is made possible by a heterogeneous profile that represents user interests. Each profile combines several representations, including content-based relatedness measures. The tracking system is tightly integrated with CiteSeer's search and browsing facilities, and it gives the user considerable flexibility in tuning a profile to better match his or her interests. The software for this system is available, and a sample database is online as a public service.
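
As a rough illustration of what a heterogeneous profile could look like, the sketch below combines two representations: a keyword set and a content-based relatedness score computed against example papers. This is not CiteSeer's implementation. The class name `Profile`, the term-frequency cosine measure, and the `threshold` value are all assumptions chosen for illustration, since the abstract does not specify the representations or the relatedness measures used.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase the text and keep alphabetic runs only (a simplifying assumption).
    return re.findall(r"[a-z]+", text.lower())


def tf_vector(text):
    # Raw term-frequency vector; a real system would likely weight terms (e.g. TF-IDF).
    return Counter(tokenize(text))


def cosine(a, b):
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class Profile:
    """Hypothetical heterogeneous profile: keywords plus example papers."""

    def __init__(self, keywords, example_papers, threshold=0.3):
        self.keywords = {k.lower() for k in keywords}
        self.examples = [tf_vector(p) for p in example_papers]
        self.threshold = threshold  # illustrative cutoff, not taken from the paper

    def keyword_match(self, text):
        # Representation 1: simple keyword matching.
        return bool(self.keywords & set(tokenize(text)))

    def relatedness(self, text):
        # Representation 2: highest content similarity to any example paper.
        vec = tf_vector(text)
        return max((cosine(vec, ex) for ex in self.examples), default=0.0)

    def is_relevant(self, text):
        # A paper is tracked if either representation flags it.
        return self.keyword_match(text) or self.relatedness(text) >= self.threshold


profile = Profile(
    keywords=["citation"],
    example_papers=["neural networks for face detection in digital images"],
)
# No profile keyword appears in this title, but it is close in content
# to the example paper, so the relatedness measure can still flag it.
flagged = profile.is_relevant("face detection with neural networks")
```

The point of the design is that the two representations fail in different ways: the keyword set misses papers that use different vocabulary, while the content measure can still pick them up when they resemble papers the user has already marked as interesting.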
