Hybrid Hill-Climbing and Knowledge-Based Techniques for Intelligent News Filtering

As the Internet grows, the amount of data available to users has risen dramatically, resulting in information overload. This work created an intelligent news filtering system named INFOS (Intelligent News Filtering Organizational System) that reduces the user’s search burden by automatically eliminating Usenet news articles predicted to be irrelevant. The predictions are learned automatically by adapting an internal user model based on features taken from articles and on collaborative features derived from other users. Keyword-based and knowledge-based techniques operate on these features to perform the actual filtering. Knowledge-based systems can analyze input text in detail, but at the cost of computational complexity and difficulty scaling up to large domains. In contrast, statistical and keyword approaches scale up readily but yield a shallower understanding of the input. A hybrid system integrating both approaches improves accuracy over keyword approaches, supports domain knowledge, and retains scalability.

Content Areas: software agents
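The hybrid arrangement described in the abstract can be illustrated with a minimal sketch. It is not the INFOS implementation: the abstract does not specify the update rule, the decision threshold, or how the two techniques are combined, so everything below is an assumption. The sketch assumes a cheap keyword scorer whose per-word weights are nudged by user feedback (loosely in the spirit of the hill-climbing named in the title) and assumes that only articles the keyword scorer cannot decide are handed to a more expensive knowledge-based step. The names `KeywordModel`, `hybrid_filter`, `toy_knowledge_classifier`, and the `margin` parameter are hypothetical.

```python
class KeywordModel:
    """Hypothetical keyword scorer: one weight per word, adjusted by feedback."""

    def __init__(self, step=1.0):
        self.weights = {}
        self.step = step  # size of each incremental adjustment (assumed)

    def score(self, words):
        # Average weight of the article's words; unseen words count as 0.
        if not words:
            return 0.0
        return sum(self.weights.get(w, 0.0) for w in words) / len(words)

    def update(self, words, accepted):
        # Move each word's weight one step toward the user's verdict.
        delta = self.step if accepted else -self.step
        for w in set(words):
            self.weights[w] = self.weights.get(w, 0.0) + delta


def toy_knowledge_classifier(words):
    # Stand-in for a knowledge-based step: map words to concepts and
    # accept if any concept is one the user likes. Both tables are invented.
    concepts = {"puppy": "dog", "hound": "dog"}
    liked_concepts = {"dog"}
    hit = any(concepts.get(w) in liked_concepts for w in words)
    return "accept" if hit else "reject"


def hybrid_filter(words, model, knowledge_classifier, margin=0.5):
    # Confident keyword scores decide immediately; uncertain ones are
    # deferred to the (assumed more costly) knowledge-based classifier.
    s = model.score(words)
    if s >= margin:
        return "accept"
    if s <= -margin:
        return "reject"
    return knowledge_classifier(words)


model = KeywordModel()
model.update(["dog", "training"], accepted=True)
model.update(["stock", "market"], accepted=False)

decisions = [
    hybrid_filter(["dog", "training"], model, toy_knowledge_classifier),
    hybrid_filter(["stock", "prices"], model, toy_knowledge_classifier),
    hybrid_filter(["puppy", "care"], model, toy_knowledge_classifier),
]
print(decisions)
```

In this sketch the first two articles are settled by keyword weights alone, while the third contains no known keywords and is decided by the concept lookup. Under these assumptions, that is how a hybrid could keep most of the keyword method's scalability while still using domain knowledge where keywords fall short.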
