Hybrid Hill-Climbing and Knowledge-Based Methods for Intelligent News Filtering

As the Internet grows, the amount of data available to users has risen dramatically, resulting in information overload. This work presents INFOS (Intelligent News Filtering Organizational System), an intelligent news filtering system that reduces the user's search burden by automatically eliminating Usenet news articles predicted to be irrelevant. The predictions are learned automatically by adapting an internal user model based on features taken from articles and on collaborative features derived from other users. Keyword-based and knowledge-based techniques operate on these features to perform the actual filtering. Knowledge-based systems can analyze input text in detail, but at the cost of computational complexity and difficulty in scaling up to large domains. In contrast, statistical and keyword approaches scale up readily but yield a shallower understanding of the input. A hybrid system integrating both approaches improves accuracy over keyword approaches alone, supports domain knowledge, and retains scalability. More robust word disambiguation would further enhance the system.
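The hybrid idea described in the abstract can be illustrated with a minimal sketch. It assumes a keyword model that scores articles from per-word weights, adjusts those weights incrementally from user feedback, and defers to a slower knowledge-based method only when the keyword score is inconclusive. The class name `KeywordUserModel`, the function `hybrid_filter`, the step size, the threshold, and the update rule are all hypothetical choices made for illustration; the abstract does not specify how INFOS implements any of these.

```python
class KeywordUserModel:
    """Hypothetical keyword-weight user model (not the INFOS implementation)."""

    def __init__(self, step=1.0, threshold=0.5):
        self.weights = {}           # word -> learned interest weight
        self.step = step            # assumed size of each feedback adjustment
        self.threshold = threshold  # assumed confidence cutoff

    def score(self, words):
        """Average learned weight over the article's distinct words."""
        unique = set(words)
        if not unique:
            return 0.0
        return sum(self.weights.get(w, 0.0) for w in unique) / len(unique)

    def feedback(self, words, liked):
        """Nudge each word's weight up or down after the user rates an article."""
        delta = self.step if liked else -self.step
        for w in set(words):
            self.weights[w] = self.weights.get(w, 0.0) + delta

    def classify(self, words):
        """Return 'accept', 'reject', or 'unknown' when the score is inconclusive."""
        s = self.score(words)
        if s >= self.threshold:
            return "accept"
        if s <= -self.threshold:
            return "reject"
        return "unknown"


def hybrid_filter(model, words, knowledge_based):
    """Use the cheap keyword model first; fall back to the deeper method if unsure."""
    label = model.classify(words)
    if label == "unknown":
        # knowledge_based is any callable taking a word list and returning a label
        return knowledge_based(words)
    return label
```

In this sketch, the keyword pass handles the bulk of the articles cheaply, so the more expensive analysis runs only on the uncertain remainder, which is one way a hybrid could keep the scalability the abstract attributes to keyword approaches.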
