A RULE-BASED APPROACH TO INFORMATION RETRIEVAL: SOME RESULTS AND COMMENTS

Richard M. Tong, Daniel G. Shapiro, Brian P. McCune and Jeffrey S. Dean
Advanced Information & Decision Systems
201 San Antonio Circle, Suite 286
Mountain View, CA 94040, USA.

This paper is a report of our early efforts to use a rule-based approach in the information retrieval task. We have developed a prototype system that allows the user to specify his or her retrieval concept as a hierarchy of sub-concepts which are then implemented as a set of production rules. The paper contains a brief description of the system and some of the preliminary testing we have done. In particular, we make some observations on the need for an appropriate language for expressing conceptual queries, and on the interactions between rule formulation and uncertainty representation.

I THE INFORMATION RETRIEVAL PROBLEM

Existing approaches to textual information retrieval suffer from problems of precision and recall, understandability, and scope of applicability. Boolean keyword retrieval systems (such as Lockheed’s DIALOG) operate at a lexical level, and hence ignore much of the available information that is syntactic, semantic, or contextual. The underlying reasoning behind the responses of statistical retrieval systems [4] is difficult to explain to a user in an understandable and intuitive way, and systems that rely on a semantic understanding [5] must severely restrict the style and content of the natural language in the documents.

In the near future, large online document repositories will be made available via computer networks to relatively naive computer users. In this context, it is important that future retrieval systems possess the following attributes:

(1) Queries should be posed at the user’s own conceptual level, using his or her vocabulary of concepts, and without requiring complex programming.

(2) The system should be able to provide partial matching of queries to documents, thereby acknowledging the inherent imprecision in the concept of a relevant document.

(3) The number of documents retrieved should be dependent upon the needs of the user (e.g., (4)