Automatic construction of rule-based trees for conceptual retrieval

Many intelligent retrieval approaches have been studied to bridge the terminological gap between the way in which users specify their information needs and the way in which queries are expressed. One such approach, RUBRIC (RUle-Based Retrieval of Information by Computer), uses production rules to capture user query concepts (or topics). A set of related production rules is represented as an AND/OR tree, called a rule-based tree. A main problem with this approach is how to construct rules that capture user query concepts. This paper provides a logical framework that supplies the semantics needed to define rules for user query concepts, and proposes a way to construct rule-based trees automatically from typical thesauri. Experiments on small collections with a domain-specific thesaurus show that the automatically constructed rules achieve higher precision than hand-made rules.
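To make the notion of a rule-based tree concrete, the following is a minimal sketch of an AND/OR tree evaluated against a document's terms. It is an illustration under stated assumptions, not the construction or scoring method of this paper: the names `Node` and `score`, the rule weights, the min/max combination of child scores, and the example concept are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    """One node of a hypothetical rule-based (AND/OR) tree."""
    kind: str                      # "AND", "OR", or "TERM"
    term: str = ""                 # used only when kind == "TERM"
    weight: float = 1.0            # assumed rule weight in [0, 1]
    children: List["Node"] = field(default_factory=list)


def score(node: Node, doc_terms: Set[str]) -> float:
    """Score a document against a tree.

    Assumption: AND takes the minimum of its children's scores,
    OR takes the maximum, and each node scales the result by its weight.
    """
    if node.kind == "TERM":
        base = 1.0 if node.term in doc_terms else 0.0
    elif node.kind == "AND":
        base = min(score(child, doc_terms) for child in node.children)
    elif node.kind == "OR":
        base = max(score(child, doc_terms) for child in node.children)
    else:
        raise ValueError(f"unknown node kind: {node.kind}")
    return node.weight * base


# Hypothetical concept: a document is about "retrieval" if it contains
# that term, or (with lower confidence) both "information" and "search".
tree = Node("OR", children=[
    Node("TERM", term="retrieval"),
    Node("AND", weight=0.8, children=[
        Node("TERM", term="information"),
        Node("TERM", term="search"),
    ]),
])

print(score(tree, {"retrieval"}))
print(score(tree, {"information", "search"}))
print(score(tree, {"information"}))
```

In this sketch, each OR branch plays the role of an alternative rule for the same concept, which is the kind of structure that thesaurus relations (for example, synonyms or narrower terms) could plausibly supply.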