On Modeling of Concept Based Retrieval in Generalized Vector Spaces

One of the main issues in information retrieval is bridging the terminological gap between the way users specify their information needs and the way queries are expressed. One approach to this problem, Rule Based Information Retrieval by Computer (RUBRIC), uses production rules to capture user query concepts (or topics). In RUBRIC, a set of related production rules is represented as an AND/OR tree, and the retrieval output is determined by Boolean evaluation of that tree. However, because Boolean evaluation ignores term-term associations unless they are explicitly represented in the tree, the terminological gap between users' queries and their information needs can remain. To address this, we adopt the generalized vector space model (GVSM), in which term-term associations are well established, and extend the RUBRIC model on the basis of GVSM. Experiments were performed on several variations of the extended RUBRIC model, and the results were compared with those of the original RUBRIC model in terms of recall and precision.