Propositional Logic Representations for Documents and Queries: A Large-Scale Evaluation

Expressive power is a potential source of benefits for Information Retrieval. Indeed, a number of works have traditionally been devoted to defining models able to manage structured documents. Similarly, many researchers have looked at query formulation and proposed different methods to generate structured queries. Nevertheless, few attempts have addressed the combination of expressive documents and expressive queries and its effects on retrieval performance. This is mostly due to the lack of a coherent and expressive framework in which both documents and queries can be handled in a homogeneous and efficient way. In this work we aim to fill this gap. We test the impact of logical representations for documents and queries in a large-scale evaluation. The experiments clearly show that, under the same conditions, the use of logical representations for both documents and queries leads to significant improvements in retrieval performance. Moreover, the overall performance results show that logic-based approaches can be competitive in the field of Information Retrieval.
