Using Supertags in Document Filtering: the Effect of Increased Context on Information Retrieval Effectiveness

The syntactic information latent in any coherent text can be exploited to overcome some inadequacies of keyword-based retrieval and make information retrieval more effective. We have previously demonstrated quantitatively how syntactic information is useful in filtering out irrelevant documents. We have implemented a system which exploits a rich syntactic representation of supertags in a flexible manner to filter documents for relevance. The system has been tested on a large collection of newswire sentences, and achieves recall and precision figures of 88% and 97%, respectively, for filtering out irrelevant documents. Its performance and modularity make it a promising postprocessing addition to any Information Retrieval system. In this paper we examine how the performance of this system is affected by varying the context provided to the system, and show that the experimentally determined optimal size of the context validates linguistic intuitions.