Searching the Web with Server-side Filtering of Irrelevant Information

Abstract

Even experienced users of IR systems encounter a high degree of frustration when searching for information on the World Wide Web, in part because current search engines concentrate on speed and coverage at the expense of precision. In this paper, we describe an approach to increasing the precision of retrieval based on filtering out irrelevant material. Potentially relevant matches obtained from a standard Web search engine are filtered using, for example, augmented patterns derived from the syntactic structure inherent in natural language text. We argue that the performance of these and other methods of filtering for IR can be improved through server-side scripting, a concept that has not yet been exploited. We describe an implementation of such a system and discuss issues that arise out of this model of improving IR. We conclude with a discussion of areas where this mode of filtering is most appropriate.