WebGlimpse: combining browsing and searching

The two paradigms of searching and browsing are currently almost always used separately. One can either look at the library card catalog or browse the shelves; one can either search large WWW sites (or the whole web) or browse page by page. In this paper we describe a software tool we developed, called WebGlimpse, that combines the two paradigms. It allows a search to be limited to a neighborhood of the current document. WebGlimpse automatically analyzes collections of web pages and computes those neighborhoods at indexing time. With WebGlimpse, users can browse the same pages at will; they can also jump from any page, through a search, to "close-by" pages related to their needs. In a sense, our combined paradigm allows users to browse using hypertext links that are constructed on the fly through a neighborhood search. The design of WebGlimpse concentrated on four goals: fast search, efficient indexing (in terms of both time and space), flexible facilities for defining neighborhoods, and non-wasteful use of Internet resources. Our implementation was geared towards the World-Wide Web, but the general design is applicable to any large-scale information base. We believe that the concept of combining browsing and searching is very powerful and deserves much more attention. Further information about WebGlimpse, including the complete source code, documentation, demos, and examples of use, can be found at http://glimpse.cs.arizona.edu/webglimpse/.
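The neighborhood idea can be illustrated with a minimal sketch. This is not WebGlimpse's code: it assumes that a neighborhood is "all pages reachable within a fixed number of links", which is only one of many possible definitions, and it stands in a plain substring match for the real search engine. The function names, the `max_hops` parameter, and the sample pages are all hypothetical.

```python
from collections import deque

def compute_neighborhoods(links, max_hops=2):
    """Precompute, for every page, the set of pages within max_hops links.

    Assumption: hop distance defines the neighborhood. This step is done
    once, at indexing time, so that searches do not walk the link graph.
    """
    neighborhoods = {}
    for start in links:
        seen = {start}
        queue = deque([(start, 0)])
        while queue:
            page, depth = queue.popleft()
            if depth == max_hops:
                continue  # do not expand beyond the hop limit
            for target in links.get(page, ()):
                if target not in seen:
                    seen.add(target)
                    queue.append((target, depth + 1))
        neighborhoods[start] = seen
    return neighborhoods

def neighborhood_search(query, current_page, neighborhoods, texts):
    """Return pages in the current page's neighborhood whose text matches.

    Assumption: a case-insensitive substring test stands in for a real index.
    """
    needle = query.lower()
    candidates = neighborhoods.get(current_page, ())
    return sorted(p for p in candidates if needle in texts.get(p, "").lower())

# Hypothetical five-page site: page name -> outgoing links.
links = {"home": ["a", "b"], "a": ["c"], "b": [], "c": ["d"], "d": []}
texts = {
    "home": "Welcome page",
    "a": "Glimpse index format",
    "b": "Contact",
    "c": "Building the index",
    "d": "Index internals",
}

neighborhoods = compute_neighborhoods(links, max_hops=2)
hits_from_home = neighborhood_search("index", "home", neighborhoods, texts)
hits_from_c = neighborhood_search("index", "c", neighborhoods, texts)
```

The same query returns different results depending on the page it is issued from, which is the sense in which the search results act as links built on the fly: from "home" the page "d" is three hops away and is left out, while from "c" it is included.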