Searching Distributed Hypermedia

One of the main problems with the WWW is resource discovery, defined as “finding the right information” [FP94]. The current trend in addressing this problem is to create indexes, as in Archie, Veronica, WWW, Gifford’s Semantic File System, Gopher, and WAIS [Bra94, Gil94]. In general, these indexes are kept at a single site or at a rather limited number of sites, so the sites that provide the index facilities are often overloaded. In addition, building the indexes requires retrieving complete files from an enormous number of sites, which overloads the network. Finally, as the number of documents keeps growing, constructing the indexes takes so long that they are permanently out of date. Harvest [BDMS94] is a proposal to overcome these problems. Index information is gathered in a decentralized way by Gatherers and sent to a large number of index servers, called Brokers. Placing the Gatherers at the servers that provide the information reduces both network and server load.
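The division of labour between Gatherers and Brokers can be illustrated with a minimal sketch. It is not Harvest's actual implementation or summary format: the class names `Gatherer` and `Broker`, the helper `summarize`, the keyword-based summary, and the example host names are all assumptions made for illustration. The point it tries to show is that each document is reduced to a small summary at the site that holds it, so only the summary, and not the complete file, travels to the index servers.

```python
import re
from collections import defaultdict


def summarize(text, max_terms=8):
    """Reduce a document to a few distinct terms (hypothetical stand-in for a real summary)."""
    terms = []
    for word in re.findall(r"[a-z]+", text.lower()):
        # Keep longer words only, in order of first appearance.
        if len(word) > 3 and word not in terms:
            terms.append(word)
    return terms[:max_terms]


class Broker:
    """Hypothetical index server: maps each term to the URLs whose summaries contain it."""

    def __init__(self):
        self.index = defaultdict(set)

    def receive(self, url, terms):
        for term in terms:
            self.index[term].add(url)

    def query(self, term):
        return sorted(self.index.get(term.lower(), set()))


class Gatherer:
    """Hypothetical gatherer that runs at the site providing the documents."""

    def __init__(self, site, documents):
        self.site = site
        self.documents = documents  # maps a local path to the document text

    def publish(self, brokers):
        """Summarize every local document and send only the summary to each broker."""
        for path, text in self.documents.items():
            url = f"http://{self.site}{path}"
            terms = summarize(text)
            for broker in brokers:
                broker.receive(url, terms)


# Two provider sites feed one broker; the broker never sees the full documents.
broker = Broker()
Gatherer("a.example", {"/doc1": "Hypermedia systems link distributed documents."}).publish([broker])
Gatherer("b.example", {"/doc2": "Index servers answer queries about documents."}).publish([broker])
print(broker.query("documents"))
```

Passing a list of several `Broker` objects to `publish` corresponds to the idea of sending the gathered information to a large number of index servers, which spreads the query load over more than one site.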