Lycos: design choices in an Internet search service

One of the enabling technologies of the World Wide Web, along with browsers, domain name servers, and hypertext markup language, is the search engine. Although the Web contains over 100 million pages of information, those millions of pages are useless if you cannot find the pages you need. All major Web search engines operate the same way: a gathering program explores the hyperlinked documents of the Web, foraging for Web pages to index. These pages are stockpiled by storing them in some kind of database or repository. Finally, a retrieval program takes a user query and creates a list of links to Web documents matching the words, phrases, or concepts in the query. Although the retrieval program itself is correctly called a search engine, by popular usage the term now means a database combined with a retrieval program. For example, the Lycos search engine comprises the Lycos Catalog of the Internet and the Pursuit retrieval program. This paper describes the Lycos system for collecting, storing, and retrieving information about pages on the Web. After outlining the history and precursors of the Lycos system, the paper discusses some of the design choices made in building this Web indexer and touches briefly on the economic issues involved in working with very large retrieval systems.