Web Search -- Your Way: Improving Web searching with user preferences

We describe a metasearch engine architecture, in use at NEC Research Institute, that allows users to provide preferences in the form of an information need category. This extra information is used to direct the search process, providing more valuable results than considering only the query would. Using our architecture, identical keyword queries may be sent to different search engines, and results may be scored differently for different users. Unlike typical search (or metasearch) engines, our architecture considers the user’s information need when determining which sources are queried, how queries are modified for those sources, and how the retrieved results are scored. Each of these can vary independently of the keyword query.

The Web is a very large collection of heterogeneous documents; however, Web pages are unlike typical documents in traditional databases. Pages can be active (animations, Java), can be automatically generated in real time (current stock prices or weather information), and may contain multimedia (sound or video). The authors of Web pages have very diverse backgrounds, knowledge, cultures, and aims. Furthermore, the availability of metadata is inconsistent (for example, some authors use the HTML heading tags to denote headings and subheadings in their text, while others use different methods, such as the HTML font tags or images). Efforts such as XML and Dublin Core aim to improve metadata; however, it seems unlikely that all Web page authors will adhere to complex standards. Only about one