SE-LEGO: creating metasearch engines on demand

Extended Abstract

As a system that provides unified access to multiple existing search systems, a metasearch engine can relieve ordinary users of the formidable task of identifying useful sources and searching them individually. At present, the largest metasearch engines, such as ProFusion (www.profusion.com) and SavvySearch (www.search.com), can connect to about 1,000 search engines. This means that only a small fraction of the information sources on the Web, including both the Surface Web and the Deep Web, are connected, as the number of such sources is estimated to be on the order of hundreds of thousands [1]. Most of these Websites have their own search capabilities and provide search interfaces. Many of them provide high-quality information that is frequently queried by specialists and researchers in particular fields, yet the major metasearch engines usually do not connect to these specialized Websites. Currently, building a metasearch engine is an expensive and labor-intensive job that requires diverse expertise. As a result, it is difficult for an ordinary Web user to create a metasearch engine based on the search engines of the user's choice. Some metasearch engine companies (e.g., ProFusion) allow users to build customized metasearch engines, but only search engines in a pre-compiled list can be used, because the capability to connect to each of these search engines needs to be established in advance.