Paper information: Automatic information discovery from the "invisible Web"

A large amount of online information resides on the "invisible Web": Web pages that are generated dynamically from databases and other data sources hidden from the user. These pages are not indexed under a static URL but are generated when queries are made through a search interface (a specialized search engine). In this paper, we propose a system that automatically uses these specialized engines to find information on the invisible Web. We describe the overall architecture and process, from obtaining the search engines to picking the right engines to query. Experiments show that the system can find information that traditional search engines do not find.
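The abstract does not say how the "right engines" are picked, so the following is only a minimal sketch of the general idea of routing a query to specialized engines. It assumes each engine is described by a short keyword profile and scores engines by term overlap with the query. The engine names, the profiles, and the functions `score` and `pick_engines` are all hypothetical and are not taken from the paper.

```python
# Hypothetical engine profiles: engine name -> descriptive keywords.
# Names and keywords are illustrative assumptions, not from the paper.
ENGINE_PROFILES = {
    "medline-search": "medicine disease drug clinical health",
    "patent-search": "patent invention claim inventor filing",
    "recipe-search": "recipe cooking ingredient food dish",
}


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def score(query, profile):
    """Count how many query terms occur in the engine's keyword profile."""
    profile_terms = set(tokenize(profile))
    return sum(1 for term in tokenize(query) if term in profile_terms)


def pick_engines(query, profiles, top_k=2):
    """Return up to top_k engine names with a non-zero score, best first."""
    scored = [(score(query, profile), name) for name, profile in profiles.items()]
    matching = [(s, name) for s, name in scored if s > 0]
    # Sort by descending score; break ties by name for a stable result.
    matching.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in matching[:top_k]]


if __name__ == "__main__":
    print(pick_engines("drug for heart disease", ENGINE_PROFILES))
```

A real system would need richer engine descriptions than a handful of keywords, and it would still have to submit the query to each chosen search interface and merge the results, which this sketch leaves out.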

Hui Chen | King-Ip Lin
