Using the Web Efficiently: Mobile Crawlers

Search engines have become essential tools for navigating the Web. To provide powerful search facilities, they maintain comprehensive indices of the documents available on the Web. These indices are created and maintained by Web crawlers, which recursively traverse and download Web pages on behalf of search engines; analysis of the collected pages takes place only after the data has been downloaded. In this research, we propose an alternative, more efficient approach to building Web indices based on mobile crawlers. Our crawlers migrate to the source(s) where the data resides and filter out unwanted data locally before transferring the remainder back to the search engine. This approach to Web crawling is particularly well suited to implementing so-called “smart” crawling algorithms, which determine an efficient crawling path based on the contents of the Web pages visited so far. To demonstrate the viability of our approach, we have built a prototype mobile crawling system within the University of Florida Intranet.
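To make the idea of local filtering and content-driven path selection concrete, the following is a minimal sketch, not the prototype described in this work: a crawler loop that one could imagine running at the data source, keeping only pages that match the search engine's keywords, prioritizing out-links found on relevant pages, and compressing the filtered result before shipping it back. The function name, the keyword-count relevance heuristic, and all parameters are illustrative assumptions.

```python
import gzip
import heapq
import json
import re
from urllib.parse import urljoin
from urllib.request import urlopen


def crawl_and_filter(seed_urls, keywords, max_pages=50):
    """Hypothetical sketch of a mobile crawler's local filtering loop.

    Runs close to the Web server being indexed: pages are downloaded over
    the local network, only pages mentioning the given keywords are kept,
    and links found on relevant pages are crawled first (a simple "smart"
    crawling heuristic). Only the filtered, compressed result is returned
    for transfer back to the search engine.
    """
    # Priority queue of (negative relevance of referring page, URL);
    # more relevant referrers are popped first.
    frontier = [(0, url) for url in seed_urls]
    heapq.heapify(frontier)
    seen, kept = set(seed_urls), {}

    while frontier and len(kept) < max_pages:
        _, url = heapq.heappop(frontier)
        try:
            html = urlopen(url, timeout=5).read().decode("utf-8", "replace")
        except OSError:
            continue

        # Local analysis: count keyword hits instead of shipping raw HTML.
        hits = sum(html.lower().count(k.lower()) for k in keywords)
        if hits > 0:
            kept[url] = hits  # retain only pages the index actually needs

        # Extract out-links and rank them by the relevance of this page.
        for link in re.findall(r'href="([^"#]+)"', html):
            nxt = urljoin(url, link)
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (-hits, nxt))

    # Compress the filtered index data before sending it home.
    return gzip.compress(json.dumps(kept).encode("utf-8"))
```

In this sketch the bandwidth saving comes from two places: irrelevant pages are discarded at the source rather than transferred, and the data that is transferred is reduced to compressed index information rather than raw HTML.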