Design and implementation of a high-performance distributed Web crawler
[1] Sergey Brin, et al. The Anatomy of a Large-Scale Hypertextual Web Search Engine, 1998, Comput. Networks.
[2] Sriram Raghavan, et al. WebBase: a repository of Web pages, 2000, Comput. Networks.
[3] Steve Chien, et al. Approximating Aggregate Queries about Web Pages via Random Walks, 2000, VLDB.
[4] Hector Garcia-Molina, et al. The Evolution of the Web and Implications for an Incremental Crawler, 2000, VLDB.
[5] Marc Najork, et al. On near-uniform URL sampling, 2000, Comput. Networks.
[6] Marco Gori, et al. Focused Crawling Using Context Graphs, 2000, VLDB.
[7] Andrew McCallum, et al. Using Reinforcement Learning to Spider the Web Efficiently, 1999, ICML.
[8] Marc Najork, et al. Measuring Index Quality Using Random Walks on the Web, 1999, Comput. Networks.
[9] Sriram Raghavan, et al. Crawling the Hidden Web, 2001, VLDB.
[10] Marc Najork, et al. Breadth-First Search Crawling Yields High-Quality Pages, 2001.
[11] Sriram Raghavan, et al. Searching the Web, 2001, ACM Trans. Internet Techn.
[12] Ian H. Witten, et al. Managing Gigabytes: Compressing and Indexing Documents and Images, 1999.
[13] Martin van den Berg, et al. Focused Crawling: A New Approach to Topic-Specific Web Resource Discovery, 1999, Comput. Networks.
[14] Michael Kaufmann, et al. Derandomizing algorithms for routing and sorting on meshes, 1994, SODA '94.
[15] Marc Najork, et al. Mercator: A scalable, extensible Web crawler, 1999, World Wide Web.
[16] Hector Garcia-Molina, et al. Synchronizing a database to improve freshness, 2000, SIGMOD '00.
[17] Marc Najork, et al. Breadth-first crawling yields high-quality pages, 2001, WWW '01.
[18] Torsten Suel, et al. Compressing the graph structure of the Web, 2001, Proceedings DCC 2001. Data Compression Conference.
[19] Andrei Z. Broder, et al. The Connectivity Server: Fast Access to Linkage Information on the Web, 1998, Comput. Networks.
[20] Jerome Talim, et al. Controlling the robots of Web search engines, 2001, SIGMETRICS '01.
[21] Jon M. Kleinberg, et al. Mining the Web's Link Structure, 1999, Computer.
[22] Soumen Chakrabarti, et al. Distributed Hypertext Resource Discovery Through Examples, 1999, VLDB.
[23] Hector Garcia-Molina, et al. Efficient Crawling Through URL Ordering, 1998, Comput. Networks.