Parallel crawler architecture and web page change detection
[1] Marco Gori,et al. Focused Crawling Using Context Graphs , 2000, VLDB.
[2] Frank M. Shipman,et al. Perception of content, structure, and presentation changes in Web-based hypertext , 2001, Hypertext.
[3] Serge Abiteboul,et al. Detecting changes in XML documents , 2002, Proceedings 18th International Conference on Data Engineering.
[4] Hector Garcia-Molina,et al. Parallel crawlers , 2002, WWW.
[5] Hector Garcia-Molina,et al. Effective page refresh policies for Web crawlers , 2003, TODS.
[6] 廷冕 李,et al. On Application (応用について) , 1981 .
[7] Zhen Liu,et al. Optimal Robot Scheduling for Web Search Engines , 1998 .
[8] Daniel Rocco,et al. Efficient web change monitoring with page digest , 2004, WWW Alt. '04.
[9] Daniel Rocco,et al. Page Digest for large-scale Web services , 2003, IEEE International Conference on E-Commerce, 2003. CEC 2003.
[10] L. Khan,et al. Change Detection of XML Documents Using Signatures , 2002 .
[11] Hector Garcia-Molina,et al. Synchronizing a database to improve freshness , 2000, SIGMOD '00.
[12] George Samaras,et al. Distributed location aware web crawling , 2004, WWW Alt. '04.
[13] David J. DeWitt,et al. X-Diff: an effective change detection algorithm for XML documents , 2003, Proceedings 19th International Conference on Data Engineering (Cat. No.03CH37405).
[14] Calton Pu,et al. WebCQ-detecting and delivering information changes on the web , 2000, CIKM '00.
[15] Edward A. Fox,et al. Proceedings of the 1st ACM/IEEE-CS joint conference on Digital libraries , 2001 .
[16] Martin van den Berg,et al. Focused Crawling: A New Approach to Topic-Specific Web Resource Discovery , 1999, Comput. Networks.
[17] Minoru Uehara,et al. Distributed information retrieval by using cooperative meta search engines , 2001, Proceedings 21st International Conference on Distributed Computing Systems Workshops.
[18] Hector Garcia-Molina,et al. Estimating frequency of change , 2003, TOIT.
[19] Hector Garcia-Molina,et al. Synchronizing a database to improve freshness , 2000, SIGMOD 2000.
[20] Christopher Olston,et al. What's new on the web?: the evolution of the web from a search engine perspective , 2004, WWW '04.
[21] Curtis E. Dyreson,et al. Schema-Less, Semantics-Based Change Detection for XML Documents , 2004, WISE.
[22] Sergey Brin,et al. The Anatomy of a Large-Scale Hypertextual Web Search Engine , 1998, Comput. Networks.
[23] Marc Najork,et al. A large-scale study of the evolution of Web pages , 2003, WWW '03.
[24] Divakar Yadav,et al. Architecture for Parallel Crawling and Algorithm for Change Detection in Web Pages , 2007 .
[25] Marc Najork,et al. Mercator: A scalable, extensible Web crawler , 1999, World Wide Web.
[26] A. K. Sharma,et al. Change Detection in Web Pages , 2007, 10th International Conference on Information Technology (ICIT 2007).
[27] David Eichmann,et al. The RBSE spider — Balancing effective search against Web load , 1994, WWW Spring 1994.
[28] Hector Garcia-Molina,et al. Efficient Crawling Through URL Ordering , 1998, Comput. Networks.
[29] Frank M. Shipman,et al. Managing change on the web , 2001, JCDL '01.
[30] Philip S. Yu,et al. Optimal crawling strategies for web search engines , 2002, WWW '02.
[31] Dustin Boswell. Distributed High-performance Web Crawlers : A Survey of the State of the Art , 2003 .