The Viúva Negra crawler: an experience report
[1] Junghoo Cho, et al. Impact of search engines on page popularity, 2004, WWW '04.
[2] Ricardo A. Baeza-Yates, et al. Crawling the Infinite Web: Five Levels Are Enough, 2004, WAW.
[3] Andrei Z. Broder, et al. Efficient URL caching for world wide web crawling, 2003, WWW '03.
[4] Binzhang Liu. Characterizing Web Response Time, 1998.
[5] Daniel Gomes, et al. Design and Selection Criteria for a National Web Archive, 2006, ECDL.
[6] Tim Berners-Lee, et al. Hypertext Transfer Protocol -- HTTP/1.0, 1993.
[7] Edleno Silva de Moura, et al. Detecção de Réplicas Utilizando Conteúdo e Estrutura, 2005, SBBD.
[8] Sebastiano Vigna, et al. UbiCrawler: a scalable fully distributed Web crawler, 2004, Softw. Pract. Exp.
[9] Roy T. Fielding, et al. Uniform Resource Identifier (URI): Generic Syntax, 2005, RFC.
[10] Steffen Staab, et al. On deep annotation, 2003, WWW '03.
[11] Sergey Brin, et al. The Anatomy of a Large-Scale Hypertextual Web Search Engine, 1998, Comput. Networks.
[12] Andrei Z. Broder, et al. A Comparison of Techniques to Find Mirrored Hosts on the WWW, 2000, IEEE Data Eng. Bull.
[13] Daniel Gomes, et al. Managing duplicates in a web archive, 2006, SAC.
[14] Mário J. Silva, et al. The WebCAT framework automatic generation of meta-data for Web resources, 2005, The 2005 IEEE/WIC/ACM International Conference on Web Intelligence (WI'05).
[15] Craig E. Wills. Examining the Cacheability of User-Requested Web Resources, 1999.
[16] Allison Woodruff, et al. An Investigation of Documents from the World Wide Web, 1996, Comput. Networks.
[17] Linda Dailey Paulson, et al. Building Rich Web Applications with Ajax, 2005, Computer.
[18] Kevin S. McCurley, et al. Ranking the web frontier, 2004, WWW '04.
[19] Mário J. Silva, et al. Language identification in web pages, 2005, SAC '05.
[20] Torsten Suel, et al. Design and implementation of a high-performance distributed Web crawler, 2002, Proceedings 18th International Conference on Data Engineering.
[21] Daniel Gomes, et al. Versus: A Web Repository, 2002.
[22] Sriram Raghavan, et al. Crawling the Hidden Web, 2001, VLDB.
[23] Hector Garcia-Molina, et al. Efficient Crawling Through URL Ordering, 1998, Comput. Networks.
[24] Elliot Berk, et al. JLex: A lexical analyzer generator for Java, 2004.
[25] Marc Najork, et al. Mercator: A scalable, extensible Web crawler, 1999, World Wide Web.
[26] Daniel Gomes, et al. Characterizing a national community web, 2005, TOIT.
[27] Marc Abrams, et al. Analysis of Sources of Latency in Downloading Web Pages, 2000, WebNet.
[28] Hector Garcia-Molina, et al. Crawler-Friendly Web Servers, 2000, PERV.
[29] Phillip Hallam-Baker, et al. Session Identification URI, 1996.
[30] Vivian Cothey, et al. Web-crawling reliability, 2004, J. Assoc. Inf. Sci. Technol.
[31] Anja Feldmann, et al. Rate of Change and other Metrics: a Live Study of the World Wide Web, 1997, USENIX Symposium on Internet Technologies and Systems.
[32] Hector Garcia-Molina, et al. Parallel crawlers, 2002, WWW.
[33] Sriram Raghavan, et al. Stanford WebBase components and applications, 2006, TOIT.
[34] David Barr, et al. Common DNS Operational and Configuration Errors, 1996, RFC.
[35] Mike Thelwall, et al. A free database of university web links: data collection issues, 2002.
[36] Martin van den Berg, et al. Focused Crawling: A New Approach to Topic-Specific Web Resource Discovery, 1999, Comput. Networks.
[37] Ricardo A. Baeza-Yates, et al. On the image content of the Chilean Web, 2003, Proceedings of the IEEE/LEOS 3rd International Conference on Numerical Simulation of Semiconductor Optoelectronic Devices (IEEE Cat. No.03EX726).
[38] Hongfei Yan, et al. Architectural design and evaluation of an efficient web-crawling system, 2001, Proceedings 15th International Parallel and Distributed Processing Symposium. IPDPS 2001.
[39] Rajeev Motwani, et al. Stratified Planning, 2009, IJCAI.
[40] Hanoch Levy, et al. Evaluating web user perceived latency using server side measurements, 2003, Comput. Commun.
[41] Andy Cockburn, et al. What do web users do? An empirical analysis of web use, 2001, Int. J. Hum. Comput. Stud.