Jian Wu | Michael L. Nelson | C. Lee Giles | Yasith Jayawardana | Sampath Jayarathna | Alexander C. Nwala | Gavindya Jayawardena
[1] Yoelle Maarek,et al. The Shark-Search Algorithm. An Application: Tailored Web Site Mapping , 1998, Comput. Networks.
[2] C. Lee Giles,et al. The evolution of a crawling strategy for an academic document search engine: whitelists and blacklists , 2012, WebSci '12.
[3] SangKeun Lee,et al. Novel approaches to crawling important pages early , 2012, Knowledge and Information Systems.
[4] C. Lee Giles,et al. CiteSeer: an automatic citation indexing system , 1998, DL '98.
[5] Richard P. Brent,et al. An Algorithm with Guaranteed Convergence for Finding a Zero of a Function , 1971, Comput. J.
[6] Herbert Van de Sompel,et al. Memento: Time Travel for the Web , 2009, ArXiv.
[7] Frank Shipman,et al. Crawling and Classification Strategies for Generating a Multi-Language Corpus of Sign Language Video , 2019, 2019 ACM/IEEE Joint Conference on Digital Libraries (JCDL).
[8] D. Hochbaum,et al. Analysis of the greedy approach in problems of maximum k‐coverage , 1998.
[9] Marc Najork,et al. Web Crawling , 2010, Found. Trends Inf. Retr.
[10] J-L. Funck Brentano,et al. Intelligent Multimedia Information Retrieval Systems and Management - Volume 1 , 1994.
[11] Geert-Jan Houben,et al. Information Retrieval in Distributed Hypertexts , 1994, RIAO.
[12] Hector Garcia-Molina,et al. Synchronizing a database to improve freshness , 2000, SIGMOD 2000.
[13] Miguel Costa,et al. A Survey on Web Archiving Initiatives , 2011, TPDL.
[14] Matthew Farrell,et al. Web Archiving in the United States - A 2017 Survey , 2014.
[15] Zhen Liu,et al. Optimal Robot Scheduling for Web Search Engines , 1998.
[16] Jon M. Kleinberg,et al. Automatic Resource Compilation by Analyzing Hyperlink Structure and Associated Text , 1998, Comput. Networks.
[17] Hector Garcia-Molina,et al. The Evolution of the Web and Implications for an Incremental Crawler , 2000, VLDB.
[18] Jian Wu,et al. CiteSeerX: 20 years of service to scholarly big data , 2019, AIDR.
[19] Hector Garcia-Molina,et al. Estimating frequency of change , 2003, TOIT.
[20] Hector Garcia-Molina,et al. Efficient Crawling Through URL Ordering , 1998, Comput. Networks.
[21] Mike Thelwall,et al. Graph structure in three national academic Webs: Power laws with anomalies , 2003, J. Assoc. Inf. Sci. Technol..
[22] Martin van den Berg,et al. Focused Crawling: A New Approach to Topic-Specific Web Resource Discovery , 1999, Comput. Networks.
[23] Ricardo A. Baeza-Yates,et al. Scheduling algorithms for Web crawling , 2004, WebMedia and LA-Web, 2004. Proceedings.
[24] Sandeep Pandey,et al. Recrawl scheduling based on information longevity , 2008, WWW.
[25] Hector Garcia-Molina,et al. Synchronizing a database to improve freshness , 2000, SIGMOD '00.
[26] Brad Tofel. ‘Wayback’ for Accessing Web Archives , 2007.
[27] Wallace Koehler,et al. Web page change and persistence - A four-year longitudinal study , 2002, J. Assoc. Inf. Sci. Technol..
[28] Paul N. Bennett,et al. Predicting content change on the web , 2013, WSDM.
[29] Thomas Risse,et al. iCrawl: Improving the Freshness of Web Collections by Integrating Social Web and Focused Web Crawling , 2015, JCDL.