DynWebStats: A Framework for Determining Dynamic and Up-to-date Web Indicators
Adriano M. Pereira | Wagner Meira | Alexandre Barbosa | Israel Guerra | Diogo Marques Santa | Vagner Diniz | Heitor Ganzeli | Marcelo Pitta
[1] Hector Garcia-Molina, et al. Efficient Crawling Through URL Ordering, 1998, Comput. Networks.
[2] Christopher Krügel, et al. Relevant change detection: a framework for the precise extraction of modified and novel web-based content as a filtering technique for analysis engines, 2014, WWW.
[3] Tamal Das, et al. Sequence Estimation over Finite-State Markov Channel via the Expectation Maximization Algorithm, 2007.
[4] George Cybenko, et al. How dynamic is the Web?, 2000, Comput. Networks.
[5] Marc Najork, et al. Web Crawling, 2010, Found. Trends Inf. Retr.
[6] Andrei Z. Broder, et al. Graph structure in the Web, 2000, Comput. Networks.
[7] Ahmed Patel, et al. Empirical evaluation of the link and content-based focused Treasure-Crawler, 2013, Comput. Stand. Interfaces.
[8] Ari Pirkola, et al. Addressing the limited scope problem of focused crawling using a result merging approach, 2010, SAC '10.
[9] Torsten Suel, et al. Design and implementation of a high-performance distributed Web crawler, 2002, Proceedings 18th International Conference on Data Engineering.
[10] Ricardo Baeza-Yates, et al. Um novo retrato da Web brasileira, 2005.
[11] Hector Garcia-Molina, et al. Parallel crawlers, 2002, WWW.
[12] Juliana Freire, et al. An adaptive crawler for locating hidden-Web entry points, 2007, WWW '07.
[13] Anja Feldmann, et al. Rate of Change and other Metrics: a Live Study of the World Wide Web, 1997, USENIX Symposium on Internet Technologies and Systems.
[14] M. Kenward, et al. An Introduction to the Bootstrap, 2007.
[15] Hector Garcia-Molina, et al. Synchronizing a database to improve freshness, 2000, SIGMOD '00.
[16] Ricardo A. Baeza-Yates, et al. Characterization of national Web domains, 2007, TOIT.
[17] Marc Najork, et al. Breadth-first crawling yields high-quality pages, 2001, WWW '01.
[18] Wojciech Rytter, et al. Efficient web searching using temporal factors, 1999, Theor. Comput. Sci.
[19] Vipin Kumar, et al. Discovery of Web Robot Sessions Based on their Navigational Patterns, 2004, Data Mining and Knowledge Discovery.
[20] Sriram Raghavan, et al. Crawling the Hidden Web, 2001, VLDB.
[21] M. Koster, et al. Robots in the Web: threat or treat?, 1995, WWW Spring 1995.
[22] Ricardo A. Baeza-Yates, et al. Scheduling algorithms for Web crawling, 2004, WebMedia and LA-Web, 2004. Proceedings.
[24] Vivian Cothey, et al. Web-crawling reliability, 2004, J. Assoc. Inf. Sci. Technol.
[25] Carlos Castillo, et al. Effective web crawling, 2005, SIGF.
[26] Marco Gori, et al. Focused Crawling Using Context Graphs, 2000, VLDB.
[27] A. K. Sharma, et al. Architecture for Parallel Crawling and Algorithm for Change Detection in Web Pages, 2007, 10th International Conference on Information Technology (ICIT 2007).
[28] Zhenglu Yang, et al. Fires on the Web: Towards Efficient Exploring Historical Web Graphs, 2010, DASFAA.
[29] Jerome Talim, et al. Controlling the robots of Web search engines, 2001, SIGMETRICS '01.
[30] Herbert Van de Sompel, et al. Archival HTTP redirection retrieval policies, 2013, WWW '13 Companion.
[31] Kristinn Sigurðsson. Incremental Crawling with Heritrix, 2010.
[32] Ricardo A. Baeza-Yates, et al. Understanding Content Reuse on the Web: Static and Dynamic Analyses, 2006, WEBKDD.
[33] Daniel Gomes, et al. Design and Selection Criteria for a National Web Archive, 2006, ECDL.