Archiving the web using page changes patterns: a case study
[1] Yuanyuan Zhou,et al. Association Proceedings of the Third USENIX Conference on File and Storage Technologies, San Francisco, CA, USA, March 31 – April 2, 2004, 2004.
[2] Frank M. Shipman,et al. Managing change on the web , 2001, JCDL '01.
[3] Hector Garcia-Molina,et al. Efficient Crawling Through URL Ordering , 1998, Comput. Networks.
[4] George Cybenko,et al. How dynamic is the Web? , 2000, Comput. Networks.
[5] Ramanathan V. Guha,et al. Information diffusion through blogspace , 2004, WWW '04.
[6] Julien Masanès,et al. Web Archiving, 2014, Encyclopedia of Social Network Analysis and Mining.
[7] Wei-Ying Ma,et al. Learning block importance models for web pages , 2004, WWW '04.
[8] Gerhard Weikum,et al. SHARC: Framework for Quality-Conscious Web Archiving , 2009, Proc. VLDB Endow..
[9] Jiawei Han,et al. Frequent pattern mining: current status and future directions , 2007, Data Mining and Knowledge Discovery.
[10] Philip S. Yu,et al. Optimal crawling strategies for web search engines , 2002, WWW '02.
[11] Sandeep Pandey,et al. User-centric Web crawling , 2005, WWW '05.
[12] Stéphane Gançarski,et al. Vi-DIFF: Understanding Web Pages Changes , 2010, DEXA.
[13] Yun Chi,et al. Monitoring RSS Feeds Based on User Browsing Pattern , 2007, ICWSM.
[14] Frank M. Shipman,et al. Application of kalman filters to identify unexpected change in blogs , 2008, JCDL '08.
[15] Jaideep Srivastava,et al. Web usage mining: discovery and applications of usage patterns from Web data , 2000, SKDD.
[16] Susan T. Dumais,et al. Resonance on the web: web dynamics and revisitation patterns , 2009, CHI.
[17] Serge Abiteboul,et al. A First Experience in Archiving the French Web , 2002, ECDL.
[18] Hayato Yamana,et al. Exploiting idle CPU cores to improve file access performance , 2009, ICUIMC '09.
[19] Andreas Rauber,et al. Proceedings of the 10th International Web Archiving Workshop (IWAW 2010), in conjunction with the 7th International Conference on Preservation of Digital Objects (iPRES2010), Vienna, Austria, September 22 - September 23, 2010, 2009.
[20] Yuanyuan Zhou,et al. CP-Miner: A Tool for Finding Copy-paste and Related Bugs in Operating System Code , 2004, OSDI.
[21] Frank M. Shipman,et al. Perception of content, structure, and presentation changes in Web-based hypertext , 2001, Hypertext.
[22] Susan T. Dumais,et al. The web changes everything: understanding the dynamics of web content , 2009, WSDM '09.
[23] Hector Garcia-Molina,et al. Effective page refresh policies for Web crawlers , 2003, TODS.
[24] Daniel Gomes,et al. Managing duplicates in a web archive , 2006, SAC.
[25] Jiawei Han,et al. Discriminative Frequent Pattern Analysis for Effective Classification , 2007, 2007 IEEE 23rd International Conference on Data Engineering.
[26] Hyun-Kyu Cho,et al. Efficient Monitoring Algorithm for Fast News Alerts , 2007, IEEE Transactions on Knowledge and Data Engineering.
[27] George Cybenko,et al. Keeping up with the changing Web , 2000, Computer.
[28] G. Weikum,et al. The SOLAR System for Sharp Web Archiving, 2010.
[29] Marilena Oita,et al. Archiving Data Objects using Web Feeds, 2010.
[30] Michalis Vazirgiannis,et al. Archiving the Greek Web, 2004.
[31] Hector Garcia-Molina,et al. The Evolution of the Web and Implications for an Incremental Crawler , 2000, VLDB.
[32] Stéphane Gançarski,et al. Using visual pages analysis for optimizing web archiving , 2010, EDBT '10.
[33] Julien Masanès. Web Archiving, 2014, Encyclopedia of Social Network Analysis and Mining.
[34] Hector Garcia-Molina,et al. Estimating frequency of change , 2003, TOIT.
[35] Ricardo A. Baeza-Yates,et al. Scheduling algorithms for Web crawling , 2004, WebMedia and LA-Web, 2004. Proceedings.
[36] Jenny Edwards,et al. An adaptive model for optimizing performance of an incremental web crawler , 2001, WWW '01.
[37] Sandeep Pandey,et al. Recrawl scheduling based on information longevity , 2008, WWW.
[38] Gerhard Weikum,et al. “Catch me if you can”: visual Analysis of Coherence Defects in Web Archiving, 2009.
[39] Risto Vaarandi,et al. A data clustering algorithm for mining patterns from event logs , 2003, Proceedings of the 3rd IEEE Workshop on IP Operations & Management (IPOM 2003) (IEEE Cat. No.03EX764).
[40] Wei-Ying Ma,et al. VIPS: a Vision-based Page Segmentation Algorithm, 2003.
[41] Frank M. Shipman,et al. Longitudinal study of changes in blogs , 2007, JCDL '07.
[42] Gerhard Weikum,et al. Data quality in web archiving , 2009, WICOW.
[43] Kanak Saxena,et al. Significant Interval and Frequent Pattern Discovery in Web Log Data , 2010, ArXiv.
[44] Myra Spiliopoulou,et al. Monitoring the Evolution of Web Usage Patterns , 2003, EWMF.
[45] Mong-Li Lee,et al. Efficient Mining of XML Query Patterns for Caching , 2003, VLDB.