Intelligent crawling of web applications for web archiving
[1] Yida Wang, et al. iRobot: an intelligent crawler for web forums, 2008, WWW.
[2] Kristinn Sigurðsson. Incremental Crawling with Heritrix, 2010.
[3] Lars Littig. Classifying web sites, 2007, WWW '07.
[4] Arnaud Sahuguet, et al. Building Light-Weight Wrappers for Legacy Web Data-Sources Using W4F, 1999, VLDB.
[5] Tok Wang Ling, et al. A rule-based query language for HTML, 2001, Proceedings Seventh International Conference on Database Systems for Advanced Applications. DASFAA 2001.
[6] Hao Zhang, et al. Path sharing and predicate evaluation for high-performance XML filtering, 2003, TODS.
[7] Julien Masanès, et al. Web Archiving, 2014, Encyclopedia of Social Network Analysis and Mining.
[8] Federico Girosi, et al. An improved training algorithm for support vector machines, 1997, Neural Networks for Signal Processing VII. Proceedings of the 1997 IEEE Signal Processing Society Workshop.
[9] Timothy W. Finin, et al. SVMs for the Blogosphere: Blog Identification and Splog Detection, 2006, AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs.
[10] David Carmel, et al. The connectivity sonar: detecting site functionality by structural patterns, 2003, HYPERTEXT '03.
[11] Hiroyuki Kitagawa, et al. Wraplet: Wrapping Your Web Contents with a Lightweight Language, 2007, 2007 Third International IEEE Conference on Signal-Image Technologies and Internet-Based System.
[12] Yan Guo, et al. Board Forum Crawling: A Web Crawling Method for Web Forum, 2006, 2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI 2006 Main Conference Proceedings)(WI'06).
[13] Bernhard E. Boser, et al. A training algorithm for optimal margin classifiers, 1992, COLT '92.
[14] Martin van den Berg, et al. Focused Crawling: A New Approach to Topic-Specific Web Resource Discovery, 1999, Comput. Networks.
[15] I-Chen Wu, et al. ON DESIGN OF BROWSER-ORIENTED DATA EXTRACTION SYSTEM AND THE PLUG-INS, 2010.
[16] Juliana Freire, et al. An adaptive crawler for locating hidden-Web entry points, 2007, WWW '07.
[17] Christoph Lindemann, et al. Coarse-grained classification of web sites by their structural properties, 2006, WIDM '06.
[18] Andrew Tomkins, et al. The volume and evolution of web page templates, 2005, WWW '05.
[19] Tim Furche, et al. OXPath, 2011, Proc. VLDB Endow.
[20] Julien Masanès. Web Archiving, 2014, Encyclopedia of Social Network Analysis and Mining.
[21] Dennis Shasha, et al. WebFilter: A High-throughput XML-based Publish and Subscribe System, 2001, VLDB.
[22] Juliana Freire, et al. Searching for Hidden-Web Databases, 2005, WebDB.
[23] Ruihua Song, et al. Joint optimization of wrapper generation and template detection, 2007, KDD '07.
[24] Jeffrey F. Naughton, et al. Declarative Information Extraction Using Datalog with Embedded Extraction Predicates, 2007, VLDB.