A Scalable Approach to Harvest Modern Weblogs
[1] Eric C. Jensen, et al. Metadata Encoding and Transmission Standard, 2009, Encyclopedia of Database Systems.
[2] Steffen Staab, et al. SXPath - Extending XPath towards Spatial Querying on Web Documents, 2010, Proc. VLDB Endow.
[3] Georg Gottlob, et al. The Lixto data extraction project: back and forth between theory and practice, 2004, PODS.
[4] Lance Porter, et al. Uses and Perceptions of Blogs: A Report on Professional Journalists and Journalism Educators, 2007.
[5] J. Wiest, et al. The Arab Spring| Social Media in the Egyptian Revolution: Reconsidering Resource Mobilization Theory, 2011.
[6] Craig A. Knoblock, et al. Hierarchical Wrapper Induction for Semistructured Information Sources, 2004, Autonomous Agents and Multi-Agent Systems.
[7] Vladimir I. Levenshtein, et al. Binary codes capable of correcting deletions, insertions, and reversals, 1965.
[8] Nikos Kasioumis, et al. Towards building a blog preservation platform, 2014, World Wide Web.
[9] Douglas C. Schmidt, et al. Active object: an object behavioral pattern for concurrent programming, 1996.
[10] Robert Hundt, et al. Loop Recognition in C++/Java/Go/Scala, 2011.
[11] Marilena Oita, et al. Archiving Data Objects using Web Feeds, 2010.
[12] Christoph Meinel, et al. Mapping the Blogosphere--Towards a Universal and Scalable Blog-Crawler, 2011, 2011 IEEE Third Int'l Conference on Privacy, Security, Risk and Trust and 2011 IEEE Third Int'l Conference on Social Computing.
[13] L. R. Dice. Measures of the Amount of Ecologic Association Between Species, 1945.
[14] Kay G. Johnson. Are Blogs Here to Stay?: An Examination of the Longevity and Currency of a Static List of Library and Information Science Weblogs, 2008.
[15] Tim Furche, et al. OXPath: A language for scalable data extraction, automation, and crawling on the deep web, 2012, The VLDB Journal.
[16] Alberto H. F. Laender, et al. Automatic web news extraction using tree edit distance, 2004, WWW '04.
[17] S. Amerio, et al. EUROPEAN ORGANIZATION FOR NUCLEAR RESEARCH (CERN), 2011.
[18] Torsten Suel, et al. Design and implementation of a high-performance distributed Web crawler, 2002, Proceedings 18th International Conference on Data Engineering.
[19] Marc Najork, et al. Mercator: A scalable, extensible Web crawler, 1999, World Wide Web.
[20] Sahibsingh A. Dudani. The Distance-Weighted k-Nearest-Neighbor Rule, 1976, IEEE Transactions on Systems, Man, and Cybernetics.
[21] Charlie Lindahl, et al. Weblogs: Simplifying Web Publishing, 2003, Computer.
[22] Muhammad Faheem. Intelligent crawling of web applications for web archiving, 2012, WWW.
[23] Alexandra I. Cristea, et al. Self-supervised Automated Wrapper Generation for Weblog Data Extraction, 2013, BNCOD.
[24] Sebastiano Vigna, et al. UbiCrawler: a scalable fully distributed Web crawler, 2004, Softw. Pract. Exp.
[25] Brian Lavoie. Meeting the challenges of digital preservation: the OAIS reference model, 2000.
[26] Matthias Trier, et al. The Blogosphere as Oeuvre: Individual and Collective Influence on Bloggers, 2012, ECIS 2012.
[27] Peter Fankhauser, et al. Boilerplate detection using shallow text features, 2010, WSDM '10.
[28] Linda Cantara. METS: The Metadata Encoding and Transmission Standard, 2005.