Self-supervised Automated Wrapper Generation for Weblog Data Extraction
Alexandra I. Cristea | Mike Joy | Karen Stepanyan | George Gkotsis
[1] Bing Liu, et al. Web data extraction based on partial tree alignment, 2005, WWW '05.
[2] Calton Pu, et al. XWRAP: an XML-enabled wrapper construction system for Web information sources, 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).
[3] Christopher D. Manning, et al. Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling, 2005, ACL.
[4] Berthier A. Ribeiro-Neto, et al. A brief survey of web data extraction tools, 2002, SIGMOD Rec.
[5] Peter Fankhauser, et al. Boilerplate detection using shallow text features, 2010, WSDM '10.
[6] Télécom Paristech. Archiving Data Objects using Web Feeds, 2010.
[7] Ian H. Witten, et al. Data mining: practical machine learning tools and techniques, 3rd Edition, 1999.
[8] Brad Adelberg, et al. NoDoSE—a tool for semi-automatically extracting structured and semistructured data from text documents, 1998, SIGMOD '98.
[9] Oren Etzioni, et al. TextRunner: Open Information Extraction on the Web, 2007, NAACL.
[10] Valter Crescenzi, et al. RoadRunner: Towards Automatic Data Extraction from Large Web Sites, 2001, VLDB.
[11] Georg Gottlob, et al. Visual Web Information Extraction with Lixto, 2001, VLDB.
[12] Georg Gottlob, et al. Web Data Extraction System, 2009, Encyclopedia of Database Systems.
[13] Craig A. Knoblock, et al. Hierarchical Wrapper Induction for Semistructured Information Sources, 2004, Autonomous Agents and Multi-Agent Systems.
[14] Maureen Pennock, et al. ArchivePress: A Really Simple Solution to Archiving Blog Content, 2009, iPRES.
[15] William E. Winkler, et al. String Comparator Metrics and Enhanced Decision Rules in the Fellegi-Sunter Model of Record Linkage, 1990.
[16] Ian H. Witten, et al. Data mining - practical machine learning tools and techniques, Second Edition, 2005, The Morgan Kaufmann series in data management systems.
[17] Ian Witten, et al. Data Mining, 2000.
[18] Nicholas Kushmerick, et al. Wrapper induction: Efficiency and expressiveness, 2000, Artif. Intell.
[19] W. Dutton, et al. Next Generation Users: The Internet in Britain, 2011.
[20] Kai-Uwe Kühnberger, et al. Classification of Documents Based on the Structure of Their DOM Trees, 2007, ICONIP.