Entropy-based automated wrapper generation for weblog data extraction
Alexandra I. Cristea | Mike Joy | Karen Stepanyan | George Gkotsis
[1] Shunsuke Ihara,et al. Information theory - for continuous systems , 1993 .
[2] Alexandra I. Cristea,et al. Self-supervised Automated Wrapper Generation for Weblog Data Extraction , 2013, BNCOD.
[3] Li Yujian,et al. A Normalized Levenshtein Distance Metric , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[4] William E. Winkler,et al. AN APPLICATION OF THE FELLEGI-SUNTER MODEL OF RECORD LINKAGE TO THE 1990 U.S. DECENNIAL CENSUS , 1987 .
[5] Bing Liu,et al. Web data extraction based on partial tree alignment , 2005, WWW '05.
[6] Maureen Pennock,et al. ArchivePress: A Really Simple Solution to Archiving Blog Content , 2009, iPRES.
[7] Marilena Oita,et al. Archiving Data Objects using Web Feeds , 2010 .
[8] Georg Gottlob,et al. Web Data Extraction System , 2009, Encyclopedia of Database Systems.
[9] Craig A. Knoblock,et al. Hierarchical Wrapper Induction for Semistructured Information Sources , 2004, Autonomous Agents and Multi-Agent Systems.
[10] Calton Pu,et al. XWRAP: an XML-enabled wrapper construction system for Web information sources , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).
[11] Pierre Senellart,et al. Intelligent and Adaptive Crawling of Web Applications for Web Archiving , 2013, ICWE.
[12] Berthier A. Ribeiro-Neto,et al. A brief survey of web data extraction tools , 2002, SGMD.
[13] Ian H. Witten,et al. Data mining - practical machine learning tools and techniques, Second Edition , 2005, The Morgan Kaufmann series in data management systems.
[14] อนิรุธ สืบสิงห์,et al. Data Mining Practical Machine Learning Tools and Techniques , 2014 .
[15] Brad Adelberg,et al. NoDoSE—a tool for semi-automatically extracting structured and semistructured data from text documents , 1998, SIGMOD '98.
[16] Peter Fankhauser,et al. Boilerplate detection using shallow text features , 2010, WSDM '10.
[17] Kristinn Sigurðsson. Incremental Crawling with Heritrix , 2010 .
[18] Kai-Uwe Kühnberger,et al. Classification of Documents Based on the Structure of Their DOM Trees , 2007, ICONIP.
[19] Ahmed K. Elmagarmid,et al. Duplicate Record Detection: A Survey , 2007, IEEE Transactions on Knowledge and Data Engineering.
[20] Valter Crescenzi,et al. RoadRunner: Towards Automatic Data Extraction from Large Web Sites , 2001, VLDB.
[21] J. Ross Quinlan,et al. Induction of Decision Trees , 1986, Machine Learning.
[22] Nicholas Kushmerick,et al. Wrapper induction: Efficiency and expressiveness , 2000, Artif. Intell..
[23] W. Dutton,et al. Next Generation Users: The Internet in Britain , 2011 .
[24] Karl Rihaczek,et al. 1. WHAT IS DATA MINING? , 2019, Data Mining for the Social Sciences.
[25] Christoph Meinel,et al. Mapping the Blogosphere--Towards a Universal and Scalable Blog-Crawler , 2011, 2011 IEEE Third Int'l Conference on Privacy, Security, Risk and Trust and 2011 IEEE Third Int'l Conference on Social Computing.
[26] Bing Liu,et al. Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data , 2006, Data-Centric Systems and Applications.
[27] Georg Gottlob,et al. Visual Web Information Extraction with Lixto , 2001, VLDB.
[28] Kweku-Muata Bryson,et al. Comparison of two families of entropy-based classification measures with and without feature selection , 2001, Proceedings of the 34th Annual Hawaii International Conference on System Sciences.