Extracting Content from Web Pages Using the Sliding Window
[1] Chia-Hui Chang, et al. IEPAD: information extraction based on pattern discovery, 2001, WWW '01.
[2] Ben Wellner, et al. Adaptive web-page content identification, 2007, WIDM '07.
[3] Wei-Ying Ma, et al. Learning important models for web page blocks based on layout and content analysis, 2004, SKDD.
[4] Frederick H. Lochovsky, et al. Data-rich section extraction from HTML pages, 2002, Proceedings of the Third International Conference on Web Information Systems Engineering, 2002. WISE 2002.
[5] Wei Li, et al. QuASM: a system for question answering using semi-structured data, 2002, JCDL '02.
[6] Salvatore J. Stolfo, et al. Extracting context to improve accuracy for HTML content extraction, 2005, WWW '05.
[7] Sergey Brin, et al. The Anatomy of a Large-Scale Hypertextual Web Search Engine, 1998, Comput. Networks.
[8] Marco Gori, et al. Focused Crawling Using Context Graphs, 2000, VLDB.
[9] Gail E. Kaiser, et al. DOM-based content extraction of HTML documents, 2003, WWW '03.
[10] Wei-Ying Ma, et al. VIPS: a Vision-based Page Segmentation Algorithm, 2003.
[11] Berthier A. Ribeiro-Neto, et al. A brief survey of web data extraction tools, 2002, SGMD.
[12] Rajeev Motwani, et al. The PageRank Citation Ranking: Bringing Order to the Web, 1999, WWW 1999.
[13] James A. M. McHugh, et al. Mining the World Wide Web, 2001, The Information Retrieval Series.
[14] Calton Pu, et al. A fully automated object extraction system for the World Wide Web, 2001, Proceedings 21st International Conference on Distributed Computing Systems.
[15] Thomas Gottron. Combining content extraction heuristics: the CombinE system, 2008, iiWAS.
[16] Andreas Paepcke, et al. Power browser: efficient Web browsing for PDAs, 2000, CHI.
[17] Soumen Chakrabarti, et al. Integrating the document object model with hyperlinks for enhanced topic distillation and information extraction, 2001, WWW '01.
[18] Khaled Shaalan, et al. A Survey of Web Information Extraction Systems, 2006, IEEE Transactions on Knowledge and Data Engineering.
[19] Barry Smyth, et al. Fact or Fiction: Content Classification for Digital Libraries, 2001, DELOS.