Crawling programs for wrapper-based applications
Valter Crescenzi | Paolo Merialdo | Claudio Bertoli
[1] Padmini Srinivasan,et al. Learning to crawl: Comparing classification schemes , 2005, TOIS.
[2] Nicholas Kushmerick,et al. Wrapper Induction for Information Extraction , 1997, IJCAI.
[3] Dayne Freitag,et al. Information Extraction from HTML: Application of a General Machine Learning Approach , 1998, AAAI/IAAI.
[4] Georg Gottlob,et al. Visual Web Information Extraction with Lixto , 2001, VLDB.
[5] Edleno Silva de Moura,et al. GoGetIt!: a tool for generating structure-driven web crawlers , 2006, WWW '06.
[6] Frederick H. Lochovsky,et al. Data-rich section extraction from HTML pages , 2002, Proceedings of the Third International Conference on Web Information Systems Engineering, 2002. WISE 2002.
[7] Valter Crescenzi,et al. RoadRunner: Towards Automatic Data Extraction from Large Web Sites , 2001, VLDB.
[8] Khaled Shaalan,et al. A Survey of Web Information Extraction Systems , 2006, IEEE Transactions on Knowledge and Data Engineering.
[9] Ming-Syan Chen,et al. Mining Web informative structures and contents based on entropy analysis , 2004, IEEE Transactions on Knowledge and Data Engineering.
[10] Stephen Soderland,et al. Learning Information Extraction Rules for Semi-Structured and Free Text , 1999, Machine Learning.
[11] Georg Gottlob,et al. Visual Programming of Web Data Aggregation Applications , 2003, IIWeb.
[12] Chun-Nan Hsu,et al. Generating Finite-State Transducers for Semi-Structured Data Extraction from the Web , 1998, Inf. Syst.
[13] Valter Crescenzi,et al. Clustering Web pages based on their structure , 2005, Data Knowl. Eng.
[14] Edleno Silva de Moura,et al. Structure-driven crawler generation by example , 2006, SIGIR.
[15] Valter Crescenzi,et al. RoadRunner: automatic data extraction from data-intensive web sites , 2002, SIGMOD '02.
[16] Nicholas Kushmerick,et al. Wrapper induction: Efficiency and expressiveness , 2000, Artif. Intell.
[17] Gerhard Weikum,et al. The BINGO! System for Information Portal Generation and Expert Web Search , 2003, CIDR.
[18] Michael Benedikt,et al. VeriWeb: Automatically Testing Dynamic Web Sites , 2002.
[19] Juliana Freire,et al. Automating Web navigation with the WebVCR , 2000, Comput. Networks.
[20] Oren Etzioni,et al. A Grammar Inference Algorithm for the World Wide Web , 2002.
[21] Hector Garcia-Molina,et al. Extracting structured data from Web pages , 2003, SIGMOD '03.
[22] Tobias Anton. XPath-Wrapper Induction by generating tree traversal patterns , 2005, LWA.
[23] Robert Baumgartner,et al. DeepWeb Navigation in Web Data Extraction , 2005, International Conference on Computational Intelligence for Modelling, Control and Automation and International Conference on Intelligent Agents, Web Technologies and Internet Commerce (CIMCA-IAWTIC'06).
[24] Valter Crescenzi,et al. Automatic information extraction from large websites , 2004, JACM.
[25] Craig A. Knoblock,et al. A hierarchical approach to wrapper induction , 1999, AGENTS '99.
[26] Martin van den Berg,et al. Focused Crawling: A New Approach to Topic-Specific Web Resource Discovery , 1999, Comput. Networks.
[27] Ee-Peng Lim,et al. An Automated Algorithm for Extracting Website Skeleton , 2004, DASFAA.
[28] Valter Crescenzi,et al. Fine-grain web site structure discovery , 2003, WIDM '03.