xCrawl: a high-recall crawling method for Web mining
Gerhard Friedrich | Dietmar Jannach | Kostyantyn M. Shchekotykhin
[1] Hector Garcia-Molina, et al. Efficient Crawling Through URL Ordering, 1998, Comput. Networks.
[2] Hans-Peter Kriegel, et al. Accurate and Efficient Crawling for Relevant Websites, 2004, VLDB.
[3] Idit Keidar, et al. Do not crawl in the DUST: different URLs with similar text, 2006, WWW.
[4] Gerhard Friedrich, et al. AllRight: Automatic Ontology Instantiation from Tabular Web Documents, 2007, ISWC/ASWC.
[5] Boris Chidlovskii, et al. Crawling for domain-specific hidden Web resources, 2003, Proceedings of the Fourth International Conference on Web Information Systems Engineering (WISE 2003).
[6] Anirban Dasgupta, et al. The discoverability of the web, 2007, WWW '07.
[7] Taher H. Haveliwala. Topic-Sensitive PageRank: A Context-Sensitive Ranking Algorithm for Web Search, 2003, IEEE Trans. Knowl. Data Eng.
[8] Jiawei Han, et al. PEBL: Web page classification without negative examples, 2004, IEEE Transactions on Knowledge and Data Engineering.
[9] Hans-Peter Kriegel, et al. Focused Web Crawling: A Generic Framework for Specifying the User Interest and for Adaptive Crawling Strategies, 2001.
[10] Stephen E. Robertson, et al. On Term Selection for Query Expansion, 1991, J. Documentation.
[11] C. Lee Giles, et al. DEADLINER: building a new niche search engine, 2000, CIKM '00.
[12] Ravi Kumar, et al. Trawling the Web for Emerging Cyber-Communities, 1999, Comput. Networks.
[13] Wolfgang Gatterbauer, et al. Towards domain-independent information extraction from web tables, 2007, WWW '07.
[14] Tom M. Mitchell, et al. Learning to construct knowledge bases from the World Wide Web, 2000, Artif. Intell.
[15] Filippo Menczer, et al. Evaluating topic-driven web crawlers, 2001, SIGIR '01.
[16] Gerhard Friedrich, et al. Clustering web documents with tables for information extraction, 2007, K-CAP '07.
[17] Luis Gravano, et al. Querying text databases for efficient information extraction, 2003, Proceedings 19th International Conference on Data Engineering (Cat. No.03CH37405).
[18] Soumen Chakrabarti, et al. Mining the web - discovering knowledge from hypertext data, 2002.
[19] Sergey Brin, et al. The Anatomy of a Large-Scale Hypertextual Web Search Engine, 1998, Comput. Networks.
[20] Ian H. Witten, et al. Data mining: practical machine learning tools and techniques with Java implementations, 2002, SGMD.
[21] Gerhard Friedrich, et al. An Integrated Environment for the Development of Knowledge-Based Recommender Applications, 2006, Int. J. Electron. Commer.
[22] Martin van den Berg, et al. Focused Crawling: A New Approach to Topic-Specific Web Resource Discovery, 1999, Comput. Networks.
[23] Philip S. Yu, et al. Intelligent crawling on the World Wide Web with arbitrary predicates, 2001, WWW '01.
[24] Jon M. Kleinberg, et al. The Web as a Graph: Measurements, Models, and Methods, 1999, COCOON.
[25] Soumen Chakrabarti, et al. Accelerated focused crawling through online relevance feedback, 2002, WWW.
[26] Ramanathan V. Guha, et al. SemTag and seeker: bootstrapping the semantic web via automated semantic annotation, 2003, WWW '03.
[27] Arie van Deursen, et al. Crawling AJAX by Inferring User Interface State Changes, 2008, 2008 Eighth International Conference on Web Engineering.
[28] Kevin Chen-Chuan Chang, et al. PEBL: positive example based learning for Web page classification using SVM, 2002, KDD.
[29] Gerhard Friedrich, et al. Automated ontology instantiation from tabular web sources - The AllRight system, 2009, J. Web Semant.
[30] Marco Gori, et al. Focused Crawling Using Context Graphs, 2000, VLDB.
[31] Andrew McCallum, et al. Using Reinforcement Learning to Spider the Web Efficiently, 1999, ICML.
[32] Idit Keidar, et al. Do not crawl in the DUST: Different URLs with similar text, 2009, ACM Trans. Web.
[33] Jian Hu, et al. Using Wikipedia knowledge to improve text classification, 2009, Knowledge and Information Systems.
[34] Wanli Zuo, et al. SVM based adaptive learning method for text classification from positive and unlabeled documents, 2008, Knowledge and Information Systems.
[35] Christos Faloutsos, et al. Random walk with restart: fast solutions and applications, 2008, Knowledge and Information Systems.