Language Specific and Topic Focused Web Crawling
[1] W. B. Cavnar, et al. N-gram-based text categorization, 1994.
[2] Rayid Ghani, et al. Mining the web to create minority language corpora, 2001, CIKM '01.
[3] Michele Banko, et al. Scaling to Very Very Large Corpora for Natural Language Disambiguation, 2001, ACL.
[4] Martin van den Berg, et al. Focused Crawling: A New Approach to Topic-Specific Web Resource Discovery, 1999, Comput. Networks.
[5] Philip S. Yu, et al. Intelligent crawling on the World Wide Web with arbitrary predicates, 2001, WWW '01.
[6] J. Curran, et al. Domain-specific Web site identification: the CROSSMARC focused Web crawler, 2003.
[7] Ahmed Patel, et al. Building Topic-Specific Collections with Intelligent Agents, 1999, IS&N.
[8] Marco Gori, et al. Focused Crawling Using Context Graphs, 2000, VLDB.
[9] Jean Carletta, et al. Assessing Agreement on Classification Tasks: The Kappa Statistic, 1996, CL.
[10] Adam Rifkin, et al. Nutch: A Flexible and Scalable Open-Source Web Search Engine, 2005.
[11] Jimmy J. Lin, et al. Web question answering: is more always better?, 2002, SIGIR '02.
[12] Masaru Kitsuregawa, et al. Simulation Study of Language Specific Web Crawling, 2005, 21st International Conference on Data Engineering Workshops (ICDEW'05).
[13] D. Lindberg, et al. The Unified Medical Language System, 1993, Yearbook of Medical Informatics.
[14] Preslav Nakov, et al. A study of using search engine page hits as a proxy for n-gram frequencies, 2005.