RetriBlog: a framework for creating blog crawlers
Evandro Costa | Henrique Pacca Loureiro Luna | Rinaldo Lima | Rafael Ferreira | Frederico Luiz Gonçalves de Freitas | Jean Melo
[1] K. Fujimura, et al. BLOGRANGER – A Multi-faceted Blog Search Engine, 2006.
[2] Bernardo A. Huberman, et al. Usage patterns of collaborative tagging systems, 2006, J. Inf. Sci.
[3] Fabrizio Sebastiani, et al. Machine learning in automated text categorization, 2001, CSUR.
[4] Tim Weninger, et al. Text Extraction from the Web via Text-to-Tag Ratio, 2008, 2008 19th International Workshop on Database and Expert Systems Applications.
[5] Otis Gospodnetic, et al. Lucene in Action (In Action series), 2004.
[6] Wei Li, et al. QuASM: a system for question answering using semi-structured data, 2002, JCDL '02.
[7] Jennifer Jie Xu, et al. A Blog Mining Framework, 2009, IT Professional.
[8] Peter Fankhauser, et al. Boilerplate detection using shallow text features, 2010, WSDM '10.
[9] Martin F. Porter, et al. An algorithm for suffix stripping, 1997, Program.
[10] Hua Qian, et al. Anonymity and Self-Disclosure on Weblogs, 2007, J. Comput. Mediat. Commun.
[11] Frederic P. Miller, et al. Levenshtein Distance: Information theory, Computer science, String (computer science), String metric, Damerau–Levenshtein distance, Spell checker, Hamming distance, 2009.
[12] Shengyi Jiang, et al. An improved K-nearest-neighbor algorithm for text categorization, 2012, Expert Syst. Appl.
[13] Thomas Gottron. Evaluating Content Extraction on HTML Documents, 2007.
[14] Andreas Hotho, et al. A Brief Survey of Text Mining, 2005, LDV Forum.
[15] Mukul Joshi, et al. BlogHarvest: Blog Mining and Search Framework, 2006, COMAD.
[16] Hinrich Schütze, et al. Introduction to information retrieval, 2008.
[17] L. R. Rasmussen, et al. Information retrieval: data structures and algorithms, 1992.