Web Page Classification and Hierarchy Adaptation

[1]  David M. Pennock,et al.  The structure of broad topics on the web , 2002, WWW.

[2]  Weiming Hu,et al.  A Novel Web Page Filtering System by Combining Texts and Images , 2006, 2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI 2006 Main Conference Proceedings)(WI'06).

[3]  Songbo Tan,et al.  Combining error-correcting output codes and model-refinement for text categorization , 2007, SIGIR.

[4]  Osmar R. Zaïane,et al.  Finding Similar Queries to Satisfy Searches Based on Query Traces , 2002, OOIS Workshops.

[5]  Monika Henzinger,et al.  A Comprehensive Study of Features and Algorithms for URL-Based Topic Classification , 2011, TWEB.

[6]  Dell Zhang,et al.  Question classification using support vector machines , 2003, SIGIR.

[7]  Stephen E. Robertson,et al.  Simple BM25 extension to multiple weighted fields , 2004, CIKM '04.

[8]  Brian D. Davison,et al.  Looking into the past to better classify web spam , 2009, AIRWeb '09.

[9]  Scott Nowson The Language of Weblogs: A study of genre and individual differences , 2006 .

[10]  Javed Mostafa,et al.  An application of text categorization methods to gene ontology annotation , 2005, SIGIR '05.

[11]  Chris H. Q. Ding,et al.  Web document clustering using hyperlink structures , 2001, Comput. Stat. Data Anal..

[12]  Yiming Yang,et al.  A Study of Approaches to Hypertext Categorization , 2002, Journal of Intelligent Information Systems.

[13]  Xing Xie,et al.  A comparative study on classifying the functions of web page blocks , 2006, CIKM '06.

[14]  Ee-Peng Lim,et al.  Hierarchical text classification and evaluation , 2001, Proceedings 2001 IEEE International Conference on Data Mining.

[15]  Foster J. Provost,et al.  Classification in Networked Data: a Toolkit and a Univariate Case Study , 2007, J. Mach. Learn. Res..

[16]  Haym Hirsh,et al.  Using LSI for text classification in the presence of background text , 2001, CIKM '01.

[17]  Christopher C. Yang,et al.  Web site topic-hierarchy generation based on link structure , 2009 .

[18]  Azriel Rosenfeld,et al.  Scene Labeling by Relaxation Operations , 1976, IEEE Transactions on Systems, Man, and Cybernetics.

[19]  Min-Yen Kan Web page classification without the web page , 2004, WWW Alt. '04.

[20]  Jiawei Han,et al.  PEBL: Web page classification without negative examples , 2004, IEEE Transactions on Knowledge and Data Engineering.

[21]  Evgeniy Gabrilovich,et al.  Harnessing the Expertise of 70,000 Human Editors: Knowledge-Based Feature Generation for Text Categorization , 2007, J. Mach. Learn. Res..

[22]  Qiang Yang,et al.  Deep classification in large-scale text hierarchies , 2008, SIGIR '08.

[23]  Richard O. Duda,et al.  Pattern classification and scene analysis , 1974, A Wiley-Interscience publication.

[24]  Filippo Menczer,et al.  Algorithmic detection of semantic similarity , 2005, WWW '05.

[25]  Brian D. Davison,et al.  Adversarial Web Search , 2011, Found. Trends Inf. Retr..

[26]  Thomas Hofmann,et al.  Text classification in a hierarchical mixture model for small training sets , 2001, CIKM '01.

[27]  Aixin Sun,et al.  Blog Classification Using Tags: An Empirical Study , 2007, ICADL.

[28]  T. Landauer,et al.  Indexing by Latent Semantic Analysis , 1990 .

[29]  Gerhard Weikum,et al.  Query-Log Based Authority Analysis for Web Information Search , 2004, WISE.

[30]  Martin van den Berg,et al.  Focused Crawling: A New Approach to Topic-Specific Web Resource Discovery , 1999, Comput. Networks.

[31]  Masatoshi Yoshikawa,et al.  Refinement of TF-IDF schemes for web pages using their hyperlinked neighboring pages , 2003, HYPERTEXT '03.

[32]  Qiang Yang,et al.  Deep classifier: automatically categorizing search results into large-scale hierarchies , 2008, WSDM '08.

[33]  Sung-Hyon Myaeng,et al.  A practical hypertext catergorization method using links and incrementally available class information , 2000, SIGIR '00.

[34]  Ji-Rong Wen,et al.  Query clustering using user logs , 2002, TOIS.

[35]  David A. Cohn,et al.  The Missing Link - A Probabilistic Model of Document Content and Hypertext Connectivity , 2000, NIPS.

[36]  Gerhard Weikum,et al.  YAGO: A Core of Semantic Knowledge , 2007, WWW '07.

[37]  Gilad Mishne,et al.  Capturing Global Mood Levels using Blog Posts , 2006, AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs.

[38]  Hong Qu,et al.  Automated Blog Classification: Challenges and Pitfalls , 2006, AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs.

[39]  Brian D. Davison,et al.  Enhancing web search with entity intent , 2011, WWW.

[40]  Rohini K. Srihari,et al.  Using Verbs and Adjectives to Automatically Classify Blog Sentiment , 2006, AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs.

[41]  Oren Etzioni,et al.  Scaling question answering to the Web , 2001, WWW '01.

[42]  Arlindo L. Oliveira,et al.  An Empirical Comparison of Text Categorization Methods , 2003, SPIRE.

[43]  Tom M. Mitchell,et al.  Improving Text Classification by Shrinkage in a Hierarchy of Classes , 1998, ICML.

[44]  William W. Cohen Improving a Page Classifier with Anchor Extraction and Link Analysis , 2002, NIPS.

[45]  Michael McGill,et al.  Introduction to Modern Information Retrieval , 1983 .

[46]  Thorsten Joachims,et al.  WebWatcher: A Tour Guide for the World Wide Web , 1997, IJCAI.

[47]  Wen Gao,et al.  Two-phase Web site classification based on hidden Markov tree models , 2003, Proceedings IEEE/WIC International Conference on Web Intelligence (WI 2003).

[48]  Andreas Hotho,et al.  Tag Recommendations in Folksonomies , 2007, LWA.

[49]  Joydeep Ghosh,et al.  Automatically learning document taxonomies for hierarchical classification , 2005, WWW '05.

[50]  Yiming Yang,et al.  A Comparative Study on Feature Selection in Text Categorization , 1997, ICML.

[51]  Brian D. Davison,et al.  Hierarchy evolution for improved classification , 2011, CIKM '11.

[52]  Grace Hui Yang,et al.  Effectiveness of web page classification on finding list answers , 2004, SIGIR '04.

[53]  Ellen Riloff,et al.  Learning and Evaluating the Content and Structure of a Term Taxonomy , 2009, AAAI Spring Symposium: Learning by Reading and Learning to Read.

[54]  Matthew Richardson,et al.  The Intelligent surfer: Probabilistic Combination of Link and Content Information in PageRank , 2001, NIPS.

[55]  Pu-Jen Cheng,et al.  Query taxonomy generation for web search , 2006, CIKM '06.

[56]  Paul N. Bennett,et al.  Refined experts: improving classification in large taxonomies , 2009, SIGIR.

[57]  Hugh E. Williams,et al.  Strategies for minimising errors in hierarchical web categorisation , 2002, CIKM '02.

[58]  Jong-Hyeok Lee,et al.  Text categorization based on k-nearest neighbor approach for Web site classification , 2003, Inf. Process. Manag..

[59]  Andrei Z. Broder,et al.  A semantic approach to contextual advertising , 2007, SIGIR.

[60]  Shui-Lung Chuang,et al.  Automatic query taxonomy generation for information retrieval applications , 2003, Online Inf. Rev..

[61]  Evgeniy Gabrilovich,et al.  Parameterized generation of labeled datasets for text categorization based on a hierarchical directory , 2004, SIGIR '04.

[62]  Shui-Lung Chuang,et al.  Using a web-based categorization approach to generate thematic metadata from texts , 2004, TALIP.

[63]  Jennifer Neville,et al.  Why collective inference improves relational classification , 2004, KDD.

[64]  Shui-Lung Chuang,et al.  Towards automatic generation of query taxonomy: a hierarchical query clustering approach , 2002, 2002 IEEE International Conference on Data Mining, 2002. Proceedings..

[65]  Rajeev Motwani,et al.  The PageRank Citation Ranking: Bringing Order to the Web , 1999, WWW 1999.

[66]  Brian D. Davison,et al.  Choosing your own adventure: automatic taxonomy generation to permit many paths , 2010, CIKM.

[67]  Avrim Blum,et al.  The Bottleneck , 2021, Monopsony Capitalism.

[68]  Taher H. Haveliwala Topic-sensitive PageRank , 2002, IEEE Trans. Knowl. Data Eng..

[69]  Thomas G. Dietterich,et al.  Solving Multiclass Learning Problems via Error-Correcting Output Codes , 1994, J. Artif. Intell. Res..

[70]  Piotr Indyk,et al.  Enhanced hypertext categorization using hyperlinks , 1998, SIGMOD '98.

[71]  Douglas Thain,et al.  Distributed computing in practice: the Condor experience , 2005, Concurr. Pract. Exp..

[72]  Christoph Lindemann,et al.  Coarse-grained classification of web sites by their structural properties , 2006, WIDM '06.

[73]  Hans-Peter Kriegel,et al.  Web site mining: a new way to spot competitors, customers and suppliers in the world wide web , 2002, KDD.

[74]  Hugh E. Williams,et al.  Simple and accurate feature selection for hierarchical categorisation , 2002, DocEng '02.

[75]  Sanda M. Harabagiu,et al.  Experiments with Open-Domain Textual Question Answering , 2000, COLING.

[76]  Wei Liu,et al.  Importance-Based Web Page Classification Using Cost-Sensitive SVM , 2005, WAIM.

[77]  Minyi Guo,et al.  A class-feature-centroid classifier for text categorization , 2009, WWW '09.

[78]  John M. Pierre,et al.  On the Automated Classification of Web Sites , 2001, ArXiv.

[79]  Huan Liu,et al.  Topic taxonomy adaptation for group profiling , 2008, TKDD.

[80]  Susan T. Dumais,et al.  Bringing order to the Web: automatically categorizing search results , 2000, CHI.

[81]  Monika Henzinger,et al.  Purely URL-based topic classification , 2009, WWW '09.

[82]  Huan Liu,et al.  Acclimatizing Taxonomic Semantics for Hierarchical Content Classification , 2006, KDD '06.

[83]  Hugo Liu,et al.  A Corpus-based Approach to Finding Happiness , 2006, AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs.

[84]  Min-Yen Kan,et al.  Fast webpage classification using URL features , 2005, CIKM '05.

[85]  Lalit M. Patnaik,et al.  Adaptive probabilities of crossover and mutation in genetic algorithms , 1994, IEEE Trans. Syst. Man Cybern..

[86]  Johannes Fürnkranz,et al.  Exploiting Structural Information for Text Classification on the WWW , 1999, IDA.

[87]  Sriram Raghavan,et al.  WebBase: a repository of Web pages , 2000, Comput. Networks.

[88]  Doug Beeferman,et al.  Agglomerative clustering of a search engine query log , 2000, KDD '00.

[89]  Berthier A. Ribeiro-Neto,et al.  Combining link-based and content-based methods for web document classification , 2003, CIKM '03.

[90]  Steffen Staab,et al.  Learning Concept Hierarchies from Text Corpora using Formal Concept Analysis , 2005, J. Artif. Intell. Res..

[91]  Ulf Hermjakob,et al.  Parsing and Question Classification for Question Answering , 2001, ACL 2001.

[92]  Vijay V. Vazirani,et al.  Approximation Algorithms , 2001, Springer Berlin Heidelberg.

[93]  Ee-Peng Lim,et al.  Web classification using support vector machine , 2002, WIDM '02.

[94]  Hector Garcia-Molina,et al.  Web Spam Taxonomy , 2005, AIRWeb.

[95]  Nello Cristianini,et al.  Composite Kernels for Hypertext Categorisation , 2001, ICML.

[96]  Richard M. Everson,et al.  When Are Links Useful? Experiments in Text Classification , 2003, ECIR.

[97]  Zenglin Xu,et al.  Web page classification with heterogeneous data fusion , 2007, WWW '07.

[98]  Qiang Yang,et al.  A comparison of implicit and explicit links for web page classification , 2006, WWW '06.

[99]  Liming Chen,et al.  WebGuard: Web based adult content detection and filtering system , 2003, Proceedings IEEE/WIC International Conference on Web Intelligence (WI 2003).

[100]  Giuseppe Attardi,et al.  Automatic Web Page Categorization by Link and Context Analysis , 1999 .

[101]  Mounia Lalmas,et al.  A probabilistic description-oriented approach for categorizing web documents , 1999, CIKM '99.

[102]  Rayid Ghani,et al.  Combining labeled and unlabeled data for text classification with a large number of categories , 2001, Proceedings 2001 IEEE International Conference on Data Mining.

[103]  Mika Käki,et al.  Findex: search result categories help users when document ranking fails , 2005, CHI.

[104]  M. Indra Devi,et al.  Feature Selection for Web Page Classification , 2009 .

[105]  Soumen Chakrabarti,et al.  Mining the web - discovering knowledge from hypertext data , 2002 .

[106]  Shui-Lung Chuang,et al.  Taxonomy generation for text segments: A practical web-based approach , 2005, TOIS.

[107]  Brian D. Davison,et al.  Classifiers without borders: incorporating fielded text from neighboring web pages , 2008, SIGIR '08.

[108]  Xiaogang Peng,et al.  Automatic web page classification in a dynamic and hierarchical way , 2002, 2002 IEEE International Conference on Data Mining, 2002. Proceedings..

[109]  Taher H. Haveliwala Topic-Sensitive PageRank: A Context-Sensitive Ranking Algorithm for Web Search , 2003, IEEE Trans. Knowl. Data Eng..

[110]  Rayid Ghani,et al.  Combining Labeled and Unlabeled Data for MultiClass Text Categorization , 2002, ICML.

[111]  Michael J. Pazzani,et al.  Syskill & Webert: Identifying Interesting Web Sites , 1996, AAAI/IAAI, Vol. 1.

[112]  P. Schmitz,et al.  Inducing Ontology from Flickr Tags , 2006 .

[113]  Andrei Z. Broder,et al.  Robust classification of rare queries using web knowledge , 2007, SIGIR.

[114]  George Karypis,et al.  A Comparison of Document Clustering Techniques , 2000 .

[115]  Natalie S. Glance,et al.  Community search assistant , 2001, IUI '01.

[116]  Byoung-Tak Zhang,et al.  Large Scale Unstructured Document Classification Using Unlabeled Data and Syntactic Information , 2003, PAKDD.

[117]  Thomas Hofmann,et al.  Hierarchical document categorization with support vector machines , 2004, CIKM '04.

[118]  Brian D. Davison,et al.  Topical link analysis for web search , 2006, SIGIR.

[119]  Fabrizio Sebastiani,et al.  A Tutorial on Automated Text Categorisation , 2000 .

[120]  David M. Pennock,et al.  Inferring hierarchical descriptions , 2002, CIKM '02.

[121]  Daphne Koller,et al.  Support Vector Machine Active Learning with Applications to Text Classification , 2000, J. Mach. Learn. Res..

[122]  W. Bruce Croft,et al.  Hierarchical Language Models for Expert Finding in Enterprise Corpora , 2006, 2006 18th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'06).

[123]  Larry Fitzpatrick,et al.  Automatic feedback using past queries: social searching? , 1997, SIGIR '97.

[124]  Qiang Yang,et al.  Reinforcing Web-object Categorization Through Interrelationships , 2006, Data Mining and Knowledge Discovery.

[125]  Tom M. Mitchell,et al.  Discovering Test Set Regularities in Relational Domains , 2000, ICML.

[126]  Thomas Hofmann,et al.  The Cluster-Abstraction Model: Unsupervised Learning of Topic Hierarchies from Text Data , 1999, IJCAI.

[127]  Steffen Staab,et al.  Comparing ontologies - similarity measures and a comparison study , 2001 .

[128]  Simone Paolo Ponzetto,et al.  Deriving a Large-Scale Taxonomy from Wikipedia , 2007, AAAI.

[129]  Bettina Berendt,et al.  Tags are not metadata, but "just more content" - to some people , 2007, ICWSM.

[130]  Brian D. Davison,et al.  Bridging link and query intent to enhance web search , 2011, HT '11.

[131]  Simone Paolo Ponzetto,et al.  Large-Scale Taxonomy Mapping for Restructuring and Integrating Wikipedia , 2009, IJCAI.

[132]  Subhash C. Bagui,et al.  Combining Pattern Classifiers: Methods and Algorithms , 2005, Technometrics.

[133]  Ah-Hwee Tan,et al.  Text Mining: The state of the art and the challenges , 2000 .

[134]  Brian D. Davison,et al.  Measuring similarity to detect qualified links , 2007, AIRWeb '07.

[135]  Svetlana Kiritchenko,et al.  Hierarchical text categorization and its application to bioinformatics , 2006 .

[136]  Gerhard Weikum,et al.  Graph-based text classification: learn from your neighbors , 2006, SIGIR.

[137]  Yihong Gong,et al.  Multi-labelled classification using maximum entropy method , 2005, SIGIR '05.

[138]  Evgeniy Gabrilovich,et al.  Feature Generation for Text Categorization Using World Knowledge , 2005, IJCAI.

[139]  Thorsten Joachims,et al.  WebWatcher: A Learning Apprentice for the World Wide Web , 1995 .

[140]  Vaughan R. Shanks,et al.  Fast categorisation of large document collections , 2001, Proceedings Eighth Symposium on String Processing and Information Retrieval.

[141]  Raghu Krishnapuram,et al.  Automatic Taxonomy Generation: Issues and Possibilities , 2003, IFSA.

[142]  Lise Getoor,et al.  Link-Based Classification , 2003, Encyclopedia of Machine Learning and Data Mining.

[143]  Stefan Siersdorfer,et al.  A neighborhood-based approach for clustering of linked document collections , 2006, CIKM '06.

[144]  Maarten de Rijke,et al.  Learning to Recognize Blogs: A Preliminary Exploration , 2006 .

[145]  Maarten de Rijke,et al.  Finding experts and their details in e-mail corpora , 2006, WWW '06.

[146]  Thomas Hofmann,et al.  Probabilistic Latent Semantic Analysis , 1999, UAI.

[147]  Yiming Yang,et al.  A scalability analysis of classifiers in text categorization , 2003, SIGIR.

[148]  Tong Zhang,et al.  Linear prediction models with graph regularization for web-page categorization , 2006, KDD '06.

[149]  Joseph Kaye,et al.  Understanding how bloggers feel: recognizing affect in blog posts , 2006, CHI Extended Abstracts.

[150]  Filippo Menczer,et al.  Mapping the semantics of Web text and links , 2005, IEEE Internet Computing.

[151]  Grace Hui Yang,et al.  Web-based List Question Answering , 2004, COLING.

[152]  Abdur Chowdhury,et al.  Using titles and category names from editor-driven taxonomies for automatic evaluation , 2003, CIKM '03.

[153]  Qiang Yang,et al.  Exploiting the hierarchical structure for link analysis , 2005, SIGIR '05.

[154]  Wei-Ying Ma,et al.  OCFS: optimal orthogonal centroid feature selection for text categorization , 2005, SIGIR '05.

[155]  Jong-Hyeok Lee,et al.  Web page classification based on k-nearest neighbor approach , 2000, IRAL '00.

[156]  Paul Clough,et al.  Automatically organising images using concept hierarchies , 2005 .

[157]  Gerd Stumme,et al.  Formal Concept Analysis: foundations and applications , 2005 .

[158]  David Carmel,et al.  The connectivity sonar: detecting site functionality by structural patterns , 2003, HYPERTEXT '03.

[159]  David M. Pennock,et al.  Using web structure for classifying and describing web pages , 2002, WWW.

[160]  Shui-Lung Chuang,et al.  Liveclassifier: creating hierarchical text classifiers through web corpora , 2004, WWW '04.

[161]  Yiming Yang,et al.  Support vector machines classification with a very large-scale taxonomy , 2005, SKDD.

[162]  R. A. Fisher,et al.  Statistical Tables for Biological, Agricultural and Medical Research , 1956 .

[163]  Thorsten Joachims,et al.  Making large-scale support vector machine learning practical , 1999 .

[164]  Yasuhiro Suzuki,et al.  Automatically collecting, monitoring, and mining japanese weblogs , 2004, WWW Alt. '04.

[165]  W. Bruce Croft,et al.  Deriving concept hierarchies from text , 1999, SIGIR '99.

[166]  George A. Miller,et al.  WordNet: A Lexical Database for English , 1995, HLT.

[167]  Susan T. Dumais,et al.  Hierarchical classification of Web content , 2000, SIGIR '00.

[168]  Fabrizio Silvestri,et al.  Know your neighbors: web spam detection using the web topology , 2007, SIGIR.

[169]  Brian D. Davison Topical locality in the Web , 2000, SIGIR '00.

[170]  Dunja Mladenic,et al.  Turning Yahoo to Automatic Web-Page Classifier , 1998, European Conference on Artificial Intelligence.

[171]  Evgeniy Gabrilovich,et al.  Overcoming the Brittleness Bottleneck using Wikipedia: Enhancing Text Categorization with Encyclopedic Knowledge , 2006, AAAI.

[172]  Brian D. Davison,et al.  Web page classification: Features and algorithms , 2009, CSUR.

[173]  Tom M. Mitchell,et al.  Learning to Extract Symbolic Knowledge from the World Wide Web , 1998, AAAI/IAAI.

[174]  Johannes Fürnkranz,et al.  Link-Local Features for Hypertext Classification , 2005, EWMF/KDO.

[175]  Oren Kurland,et al.  PageRank without hyperlinks: structural re-ranking using links induced by language models , 2005, SIGIR '05.

[176]  Leo Breiman,et al.  Classification and Regression Trees , 1984 .

[177]  Brian D. Davison,et al.  Knowing a web page by the company it keeps , 2006, CIKM '06.

[178]  Csaba Veres,et al.  The Language of Folksonomies: What Tags Reveal About User Classification , 2006, NLDB.

[179]  Koraljka Golub,et al.  Importance of HTML Structural Elements and Metadata in Automated Subject Classification , 2005, ECDL.

[180]  Wolfgang Nejdl,et al.  Utility analysis for topically biased PageRank , 2007, WWW '07.

[181]  Grigorios Tsoumakas,et al.  Multi-Label Classification: An Overview , 2007, Int. J. Data Warehous. Min..

[182]  Philip Resnik,et al.  Semantic Similarity in a Taxonomy: An Information-Based Measure and its Application to Problems of Ambiguity in Natural Language , 1999, J. Artif. Intell. Res..

[183]  Shourya Roy,et al.  A hierarchical monothetic document clustering algorithm for summarization and browsing search results , 2004, WWW '04.

[184]  Brian D. Davison The potential of the metasearch engine , 2005, ASIST.

[185]  Benno Stein,et al.  Genre Classification of Web Pages , 2004, KI.

[186]  Yiming Yang,et al.  Hypertext Categorization using Hyperlink Patterns and Meta Data , 2001, ICML.

[187]  Hector Garcia-Molina,et al.  Collaborative Creation of Communal Hierarchical Taxonomies in Social Tagging Systems , 2006 .

[188]  A. Muller,et al.  The TaxGen framework: automating the generation of a taxonomy for a large document collection , 1999, Proceedings of the 32nd Annual Hawaii International Conference on Systems Sciences. 1999. HICSS-32. Abstracts and CD-ROM of Full Papers.

[189]  Dunja Mladenic,et al.  Text-learning and related intelligent agents: a survey , 1999, IEEE Intell. Syst..

[190]  Johannes Fürnkranz,et al.  Hyperlink ensembles: a case study in hypertext classification , 2002, Inf. Fusion.

[191]  Susan T. Dumais,et al.  The Combination of Text Classifiers Using Reliability Indicators , 2016, Information Retrieval.

[192]  Ophir Frieder,et al.  Using manually-built web directories for automatic evaluation of known-item retrieval , 2003, SIGIR.

[193]  Brian D. Davison,et al.  Diversifying Search Results with Popular Subtopics , 2009, TREC.

[194]  Yihong Gong,et al.  Combining content and link for classification using matrix factorization , 2007, SIGIR.

[195]  Fabrizio Sebastiani,et al.  Machine learning in automated text categorization , 2001, CSUR.

[196]  Evgeniy Gabrilovich,et al.  Text categorization with many redundant features: using aggressive feature selection to make SVMs competitive with C4.5 , 2004, ICML.

[197]  Stephen E. Robertson,et al.  Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval , 1994, SIGIR '94.

[198]  Kiyoshi Nitta,et al.  Improving taxonomies for large-scale hierarchical classifiers of web documents , 2010, CIKM.

[199]  Oren Kurland,et al.  Respect my authority!: HITS without hyperlinks, utilizing cluster-based language models , 2006, SIGIR.