Web Page Classification and Hierarchy Adaptation
[1] David M. Pennock,et al. The structure of broad topics on the web , 2002, WWW.
[2] Weiming Hu,et al. A Novel Web Page Filtering System by Combining Texts and Images , 2006, 2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI 2006 Main Conference Proceedings)(WI'06).
[3] Songbo Tan,et al. Combining error-correcting output codes and model-refinement for text categorization , 2007, SIGIR.
[4] Osmar R. Zaïane,et al. Finding Similar Queries to Satisfy Searches Based on Query Traces , 2002, OOIS Workshops.
[5] Monika Henzinger,et al. A Comprehensive Study of Features and Algorithms for URL-Based Topic Classification , 2011, TWEB.
[6] Dell Zhang,et al. Question classification using support vector machines , 2003, SIGIR.
[7] Stephen E. Robertson,et al. Simple BM25 extension to multiple weighted fields , 2004, CIKM '04.
[8] Brian D. Davison,et al. Looking into the past to better classify web spam , 2009, AIRWeb '09.
[9] Scott Nowson. The Language of Weblogs: A study of genre and individual differences , 2006 .
[10] Javed Mostafa,et al. An application of text categorization methods to gene ontology annotation , 2005, SIGIR '05.
[11] Chris H. Q. Ding,et al. Web document clustering using hyperlink structures , 2001, Comput. Stat. Data Anal..
[12] Yiming Yang,et al. A Study of Approaches to Hypertext Categorization , 2002, Journal of Intelligent Information Systems.
[13] Xing Xie,et al. A comparative study on classifying the functions of web page blocks , 2006, CIKM '06.
[14] Ee-Peng Lim,et al. Hierarchical text classification and evaluation , 2001, Proceedings 2001 IEEE International Conference on Data Mining.
[15] Foster J. Provost,et al. Classification in Networked Data: a Toolkit and a Univariate Case Study , 2007, J. Mach. Learn. Res..
[16] Haym Hirsh,et al. Using LSI for text classification in the presence of background text , 2001, CIKM '01.
[17] Christopher C. Yang,et al. Web site topic-hierarchy generation based on link structure , 2009 .
[18] Azriel Rosenfeld,et al. Scene Labeling by Relaxation Operations , 1976, IEEE Transactions on Systems, Man, and Cybernetics.
[19] Min-Yen Kan. Web page classification without the web page , 2004, WWW Alt. '04.
[20] Jiawei Han,et al. PEBL: Web page classification without negative examples , 2004, IEEE Transactions on Knowledge and Data Engineering.
[21] Evgeniy Gabrilovich,et al. Harnessing the Expertise of 70,000 Human Editors: Knowledge-Based Feature Generation for Text Categorization , 2007, J. Mach. Learn. Res..
[22] Qiang Yang,et al. Deep classification in large-scale text hierarchies , 2008, SIGIR '08.
[23] Richard O. Duda,et al. Pattern classification and scene analysis , 1974, A Wiley-Interscience publication.
[24] Filippo Menczer,et al. Algorithmic detection of semantic similarity , 2005, WWW '05.
[25] Brian D. Davison,et al. Adversarial Web Search , 2011, Found. Trends Inf. Retr..
[26] Thomas Hofmann,et al. Text classification in a hierarchical mixture model for small training sets , 2001, CIKM '01.
[27] Aixin Sun,et al. Blog Classification Using Tags: An Empirical Study , 2007, ICADL.
[28] T. Landauer,et al. Indexing by Latent Semantic Analysis , 1990 .
[29] Gerhard Weikum,et al. Query-Log Based Authority Analysis for Web Information Search , 2004, WISE.
[30] Martin van den Berg,et al. Focused Crawling: A New Approach to Topic-Specific Web Resource Discovery , 1999, Comput. Networks.
[31] Masatoshi Yoshikawa,et al. Refinement of TF-IDF schemes for web pages using their hyperlinked neighboring pages , 2003, HYPERTEXT '03.
[32] Qiang Yang,et al. Deep classifier: automatically categorizing search results into large-scale hierarchies , 2008, WSDM '08.
[33] Sung-Hyon Myaeng,et al. A practical hypertext categorization method using links and incrementally available class information , 2000, SIGIR '00.
[34] Ji-Rong Wen,et al. Query clustering using user logs , 2002, TOIS.
[35] David A. Cohn,et al. The Missing Link - A Probabilistic Model of Document Content and Hypertext Connectivity , 2000, NIPS.
[36] Gerhard Weikum,et al. YAGO: A Core of Semantic Knowledge , 2007, WWW '07.
[37] Gilad Mishne,et al. Capturing Global Mood Levels using Blog Posts , 2006, AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs.
[38] Hong Qu,et al. Automated Blog Classification: Challenges and Pitfalls , 2006, AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs.
[39] Brian D. Davison,et al. Enhancing web search with entity intent , 2011, WWW.
[40] Rohini K. Srihari,et al. Using Verbs and Adjectives to Automatically Classify Blog Sentiment , 2006, AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs.
[41] Oren Etzioni,et al. Scaling question answering to the Web , 2001, WWW '01.
[42] Arlindo L. Oliveira,et al. An Empirical Comparison of Text Categorization Methods , 2003, SPIRE.
[43] Tom M. Mitchell,et al. Improving Text Classification by Shrinkage in a Hierarchy of Classes , 1998, ICML.
[44] William W. Cohen. Improving a Page Classifier with Anchor Extraction and Link Analysis , 2002, NIPS.
[45] Michael McGill,et al. Introduction to Modern Information Retrieval , 1983 .
[46] Thorsten Joachims,et al. WebWatcher: A Tour Guide for the World Wide Web , 1997, IJCAI.
[47] Wen Gao,et al. Two-phase Web site classification based on hidden Markov tree models , 2003, Proceedings IEEE/WIC International Conference on Web Intelligence (WI 2003).
[48] Andreas Hotho,et al. Tag Recommendations in Folksonomies , 2007, LWA.
[49] Joydeep Ghosh,et al. Automatically learning document taxonomies for hierarchical classification , 2005, WWW '05.
[50] Yiming Yang,et al. A Comparative Study on Feature Selection in Text Categorization , 1997, ICML.
[51] Brian D. Davison,et al. Hierarchy evolution for improved classification , 2011, CIKM '11.
[52] Grace Hui Yang,et al. Effectiveness of web page classification on finding list answers , 2004, SIGIR '04.
[53] Ellen Riloff,et al. Learning and Evaluating the Content and Structure of a Term Taxonomy , 2009, AAAI Spring Symposium: Learning by Reading and Learning to Read.
[54] Matthew Richardson,et al. The Intelligent surfer: Probabilistic Combination of Link and Content Information in PageRank , 2001, NIPS.
[55] Pu-Jen Cheng,et al. Query taxonomy generation for web search , 2006, CIKM '06.
[56] Paul N. Bennett,et al. Refined experts: improving classification in large taxonomies , 2009, SIGIR.
[57] Hugh E. Williams,et al. Strategies for minimising errors in hierarchical web categorisation , 2002, CIKM '02.
[58] Jong-Hyeok Lee,et al. Text categorization based on k-nearest neighbor approach for Web site classification , 2003, Inf. Process. Manag..
[59] Andrei Z. Broder,et al. A semantic approach to contextual advertising , 2007, SIGIR.
[60] Shui-Lung Chuang,et al. Automatic query taxonomy generation for information retrieval applications , 2003, Online Inf. Rev..
[61] Evgeniy Gabrilovich,et al. Parameterized generation of labeled datasets for text categorization based on a hierarchical directory , 2004, SIGIR '04.
[62] Shui-Lung Chuang,et al. Using a web-based categorization approach to generate thematic metadata from texts , 2004, TALIP.
[63] Jennifer Neville,et al. Why collective inference improves relational classification , 2004, KDD.
[64] Shui-Lung Chuang,et al. Towards automatic generation of query taxonomy: a hierarchical query clustering approach , 2002, 2002 IEEE International Conference on Data Mining, 2002. Proceedings..
[65] Rajeev Motwani,et al. The PageRank Citation Ranking: Bringing Order to the Web , 1999, WWW 1999.
[66] Brian D. Davison,et al. Choosing your own adventure: automatic taxonomy generation to permit many paths , 2010, CIKM.
[67] Avrim Blum,et al. The Bottleneck , 2021, Monopsony Capitalism.
[68] Taher H. Haveliwala. Topic-sensitive PageRank , 2002, IEEE Trans. Knowl. Data Eng..
[69] Thomas G. Dietterich,et al. Solving Multiclass Learning Problems via Error-Correcting Output Codes , 1994, J. Artif. Intell. Res..
[70] Piotr Indyk,et al. Enhanced hypertext categorization using hyperlinks , 1998, SIGMOD '98.
[71] Douglas Thain,et al. Distributed computing in practice: the Condor experience , 2005, Concurr. Pract. Exp..
[72] Christoph Lindemann,et al. Coarse-grained classification of web sites by their structural properties , 2006, WIDM '06.
[73] Hans-Peter Kriegel,et al. Web site mining: a new way to spot competitors, customers and suppliers in the world wide web , 2002, KDD.
[74] Hugh E. Williams,et al. Simple and accurate feature selection for hierarchical categorisation , 2002, DocEng '02.
[75] Sanda M. Harabagiu,et al. Experiments with Open-Domain Textual Question Answering , 2000, COLING.
[76] Wei Liu,et al. Importance-Based Web Page Classification Using Cost-Sensitive SVM , 2005, WAIM.
[77] Minyi Guo,et al. A class-feature-centroid classifier for text categorization , 2009, WWW '09.
[78] John M. Pierre,et al. On the Automated Classification of Web Sites , 2001, ArXiv.
[79] Huan Liu,et al. Topic taxonomy adaptation for group profiling , 2008, TKDD.
[80] Susan T. Dumais,et al. Bringing order to the Web: automatically categorizing search results , 2000, CHI.
[81] Monika Henzinger,et al. Purely URL-based topic classification , 2009, WWW '09.
[82] Huan Liu,et al. Acclimatizing Taxonomic Semantics for Hierarchical Content Classification , 2006, KDD '06.
[83] Hugo Liu,et al. A Corpus-based Approach to Finding Happiness , 2006, AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs.
[84] Min-Yen Kan,et al. Fast webpage classification using URL features , 2005, CIKM '05.
[85] Lalit M. Patnaik,et al. Adaptive probabilities of crossover and mutation in genetic algorithms , 1994, IEEE Trans. Syst. Man Cybern..
[86] Johannes Fürnkranz,et al. Exploiting Structural Information for Text Classification on the WWW , 1999, IDA.
[87] Sriram Raghavan,et al. WebBase: a repository of Web pages , 2000, Comput. Networks.
[88] Doug Beeferman,et al. Agglomerative clustering of a search engine query log , 2000, KDD '00.
[89] Berthier A. Ribeiro-Neto,et al. Combining link-based and content-based methods for web document classification , 2003, CIKM '03.
[90] Steffen Staab,et al. Learning Concept Hierarchies from Text Corpora using Formal Concept Analysis , 2005, J. Artif. Intell. Res..
[91] Ulf Hermjakob,et al. Parsing and Question Classification for Question Answering , 2001, ACL 2001.
[92] Vijay V. Vazirani,et al. Approximation Algorithms , 2001, Springer Berlin Heidelberg.
[93] Ee-Peng Lim,et al. Web classification using support vector machine , 2002, WIDM '02.
[94] Hector Garcia-Molina,et al. Web Spam Taxonomy , 2005, AIRWeb.
[95] Nello Cristianini,et al. Composite Kernels for Hypertext Categorisation , 2001, ICML.
[96] Richard M. Everson,et al. When Are Links Useful? Experiments in Text Classification , 2003, ECIR.
[97] Zenglin Xu,et al. Web page classification with heterogeneous data fusion , 2007, WWW '07.
[98] Qiang Yang,et al. A comparison of implicit and explicit links for web page classification , 2006, WWW '06.
[99] Liming Chen,et al. WebGuard: Web based adult content detection and filtering system , 2003, Proceedings IEEE/WIC International Conference on Web Intelligence (WI 2003).
[100] Giuseppe Attardi,et al. Automatic Web Page Categorization by Link and Context Analysis , 1999 .
[101] Mounia Lalmas,et al. A probabilistic description-oriented approach for categorizing web documents , 1999, CIKM '99.
[102] Rayid Ghani,et al. Combining labeled and unlabeled data for text classification with a large number of categories , 2001, Proceedings 2001 IEEE International Conference on Data Mining.
[103] Mika Käki,et al. Findex: search result categories help users when document ranking fails , 2005, CHI.
[104] M. Indra Devi,et al. Feature Selection for Web Page Classification , 2009 .
[105] Soumen Chakrabarti,et al. Mining the web - discovering knowledge from hypertext data , 2002 .
[106] Shui-Lung Chuang,et al. Taxonomy generation for text segments: A practical web-based approach , 2005, TOIS.
[107] Brian D. Davison,et al. Classifiers without borders: incorporating fielded text from neighboring web pages , 2008, SIGIR '08.
[108] Xiaogang Peng,et al. Automatic web page classification in a dynamic and hierarchical way , 2002, 2002 IEEE International Conference on Data Mining, 2002. Proceedings..
[109] Taher H. Haveliwala. Topic-Sensitive PageRank: A Context-Sensitive Ranking Algorithm for Web Search , 2003, IEEE Trans. Knowl. Data Eng..
[110] Rayid Ghani,et al. Combining Labeled and Unlabeled Data for MultiClass Text Categorization , 2002, ICML.
[111] Michael J. Pazzani,et al. Syskill & Webert: Identifying Interesting Web Sites , 1996, AAAI/IAAI, Vol. 1.
[112] P. Schmitz,et al. Inducing Ontology from Flickr Tags , 2006 .
[113] Andrei Z. Broder,et al. Robust classification of rare queries using web knowledge , 2007, SIGIR.
[114] George Karypis,et al. A Comparison of Document Clustering Techniques , 2000 .
[115] Natalie S. Glance,et al. Community search assistant , 2001, IUI '01.
[116] Byoung-Tak Zhang,et al. Large Scale Unstructured Document Classification Using Unlabeled Data and Syntactic Information , 2003, PAKDD.
[117] Thomas Hofmann,et al. Hierarchical document categorization with support vector machines , 2004, CIKM '04.
[118] Brian D. Davison,et al. Topical link analysis for web search , 2006, SIGIR.
[119] Fabrizio Sebastiani,et al. A Tutorial on Automated Text Categorisation , 2000 .
[120] David M. Pennock,et al. Inferring hierarchical descriptions , 2002, CIKM '02.
[121] Daphne Koller,et al. Support Vector Machine Active Learning with Applications to Text Classification , 2000, J. Mach. Learn. Res..
[122] W. Bruce Croft,et al. Hierarchical Language Models for Expert Finding in Enterprise Corpora , 2006, 2006 18th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'06).
[123] Larry Fitzpatrick,et al. Automatic feedback using past queries: social searching? , 1997, SIGIR '97.
[124] Qiang Yang,et al. Reinforcing Web-object Categorization Through Interrelationships , 2006, Data Mining and Knowledge Discovery.
[125] Tom M. Mitchell,et al. Discovering Test Set Regularities in Relational Domains , 2000, ICML.
[126] Thomas Hofmann,et al. The Cluster-Abstraction Model: Unsupervised Learning of Topic Hierarchies from Text Data , 1999, IJCAI.
[127] Steffen Staab,et al. Comparing ontologies - similarity measures and a comparison study , 2001 .
[128] Simone Paolo Ponzetto,et al. Deriving a Large-Scale Taxonomy from Wikipedia , 2007, AAAI.
[129] Bettina Berendt,et al. Tags are not metadata, but "just more content" - to some people , 2007, ICWSM.
[130] Brian D. Davison,et al. Bridging link and query intent to enhance web search , 2011, HT '11.
[131] Simone Paolo Ponzetto,et al. Large-Scale Taxonomy Mapping for Restructuring and Integrating Wikipedia , 2009, IJCAI.
[132] Subhash C. Bagui,et al. Combining Pattern Classifiers: Methods and Algorithms , 2005, Technometrics.
[133] Ah-Hwee Tan,et al. Text Mining: The state of the art and the challenges , 2000 .
[134] Brian D. Davison,et al. Measuring similarity to detect qualified links , 2007, AIRWeb '07.
[135] Svetlana Kiritchenko,et al. Hierarchical text categorization and its application to bioinformatics , 2006 .
[136] Gerhard Weikum,et al. Graph-based text classification: learn from your neighbors , 2006, SIGIR.
[137] Yihong Gong,et al. Multi-labelled classification using maximum entropy method , 2005, SIGIR '05.
[138] Evgeniy Gabrilovich,et al. Feature Generation for Text Categorization Using World Knowledge , 2005, IJCAI.
[139] Thorsten Joachims,et al. WebWatcher: A Learning Apprentice for the World Wide Web , 1995 .
[140] Vaughan R. Shanks,et al. Fast categorisation of large document collections , 2001, Proceedings Eighth Symposium on String Processing and Information Retrieval.
[141] Raghu Krishnapuram,et al. Automatic Taxonomy Generation: Issues and Possibilities , 2003, IFSA.
[142] Lise Getoor,et al. Link-Based Classification , 2003, Encyclopedia of Machine Learning and Data Mining.
[143] Stefan Siersdorfer,et al. A neighborhood-based approach for clustering of linked document collections , 2006, CIKM '06.
[144] Maarten de Rijke,et al. Learning to Recognize Blogs: A Preliminary Exploration , 2006 .
[145] Maarten de Rijke,et al. Finding experts and their details in e-mail corpora , 2006, WWW '06.
[146] Thomas Hofmann,et al. Probabilistic Latent Semantic Analysis , 1999, UAI.
[147] Yiming Yang,et al. A scalability analysis of classifiers in text categorization , 2003, SIGIR.
[148] Tong Zhang,et al. Linear prediction models with graph regularization for web-page categorization , 2006, KDD '06.
[149] Joseph Kaye,et al. Understanding how bloggers feel: recognizing affect in blog posts , 2006, CHI Extended Abstracts.
[150] Filippo Menczer,et al. Mapping the semantics of Web text and links , 2005, IEEE Internet Computing.
[151] Grace Hui Yang,et al. Web-based List Question Answering , 2004, COLING.
[152] Abdur Chowdhury,et al. Using titles and category names from editor-driven taxonomies for automatic evaluation , 2003, CIKM '03.
[153] Qiang Yang,et al. Exploiting the hierarchical structure for link analysis , 2005, SIGIR '05.
[154] Wei-Ying Ma,et al. OCFS: optimal orthogonal centroid feature selection for text categorization , 2005, SIGIR '05.
[155] Jong-Hyeok Lee,et al. Web page classification based on k-nearest neighbor approach , 2000, IRAL '00.
[156] Paul Clough,et al. Automatically organising images using concept hierarchies , 2005 .
[157] Gerd Stumme,et al. Formal Concept Analysis: foundations and applications , 2005 .
[158] David Carmel,et al. The connectivity sonar: detecting site functionality by structural patterns , 2003, HYPERTEXT '03.
[159] David M. Pennock,et al. Using web structure for classifying and describing web pages , 2002, WWW.
[160] Shui-Lung Chuang,et al. Liveclassifier: creating hierarchical text classifiers through web corpora , 2004, WWW '04.
[161] Yiming Yang,et al. Support vector machines classification with a very large-scale taxonomy , 2005, SKDD.
[162] R. A. Fisher,et al. Statistical Tables for Biological, Agricultural and Medical Research , 1956 .
[163] Thorsten Joachims,et al. Making large-scale support vector machine learning practical , 1999 .
[164] Yasuhiro Suzuki,et al. Automatically collecting, monitoring, and mining japanese weblogs , 2004, WWW Alt. '04.
[165] W. Bruce Croft,et al. Deriving concept hierarchies from text , 1999, SIGIR '99.
[166] George A. Miller,et al. WordNet: A Lexical Database for English , 1995, HLT.
[167] Susan T. Dumais,et al. Hierarchical classification of Web content , 2000, SIGIR '00.
[168] Fabrizio Silvestri,et al. Know your neighbors: web spam detection using the web topology , 2007, SIGIR.
[169] Brian D. Davison. Topical locality in the Web , 2000, SIGIR '00.
[170] Dunja Mladenic,et al. Turning Yahoo to Automatic Web-Page Classifier , 1998, European Conference on Artificial Intelligence.
[171] Evgeniy Gabrilovich,et al. Overcoming the Brittleness Bottleneck using Wikipedia: Enhancing Text Categorization with Encyclopedic Knowledge , 2006, AAAI.
[172] Brian D. Davison,et al. Web page classification: Features and algorithms , 2009, CSUR.
[173] Tom M. Mitchell,et al. Learning to Extract Symbolic Knowledge from the World Wide Web , 1998, AAAI/IAAI.
[174] Johannes Fürnkranz,et al. Link-Local Features for Hypertext Classification , 2005, EWMF/KDO.
[175] Oren Kurland,et al. PageRank without hyperlinks: structural re-ranking using links induced by language models , 2005, SIGIR '05.
[176] Leo Breiman,et al. Classification and Regression Trees , 1984 .
[177] Brian D. Davison,et al. Knowing a web page by the company it keeps , 2006, CIKM '06.
[178] Csaba Veres,et al. The Language of Folksonomies: What Tags Reveal About User Classification , 2006, NLDB.
[179] Koraljka Golub,et al. Importance of HTML Structural Elements and Metadata in Automated Subject Classification , 2005, ECDL.
[180] Wolfgang Nejdl,et al. Utility analysis for topically biased PageRank , 2007, WWW '07.
[181] Grigorios Tsoumakas,et al. Multi-Label Classification: An Overview , 2007, Int. J. Data Warehous. Min..
[182] Philip Resnik,et al. Semantic Similarity in a Taxonomy: An Information-Based Measure and its Application to Problems of Ambiguity in Natural Language , 1999, J. Artif. Intell. Res..
[183] Shourya Roy,et al. A hierarchical monothetic document clustering algorithm for summarization and browsing search results , 2004, WWW '04.
[184] Brian D. Davison. The potential of the metasearch engine , 2005, ASIST.
[185] Benno Stein,et al. Genre Classification of Web Pages , 2004, KI.
[186] Yiming Yang,et al. Hypertext Categorization using Hyperlink Patterns and Meta Data , 2001, ICML.
[187] Hector Garcia-Molina,et al. Collaborative Creation of Communal Hierarchical Taxonomies in Social Tagging Systems , 2006 .
[188] A. Muller,et al. The TaxGen framework: automating the generation of a taxonomy for a large document collection , 1999, Proceedings of the 32nd Annual Hawaii International Conference on Systems Sciences. 1999. HICSS-32. Abstracts and CD-ROM of Full Papers.
[189] Dunja Mladenic,et al. Text-learning and related intelligent agents: a survey , 1999, IEEE Intell. Syst..
[190] Johannes Fürnkranz,et al. Hyperlink ensembles: a case study in hypertext classification , 2002, Inf. Fusion.
[191] Susan T. Dumais,et al. The Combination of Text Classifiers Using Reliability Indicators , 2016, Information Retrieval.
[192] Ophir Frieder,et al. Using manually-built web directories for automatic evaluation of known-item retrieval , 2003, SIGIR.
[193] Brian D. Davison,et al. Diversifying Search Results with Popular Subtopics , 2009, TREC.
[194] Yihong Gong,et al. Combining content and link for classification using matrix factorization , 2007, SIGIR.
[195] Fabrizio Sebastiani,et al. Machine learning in automated text categorization , 2001, CSUR.
[196] Evgeniy Gabrilovich,et al. Text categorization with many redundant features: using aggressive feature selection to make SVMs competitive with C4.5 , 2004, ICML.
[197] Stephen E. Robertson,et al. Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval , 1994, SIGIR '94.
[198] Kiyoshi Nitta,et al. Improving taxonomies for large-scale hierarchical classifiers of web documents , 2010, CIKM.
[199] Oren Kurland,et al. Respect my authority!: HITS without hyperlinks, utilizing cluster-based language models , 2006, SIGIR.