Machine Learning in Web Mining

[1]  F ROSENBLATT,et al.  The perceptron: a probabilistic model for information storage and organization in the brain. , 1958, Psychological review.

[2]  Johannes Fürnkranz,et al.  Round Robin Classification , 2002, J. Mach. Learn. Res..

[3]  Giuseppe Attardi,et al.  Automatic Web Page Categorization by Link and Context Analysis , 1999 .

[4]  Takenobu Tokunaga,et al.  Hierarchical Bayesian Clustering for Automatic Text Classification , 1995, IJCAI.

[5]  Dawid Weiss,et al.  A concept-driven algorithm for clustering search results , 2005, IEEE Intelligent Systems.

[6]  W. Bruce Croft,et al.  Generating hierarchical summaries for web searches , 2003, SIGIR '03.

[7]  Padmini Srinivasan,et al.  Automatic Text Categorization Using Neural Networks , 1997 .

[8]  Peter Jackson,et al.  Natural language processing for online applications : text retrieval, extraction and categorization , 2002 .

[9]  S. Sathiya Keerthi,et al.  Improvements to Platt's SMO Algorithm for SVM Classifier Design , 2001, Neural Computation.

[10]  Ian H. Witten,et al.  Data mining: practical machine learning tools and techniques, 3rd Edition , 1999 .

[11]  Yiming Yang,et al.  Topic Detection and Tracking Pilot Study Final Report , 1998 .

[12]  M. F. Porter,et al.  An algorithm for suffix stripping , 1997 .

[13]  MladenicDunja Text-Learning and Related Intelligent Agents , 1999 .

[14]  Lucy T. Nowell,et al.  ThemeRiver: Visualizing Thematic Changes in Large Document Collections , 2002, IEEE Trans. Vis. Comput. Graph..

[15]  Naftali Tishby,et al.  The Power of Word Clusters for Text Classification , 2006 .

[16]  William M. Rand,et al.  Objective Criteria for the Evaluation of Clustering Methods , 1971 .

[17]  Geoff Holmes,et al.  Benchmarking Attribute Selection Techniques for Discrete Class Data Mining , 2003, IEEE Trans. Knowl. Data Eng..

[18]  Soumen Chakrabarti,et al.  Mining the web - discovering knowledge from hypertext data , 2002 .

[19]  Fabrizio Sebastiani,et al.  Machine learning in automated text categorization , 2001, CSUR.

[20]  Evgeniy Gabrilovich,et al.  Text categorization with many redundant features: using aggressive feature selection to make SVMs competitive with C4.5 , 2004, ICML.

[21]  Igor Kononenko,et al.  Estimating Attributes: Analysis and Extensions of RELIEF , 1994, ECML.

[22]  Piotr Indyk,et al.  Enhanced hypertext categorization using hyperlinks , 1998, SIGMOD '98.

[23]  Israel Ben-Shaul,et al.  Automatically Organizing Bookmarks per Contents , 1996, Comput. Networks.

[25]  George Forman,et al.  An Extensive Empirical Study of Feature Selection Metrics for Text Classification , 2003, J. Mach. Learn. Res..

[26]  Anil K. Jain,et al.  Data clustering: a review , 1999, CSUR.

[27]  Hinrich Schütze,et al.  Book Reviews: Foundations of Statistical Natural Language Processing , 1999, CL.

[28]  Charu C. Aggarwal,et al.  On the Surprising Behavior of Distance Metrics in High Dimensional Spaces , 2001, ICDT.

[29]  Vipin Kumar,et al.  Text Categorization Using Weight Adjusted k-Nearest Neighbor Classification , 2001, PAKDD.

[30]  David D. Lewis,et al.  An evaluation of phrasal and clustered representations on a text categorization task , 1992, SIGIR '92.

[31]  Yiming Yang,et al.  A Study of Approaches to Hypertext Categorization , 2002, Journal of Intelligent Information Systems.

[32]  David R. Karger,et al.  Tackling the Poor Assumptions of Naive Bayes Text Classifiers , 2003, ICML.

[33]  Thorsten Joachims,et al.  Making large scale SVM learning practical , 1998 .

[34]  David G. Stork,et al.  Pattern Classification , 1973 .

[35]  David W. Aha,et al.  Instance-Based Learning Algorithms , 1991, Machine Learning.

[36]  Hendrik Blockeel,et al.  Web mining research: a survey , 2000, SKDD.

[37]  Mirjana Ivanovic,et al.  Interactions Between Document Representation and Feature Selection in Text Categorization , 2006, DEXA.

[38]  W. B. Cavnar,et al.  N-gram-based text categorization , 1994 .

[39]  Paolo Ferragina,et al.  A personalized search engine based on Web‐snippet hierarchical clustering , 2005, WWW '05.

[40]  John C. Platt,et al.  Fast training of support vector machines using sequential minimal optimization, advances in kernel methods , 1999 .

[41]  Asunción Gómez-Pérez,et al.  Six challenges for the Semantic Web , 2002, KR 2002.

[42]  Samuel Kaski,et al.  Self organization of a massive document collection , 2000, IEEE Trans. Neural Networks Learn. Syst..

[43]  Yiming Yang,et al.  Expert network: effective and efficient learning from human decisions in text categorization and retrieval , 1994, SIGIR '94.

[44]  Yi-fang Brook Wu,et al.  Extracting Features from Web Search Returned Hits for Hierarchical Classification , 2003, IKE.

[45]  Larry A. Rendell,et al.  A Practical Approach to Feature Selection , 1992, ML.

[46]  Inderjit S. Dhillon,et al.  Enhanced word clustering for hierarchical text classification , 2002, KDD.

[47]  George Karypis,et al.  Empirical and Theoretical Comparisons of Selected Criterion Functions for Document Clustering , 2004, Machine Learning.

[48]  Andrew McCallum,et al.  A comparison of event models for naive bayes text classification , 1998, AAAI 1998.

[49]  J. Ross Quinlan,et al.  C4.5: Programs for Machine Learning , 1992 .

[50]  Rohini K. Srihari,et al.  Document Representation for One-Class SVM , 2004, ECML.

[51]  Nello Cristianini,et al.  Classification using String Kernels , 2000 .

[52]  Hyunsoo Kim,et al.  Dimension Reduction in Text Classification with Support Vector Machines , 2005, J. Mach. Learn. Res..

[53]  Shourya Roy,et al.  Fast and accurate text classification via multiple linear discriminant projections , 2003, The VLDB Journal.

[54]  Andrew McCallum,et al.  Distributional clustering of words for text classification , 1998, SIGIR '98.

[55]  Nicholas Kushmerick,et al.  Wrapper induction: Efficiency and expressiveness , 2000, Artif. Intell..

[56]  Thorsten Joachims,et al.  Text Categorization with Support Vector Machines: Learning with Many Relevant Features , 1998, ECML.

[57]  Richard Fikes,et al.  The Ontolingua Server: a tool for collaborative ontology construction , 1997, Int. J. Hum. Comput. Stud..

[58]  Pavel Berkhin,et al.  A Survey of Clustering Data Mining Techniques , 2006, Grouping Multidimensional Data.

[59]  J. Ross Quinlan,et al.  Learning logical definitions from relations , 1990, Machine Learning.

[60]  Yoshua Bengio,et al.  Inference for the Generalization Error , 1999, Machine Learning.

[61]  Wei-Ying Ma,et al.  An Evaluation on Feature Selection for Text Clustering , 2003, ICML.

[62]  David D. Lewis,et al.  Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval , 1998, ECML.

[63]  Jonathan Goldstein,et al.  When Is ''Nearest Neighbor'' Meaningful? , 1999, ICDT.

[64]  Yiming Yang,et al.  A Comparative Study on Feature Selection in Text Categorization , 1997, ICML.

[65]  Federico Girosi,et al.  An improved training algorithm for support vector machines , 1997, Neural Networks for Signal Processing VII. Proceedings of the 1997 IEEE Signal Processing Society Workshop.

[66]  Declan Butler,et al.  Souped-up search engines , 2000, Nature.

[67]  Lloyd A. Smith,et al.  Practical feature subset selection for machine learning , 1998 .

[68]  Fabrizio Sebastiani,et al.  Supervised term weighting for automated text categorization , 2003, SAC '03.

[69]  Paul Whitney,et al.  Multi-faceted insight through interoperable visual information analysis paradigms , 1998, Proceedings IEEE Symposium on Information Visualization (Cat. No.98TB100258).

[70]  Geoff Holmes,et al.  Multinomial Naive Bayes for Text Categorization Revisited , 2004, Australian Conference on Artificial Intelligence.

[71]  Rajeev Motwani,et al.  The PageRank Citation Ranking : Bringing Order to the Web , 1999, WWW 1999.

[72]  Frank van Harmelen,et al.  Web Ontology Language , 2004 .

[73]  Susan T. Dumais,et al.  Bringing order to the Web: automatically categorizing search results , 2000, CHI.

[74]  Tom M. Mitchell,et al.  Learning to Extract Symbolic Knowledge from the World Wide Web , 1998, AAAI/IAAI.

[75]  Yoav Freund,et al.  Large Margin Classification Using the Perceptron Algorithm , 1998, COLT.

[76]  Ian Horrocks,et al.  The Semantic Web: The Roles of XML and RDF , 2000, IEEE Internet Comput..

[77]  Finn V. Jensen,et al.  Bayesian Networks and Decision Graphs , 2001, Statistics for Engineering and Information Science.

[78]  Ben Shneiderman,et al.  Ordered and quantum treemaps: Making effective use of 2D space to display hierarchies , 2002, TOGS.

[79]  Ran El-Yaniv,et al.  Distributional Word Clusters vs. Words for Text Categorization , 2003, J. Mach. Learn. Res..

[80]  Richard E. Neapolitan,et al.  Learning Bayesian networks , 2007, KDD '07.

[81]  Jörg Kindermann,et al.  Text Categorization with Support Vector Machines. How to Represent Texts in Input Space? , 2002, Machine Learning.

[82]  C. M. Sperberg-McQueen,et al.  Extensible markup language , 1997 .

[83]  Inderjit S. Dhillon,et al.  Co-clustering documents and words using bipartite spectral graph partitioning , 2001, KDD '01.

[84]  George Karypis,et al.  A Comparison of Document Clustering Techniques , 2000 .

[85]  Soumen Chakrabarti,et al.  Data mining for hypertext: a tutorial survey , 2000, SKDD.

[86]  Andreas S. Weigend,et al.  Exploiting Hierarchy in Text Categorization , 1999, Information Retrieval.