Improving Researcher Homepage Classification with Unlabeled Data
Cornelia Caragea | C. Lee Giles | Prasenjit Mitra | Sujatha Das Gollapalli
[1] Zhi-Hua Zhou,et al. The Influence of Class Imbalance on Cost-Sensitive Learning: An Empirical Study , 2006, Sixth International Conference on Data Mining (ICDM'06).
[2] Yang Song,et al. CiteSeerχ: a scalable autonomous scientific digital library , 2006, InfoScale '06.
[3] Maria-Florina Balcan,et al. Co-Training and Expansion: Towards Bridging Theory and Practice , 2004, NIPS.
[4] Gideon S. Mann,et al. Learning from labeled features using generalized expectation criteria , 2008, SIGIR '08.
[5] Jie Tang,et al. ArnetMiner: extraction and mining of academic social networks , 2008, KDD.
[6] Sebastian Thrun,et al. Text Classification from Labeled and Unlabeled Documents using EM , 2000, Machine Learning.
[7] Alexander J. Smola,et al. Like like alike: joint friendship and interest propagation in social networks , 2011, WWW.
[8] Ian Witten,et al. Data Mining , 2000 .
[9] John Langford,et al. Hash Kernels , 2009, AISTATS.
[10] Christopher D. Manning,et al. Introduction to Information Retrieval , 2010, J. Assoc. Inf. Sci. Technol.
[11] Idit Keidar,et al. Do not crawl in the dust: different urls with similar text , 2006, WWW '07.
[12] Cornelia Caragea,et al. Combining Hashing and Abstraction in Sparse High Dimensional Feature Spaces , 2012, AAAI.
[13] Ian H. Witten,et al. The WEKA data mining software: an update , 2009, SIGKDD Explorations.
[14] Monika Henzinger,et al. A Comprehensive Study of Features and Algorithms for URL-Based Topic Classification , 2011, TWEB.
[15] Ian H. Witten,et al. Chapter 1 – What's It All About? , 2011 .
[16] Hector Garcia-Molina,et al. Efficient Crawling Through URL Ordering , 1998, Comput. Networks.
[17] Kilian Q. Weinberger,et al. Feature hashing for large scale multitask learning , 2009, ICML '09.
[18] Lorenzo Blanco,et al. Highly efficient algorithms for structural clustering of large websites , 2011, WWW.
[19] Ben Taskar,et al. Posterior Regularization for Structured Latent Variable Models , 2010, J. Mach. Learn. Res.
[20] Andrew McCallum,et al. A comparison of event models for naive bayes text classification , 1998, AAAI 1998.
[21] Vladimir N. Vapnik,et al. The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.
[22] Christopher M. Bishop,et al. Pattern Recognition and Machine Learning (Information Science and Statistics) , 2006 .
[23] M. de Rijke,et al. Broad expertise retrieval in sparse data environments , 2007, SIGIR.
[24] อนิรุธ สืบสิงห์,et al. Data Mining Practical Machine Learning Tools and Techniques , 2014 .
[25] Jun Du,et al. When Does Cotraining Work in Real Data? , 2011, IEEE Transactions on Knowledge and Data Engineering.
[26] Ulf Brefeld,et al. Co-EM support vector learning , 2004, ICML.
[27] Jiawei Han,et al. PEBL: Web page classification without negative examples , 2004, IEEE Transactions on Knowledge and Data Engineering.
[28] Xiaojin Zhu,et al. --1 CONTENTS , 2006 .
[29] Chih-Jen Lin,et al. LIBSVM: A library for support vector machines , 2011, TIST.
[30] S. Lawrence. Free online availability substantially increases a paper's impact , 2001, Nature.
[31] Rayid Ghani,et al. Combining Labeled and Unlabeled Data for MultiClass Text Categorization , 2002, ICML.
[32] Avrim Blum,et al. The Bottleneck , 2021, Monopsony Capitalism.
[33] Ian H. Witten,et al. Data Mining: Practical Machine Learning Tools and Techniques, 3/E , 2014 .
[34] Ohad Shamir,et al. Optimal Distributed Online Prediction Using Mini-Batches , 2010, J. Mach. Learn. Res.
[35] Cornelia Caragea,et al. Researcher homepage classification using unlabeled data , 2013, WWW.
[36] David R. Karger,et al. Using urls and table layout for web classification tasks , 2004, WWW '04.
[37] Yixin Chen,et al. Automatic Feature Decomposition for Single View Co-training , 2011, ICML.
[38] Philip S. Yu,et al. A General Model for Multiple View Unsupervised Learning , 2008, SDM.
[39] Martin van den Berg,et al. Focused Crawling: A New Approach to Topic-Specific Web Resource Discovery , 1999, Comput. Networks.
[40] D K Smith,et al. Numerical Optimization , 2001, J. Oper. Res. Soc.
[41] George Forman,et al. Extremely fast text feature extraction for classification and indexing , 2008, CIKM '08.
[42] David Yarowsky,et al. Unsupervised Word Sense Disambiguation Rivaling Supervised Methods , 1995, ACL.
[43] José Luis Ortega,et al. Longitudinal Study of Contents and Elements in the Scientific Web environment , 2006 .
[44] Rayid Ghani,et al. Analyzing the effectiveness and applicability of co-training , 2000, CIKM '00.
[45] Hema Swetha Koppula,et al. Learning URL patterns for webpage de-duplication , 2010, WSDM '10.
[46] Christiane Fellbaum,et al. Book Reviews: WordNet: An Electronic Lexical Database , 1999, CL.
[47] Yuxin Wang,et al. Web Page Classification Exploiting Contents of Surrounding Pages for Building a High-Quality Homepage Collection , 2006, ICADL.
[48] Andrew McCallum,et al. Using Maximum Entropy for Text Classification , 1999 .
[49] Brian D. Davison,et al. Web page classification: Features and algorithms , 2009, CSUR.
[50] Mikhail Belkin,et al. A Co-Regularization Approach to Semi-supervised Learning with Multiple Views , 2005 .
[51] Cornelia Caragea,et al. On identifying academic homepages for digital libraries , 2011, JCDL '11.
[52] Trevor Darrell,et al. Multi-View Learning in the Presence of View Disagreement , 2008, UAI 2008.
[53] Cornelia Caragea,et al. Automatic Identification of Research Articles from Crawled Documents , 2014, WSDM 2014.
[54] Bernhard Schölkopf,et al. New Support Vector Algorithms , 2000, Neural Computation.
[55] Yiming Yang,et al. A Comparative Study on Feature Selection in Text Categorization , 1997, ICML.
[56] Nello Cristianini,et al. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods , 2000 .
[57] George A. Miller,et al. WordNet: A Lexical Database for English , 1995, HLT.
[58] Min-Yen Kan,et al. Fast webpage classification using URL features , 2005, CIKM '05.
[59] Gideon S. Mann,et al. Generalized Expectation Criteria for Semi-Supervised Learning with Weakly Labeled Data , 2010, J. Mach. Learn. Res.
[60] Thorsten Joachims,et al. Transductive Inference for Text Classification using Support Vector Machines , 1999, ICML.
[61] Sebastian Thrun,et al. Learning to Classify Text from Labeled and Unlabeled Documents , 1998, AAAI/IAAI.
[62] Geert-Jan Houben,et al. Information Retrieval in Distributed Hypertexts , 1994, RIAO.
[63] George Forman,et al. An Extensive Empirical Study of Feature Selection Metrics for Text Classification , 2003, J. Mach. Learn. Res.