Term Filtering with Bounded Error
Wei Li | Juan-Zi Li | Jie Tang | Zi Yang
[1] Andreas S. Weigend,et al. A neural network approach to topic spotting, 1995.
[2] David S. Johnson. The NP-Completeness Column: An Ongoing Guide , 1986, J. Algorithms.
[3] Ruoming Jin,et al. A Topic Modeling Approach and Its Integration into the Random Walk Framework for Academic Search , 2008, 2008 Eighth IEEE International Conference on Data Mining.
[4] Nisheeth Shrivastava,et al. Graph summarization with bounded error , 2008, SIGMOD Conference.
[5] Isabelle Guyon,et al. An Introduction to Variable and Feature Selection , 2003, J. Mach. Learn. Res.
[6] Evgeniy Gabrilovich,et al. Text categorization with many redundant features: using aggressive feature selection to make SVMs competitive with C4.5 , 2004, ICML.
[7] Alexandr Andoni,et al. Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions , 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).
[8] Yiming Yang,et al. Using Corpus Statistics to Remove Redundant Words in Text Categorization , 1996, J. Am. Soc. Inf. Sci..
[9] Yiming Yang,et al. Noise reduction in a statistical approach to text categorization , 1995, SIGIR '95.
[10] Mohammad Al Hasan,et al. Clustering with Lower Bound on Similarity , 2009, PAKDD.
[11] Jie Tang,et al. ArnetMiner: extraction and mining of academic social networks , 2008, KDD.
[12] Michael I. Jordan,et al. Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res.
[13] R. Tibshirani,et al. Sparse Principal Component Analysis, 2006.
[14] Maosong Sun,et al. Scalable Term Selection for Text Categorization , 2007, EMNLP.
[15] C. J. van Rijsbergen,et al. Investigating the relationship between language model perplexity and IR precision-recall measures , 2003, SIGIR.
[16] Piotr Indyk,et al. Scalable Techniques for Clustering the Web , 2000, WebDB.
[17] Sanjay Ghemawat,et al. MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.
[18] David D. Lewis,et al. A comparison of two learning algorithms for text categorization, 1994.
[19] George Forman,et al. An Extensive Empirical Study of Feature Selection Metrics for Text Classification , 2003, J. Mach. Learn. Res.
[20] Heikki Mannila,et al. Random projection in dimensionality reduction: applications to image and text data , 2001, KDD '01.
[21] W. Bruce Croft,et al. LDA-based document models for ad-hoc retrieval , 2006, SIGIR.
[22] Yiming Yang,et al. A Comparative Study on Feature Selection in Text Categorization , 1997, ICML.
[23] Naftali Tishby,et al. Most informative dimension reduction , 2002, AAAI/IAAI.
[24] Nicole Immorlica,et al. Locality-sensitive hashing scheme based on p-stable distributions , 2004, SCG '04.
[25] Thorsten Joachims,et al. Text Categorization with Support Vector Machines: Learning with Many Relevant Features , 1998, ECML.
[26] Anirban Dasgupta,et al. Feature selection methods for text classification , 2007, KDD '07.
[27] Patrick Pantel,et al. Randomized Algorithms and NLP: Using Locality Sensitive Hash Functions for High Speed Noun Clustering , 2005, ACL.
[28] Piotr Indyk,et al. Approximate nearest neighbors: towards removing the curse of dimensionality , 1998, STOC '98.
[29] Max Welling,et al. Distributed Inference for Latent Dirichlet Allocation , 2007, NIPS.
[30] Yin Yang,et al. Query by document , 2009, WSDM '09.
[31] Bing Liu,et al. Web Page Cleaning for Web Mining through Feature Weighting , 2003, IJCAI.
[32] Alexandre d'Aspremont,et al. Optimal Solutions for Sparse Principal Component Analysis , 2007, J. Mach. Learn. Res.
[33] Moses Charikar,et al. Similarity estimation techniques from rounding algorithms , 2002, STOC '02.
[34] Sam T. Roweis,et al. EM Algorithms for PCA and SPCA , 1997, NIPS.
[35] John D. Lafferty,et al. A study of smoothing methods for language models applied to Ad Hoc information retrieval , 2001, SIGIR '01.
[36] Kunle Olukotun,et al. Map-Reduce for Machine Learning on Multicore , 2006, NIPS.