Document clustering using character N-grams: a comparative evaluation with term-based and word-based clustering
[1] Anton Leuski,et al. Evaluating document clustering for interactive information retrieval , 2001, CIKM '01.
[2] Martin Ester,et al. Frequent term-based text clustering , 2002, KDD.
[3] Ian H. Witten,et al. Data mining: practical machine learning tools and techniques with Java implementations , 2002, SIGMOD Rec..
[4] Hideki Mima,et al. Automatic recognition of multi-word terms: the C-value/NC-value method , 2000, International Journal on Digital Libraries.
[5] Zeev Volkovich,et al. Text mining with information-theoretic clustering , 2003, Comput. Sci. Eng..
[6] Evangelos E. Milios,et al. Automatic Term Extraction and Document Similarity in Special Text Corpora , 2003 .
[7] Gerard Salton,et al. A vector space model for automatic indexing , 1975, CACM.
[8] Fuchun Peng,et al. N-Gram-Based Author Profiles for Authorship Attribution , 2003 .
[9] W. B. Cavnar,et al. Using An N-Gram-Based Document Representation With A Vector Processing Retrieval Model , 1994, TREC.
[10] G. Karypis,et al. Clustering In A High-Dimensional Space Using Hypergraph Models , 2004 .
[11] Greg Hamerly,et al. Learning the k in k-means , 2003, NIPS.
[12] J. MacQueen. Some methods for classification and analysis of multivariate observations , 1967 .
[13] R. Mooney,et al. Impact of Similarity Measures on Web-page Clustering , 2000 .
[14] Evangelos E. Milios,et al. Term-Based Clustering and Summarization of Web Page Collections , 2004, Canadian Conference on AI.
[15] Eric Brill,et al. A Simple Rule-Based Part of Speech Tagger , 1992, HLT.
[16] Gerard Salton,et al. Term-Weighting Approaches in Automatic Text Retrieval , 1988, Inf. Process. Manag..
[17] M. F. Porter,et al. An algorithm for suffix stripping , 1997 .
[18] David D. Lewis,et al. Reuters-21578 Text Categorization Test Collection, Distribution 1.0 , 1997 .
[19] Hans-Peter Kriegel,et al. The R*-tree: an efficient and robust access method for points and rectangles , 1990, SIGMOD '90.
[20] Paul S. Bradley,et al. Refining Initial Points for K-Means Clustering , 1998, ICML.
[21] Hans-Peter Kriegel,et al. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise , 1996, KDD.
[22] Qiang Yang,et al. Correlation-based document clustering using web logs , 2001, Proceedings of the 34th Annual Hawaii International Conference on System Sciences.
[23] George Karypis,et al. CLUTO - A Clustering Toolkit , 2002 .
[24] Yuen Ren Chao,et al. Human Behavior and the Principle of Least Effort: An Introduction to Human Ecology , 1950 .
[25] George Karypis,et al. A Comparison of Document Clustering Techniques , 2000 .
[26] Jiawei Han,et al. Data Mining: Concepts and Techniques , 2000 .