Under Consideration for Publication in Knowledge and Information Systems
Generative Model-based Document Clustering: a Comparative Study
[1] Inderjit S. Dhillon,et al. Information theoretic clustering of sparse cooccurrence data , 2003, Third IEEE International Conference on Data Mining.
[2] Naftali Tishby,et al. Document clustering using word clusters via the information bottleneck method , 2000, SIGIR '00.
[3] Shi Zhong,et al. A Comparative Study of Generative Models for Document Clustering , 2003 .
[4] Michael McGill,et al. Introduction to Modern Information Retrieval , 1983 .
[5] A. Banerjee,et al. Frequency Sensitive Competitive Learning for Balanced Clustering on High-dimensional Hyperspheres , 2004 .
[6] Jeff A. Bilmes,et al. A gentle tutorial of the em algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models , 1998 .
[7] Inderjit S. Dhillon,et al. Concept Decompositions for Large Sparse Text Data Using Clustering , 2004, Machine Learning.
[8] Isabelle Guyon,et al. An Introduction to Variable and Feature Selection , 2003, J. Mach. Learn. Res.
[9] Byron Dom,et al. An Information-Theoretic External Cluster-Validity Measure , 2002, UAI.
[10] Piotr Indyk. A sublinear time approximation scheme for clustering in metric spaces , 1999, 40th Annual Symposium on Foundations of Computer Science (Cat. No.99CB37039).
[11] Joydeep Ghosh,et al. Frequency sensitive competitive learning for clustering on high-dimensional hyperspheres , 2002, Proceedings of the 2002 International Joint Conference on Neural Networks. IJCNN'02 (Cat. No.02CH37290).
[12] Joydeep Ghosh,et al. Frequency-sensitive competitive learning for scalable balanced clustering on high-dimensional hyperspheres , 2004, IEEE Transactions on Neural Networks.
[13] Yishay Mansour,et al. An Information-Theoretic Analysis of Hard and Soft Assignment Methods for Clustering , 1997, UAI.
[14] Inderjit S. Dhillon,et al. An information theoretic analysis of maximum likelihood mixture estimation for exponential families , 2004, ICML.
[15] Inderjit S. Dhillon,et al. Efficient Clustering of Very Large Document Collections , 2001 .
[16] Noam Slonim,et al. Maximum Likelihood and the Information Bottleneck , 2002, NIPS.
[17] Paul S. Bradley,et al. Refining Initial Points for K-Means Clustering , 1998, ICML.
[18] Jianbo Shi,et al. A Random Walks View of Spectral Segmentation , 2001, AISTATS.
[19] George Karypis,et al. A Comparison of Document Clustering Techniques , 2000 .
[20] Anil K. Jain,et al. Feature Selection in Mixture-Based Clustering , 2002, NIPS.
[21] Inderjit S. Dhillon,et al. Co-clustering documents and words using bipartite spectral graph partitioning , 2001, KDD '01.
[22] Shivakumar Vaithyanathan,et al. Model-Based Hierarchical Clustering , 2000, UAI.
[23] Kanti V. Mardia,et al. Statistics of Directional Data , 1972 .
[24] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .
[25] Henry Stark,et al. Probability, Random Processes, and Estimation Theory for Engineers , 1995 .
[26] Chris Buckley,et al. OHSUMED: an interactive retrieval evaluation and new large test collection for research , 1994, SIGIR '94.
[27] Ricardo Baeza-Yates,et al. Information Retrieval: Data Structures and Algorithms , 1992 .
[28] Edie M. Rasmussen,et al. Clustering Algorithms , 1992, Information Retrieval: Data Structures & Algorithms.
[29] R. Mooney,et al. Impact of Similarity Measures on Web-page Clustering , 2000 .
[30] Joydeep Ghosh,et al. Cluster Ensembles: A Knowledge Reuse Framework for Combining Partitionings , 2002, AAAI/IAAI.
[31] Inderjit S. Dhillon,et al. Generative model-based clustering of directional data , 2003, KDD '03.
[32] Huan Liu,et al. Feature selection for clustering - a filter solution , 2002, 2002 IEEE International Conference on Data Mining, 2002. Proceedings.
[33] Vipin Kumar,et al. WebACE: a Web agent for document categorization and exploration , 1998, AGENTS '98.
[34] John Langford,et al. An objective evaluation criterion for clustering , 2004, KDD.
[35] Padhraic Smyth,et al. A general probabilistic framework for clustering individuals and objects , 2000, KDD '00.
[36] Alejandro Murua,et al. Hierarchical model-based clustering of large datasets through fractionation and refractionation , 2002, Inf. Syst.
[37] Joydeep Ghosh,et al. A Unified Framework for Model-based Clustering , 2003, J. Mach. Learn. Res.
[38] Vipin Kumar,et al. A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs , 1998, SIAM J. Sci. Comput.
[39] George Karypis,et al. Empirical and Theoretical Comparisons of Selected Criterion Functions for Document Clustering , 2004, Machine Learning.
[40] Jianbo Shi,et al. Learning Segmentation by Random Walks , 2000, NIPS.
[41] J. MacQueen. Some methods for classification and analysis of multivariate observations , 1967 .
[42] Joydeep Ghosh,et al. Cluster Ensembles: A Knowledge Reuse Framework for Combining Multiple Partitions , 2002, J. Mach. Learn. Res.
[43] K. Mardia. Statistics of Directional Data , 1972 .
[44] Andrew McCallum,et al. A comparison of event models for naive bayes text classification , 1998, AAAI 1998.
[45] Vladimir Vapnik,et al. Statistical learning theory , 1998 .
[46] Inderjit S. Dhillon,et al. Enhanced word clustering for hierarchical text classification , 2002, KDD.
[47] A. Raftery,et al. Model-based Gaussian and non-Gaussian clustering , 1993 .
[48] George Karypis,et al. CLUTO - A Clustering Toolkit , 2002 .
[49] Marina Meila,et al. An Experimental Comparison of Model-Based Clustering Methods , 2004, Machine Learning.
[50] Pavel Berkhin,et al. A Survey of Clustering Data Mining Techniques , 2006, Grouping Multidimensional Data.
[51] Michael I. Jordan,et al. On Spectral Clustering: Analysis and an algorithm , 2001, NIPS.
[52] Santosh S. Vempala,et al. On clusterings-good, bad and spectral , 2000, Proceedings 41st Annual Symposium on Foundations of Computer Science.
[53] Inderjit S. Dhillon,et al. Iterative clustering of high dimensional text data augmented by local search , 2002, 2002 IEEE International Conference on Data Mining, 2002. Proceedings.
[54] G. Karypis,et al. Criterion Functions for Document Clustering: Experiments and Analysis , 2001 .
[55] Tom M. Mitchell,et al. Using unlabeled data to improve text classification , 2001 .
[56] Samuel Kaski,et al. Self organization of a massive document collection , 2000, IEEE Trans. Neural Networks Learn. Syst.