Associative Clustering

Clustering by maximizing the dependency between two paired, continuous-valued multivariate data sets is studied. The new method, associative clustering (AC), maximizes a Bayes factor between two clustering models that differ in only one respect: whether the clusterings of the two data sets are dependent or independent. The model both extends Information Bottleneck (IB)-type dependency modeling to continuous-valued data and offers it a well-founded, asymptotically well-behaved criterion for small data sets: with suitable prior assumptions the Bayes factor becomes equivalent to the hypergeometric probability of a contingency table, while for large data sets it approaches the standard mutual information. An optimization algorithm is introduced, with empirical comparisons to a combination of IB and K-means, and to plain K-means. Two case studies cluster genes (1) to find dependencies between gene expression and transcription factor binding, and (2) to find dependencies between expression in different organisms.
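
As a minimal sketch of the claimed small-sample/large-sample connection (not the authors' implementation; all function names here are hypothetical), the Python snippet below cross-tabulates two clusterings of the same samples, evaluates the log hypergeometric probability of the resulting contingency table, and compares it with the empirical mutual information. By Stirling's approximation, -log P(table) ≈ N · I(X;Y) for large N, which is why the Bayes-factor criterion reduces to mutual information in the limit.

```python
import numpy as np
from scipy.special import gammaln

def log_hypergeometric(table):
    """Log-probability of a contingency table with fixed margins
    (generalized Fisher exact / hypergeometric distribution):
    log [ prod_i n_i.! * prod_j n_.j! / (N! * prod_ij n_ij!) ]."""
    table = np.asarray(table, dtype=float)
    n = table.sum()
    rows = table.sum(axis=1)
    cols = table.sum(axis=0)
    return (gammaln(rows + 1).sum() + gammaln(cols + 1).sum()
            - gammaln(n + 1) - gammaln(table + 1).sum())

def mutual_information(table):
    """Empirical mutual information (in nats) of the joint
    distribution defined by the normalized contingency table."""
    p = np.asarray(table, dtype=float)
    p /= p.sum()
    px = p.sum(axis=1, keepdims=True)
    py = p.sum(axis=0, keepdims=True)
    nz = p > 0  # skip empty cells, where p log p -> 0
    return (p[nz] * np.log(p[nz] / (px @ py)[nz])).sum()

# Toy data: two dependent clusterings of the same N samples
# (stand-ins for the clusterings of the two paired data sets).
rng = np.random.default_rng(0)
N = 10_000
x = rng.integers(0, 4, N)                  # clustering of data set 1
y = (x + (rng.random(N) < 0.3)) % 4        # noisy copy: clustering of data set 2

table = np.zeros((4, 4))
np.add.at(table, (x, y), 1)                # cross-tabulate the two clusterings

# For large N, -log P(table) / N should approach I(X;Y) (up to
# O(log N / N) terms), matching the abstract's asymptotic claim.
print(-log_hypergeometric(table) / N, mutual_information(table))
```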
