PCluster: Probabilistic Agglomerative Clustering of Gene Expression Profiles

A central problem in the analysis of gene expression data is clustering genes with similar expression profiles. In this paper, I describe a hierarchical clustering procedure that is based on a simple probabilistic model. This procedure clusters genes with respect to a target classification of conditions, so that genes that are expressed similarly in each group of conditions are clustered together.
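
To make the idea concrete, here is a minimal sketch of agglomerative clustering guided by a classification of conditions. The abstract does not specify the probabilistic model, so everything below is an assumption made for illustration: each cluster models the expression values in each condition class with a Gaussian of fixed unit variance, and the greedy step merges the pair of clusters whose merge loses the least log-likelihood (equivalently, adds the least within-class squared error). The names `class_stats`, `sse`, and `agglomerate` are hypothetical and are not taken from the paper.

```python
import numpy as np

def class_stats(X, labels):
    """Per-gene sufficient statistics for each condition class.

    X is a (genes x conditions) matrix; labels assigns a class to each condition.
    Returns counts per class, and per-gene sums and sums of squares per class.
    """
    classes = np.unique(labels)
    n = np.array([(labels == c).sum() for c in classes], dtype=float)
    s = np.stack([X[:, labels == c].sum(axis=1) for c in classes], axis=1)
    q = np.stack([(X[:, labels == c] ** 2).sum(axis=1) for c in classes], axis=1)
    return n, s, q

def sse(n, s, q):
    """Squared error around the per-class means, summed over classes.

    Under the assumed unit-variance Gaussian, the maximized log-likelihood
    equals -0.5 * sse plus a term that does not change when clusters merge.
    """
    return float(np.sum(q - s ** 2 / n))

def agglomerate(X, labels, k):
    """Greedily merge gene clusters until k remain (assumed stopping rule)."""
    n_c, S, Q = class_stats(X, labels)
    clusters = [
        {"genes": [i], "n": n_c.copy(), "s": S[i].copy(), "q": Q[i].copy()}
        for i in range(X.shape[0])
    ]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                A, B = clusters[a], clusters[b]
                merged = sse(A["n"] + B["n"], A["s"] + B["s"], A["q"] + B["q"])
                cost = merged - sse(A["n"], A["s"], A["q"]) - sse(B["n"], B["s"], B["q"])
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        B = clusters.pop(b)  # b > a, so index a is still valid
        A = clusters[a]
        A["genes"] += B["genes"]
        A["n"] += B["n"]
        A["s"] += B["s"]
        A["q"] += B["q"]
    return [sorted(c["genes"]) for c in clusters]
```

In this sketch, two genes are cheap to merge when their mean expression agrees within every class of conditions, which mirrors the stated goal that genes expressed similarly in each group of conditions end up together. The fixed variance and the fixed target number of clusters `k` are simplifications; a fuller probabilistic treatment could score merges differently.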
