Advances in Cluster Analysis of Microarray Data

Clustering genes into biologically meaningful groups according to their expression patterns is a central technique in microarray data analysis. It rests on the assumption that similarity in gene expression implies some form of regulatory or functional similarity. We give an overview of clustering techniques, covering conventional methods (hierarchical clustering, k-means clustering, and self-organizing maps) as well as several methods developed specifically for gene expression analysis.
