Alignment versus variation methods for clustering microarray time-series data

In the past few years, it has been shown that traditional clustering methods do not necessarily perform well on time-series data because of the temporal relationships involved in such data, which makes clustering these data particularly difficult. In this paper, we compare two recently introduced clustering methods designed specifically for gene expression time-series data, namely the multiple-alignment (MA) and variation-based co-expression detection (VCD) clustering approaches. Both approaches are based on a transformation of the data that takes the temporal relationships into account, and both have been shown to detect groups of co-expressed genes effectively. We investigate the performance of the MA and VCD approaches on two microarray time-series data sets and discuss their strengths and weaknesses. Our experiments show that MA is more accurate than VCD at finding groups of co-expressed genes.
