Efficiently Mining Time-Delayed Gene Expression Patterns

Pattern-based biclustering methods focus on grouping objects that are coherent in the same subset of dimensions. In this paper, we instead propose a novel model of coherent clustering for time-series gene expression data: the time-delayed cluster (td-cluster). Under this model, objects can be coherent in different subsets of dimensions, provided that they follow a certain time-delayed relationship. Such clusters can reveal the cycle time of gene expression, which is essential for uncovering gene regulatory networks. This paper is the first attempt to mine time-delayed gene expression patterns from microarray data. A novel algorithm is also presented and implemented to mine all significant td-clusters. Our experimental results show that 1) the td-cluster algorithm can detect a significant number of clusters that were missed by previous models, and these clusters are potentially of high biological significance, and 2) the td-cluster model and algorithm can easily be extended to 3-D gene × sample × time data sets to identify 3-D td-clusters.
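The idea of a time-delayed relationship can be illustrated with a minimal sketch. This is not the td-cluster algorithm from the paper; it only assumes, for illustration, that two genes are coherent at a delay d when the second profile at time t + d differs from the first at time t by a nearly constant offset. The function names `offset_spread` and `best_delay` and the parameter `min_overlap` are hypothetical.

```python
from typing import List, Optional, Tuple


def offset_spread(a: List[float], b: List[float], delay: int) -> float:
    """Spread (max - min) of b[t + delay] - a[t] over the overlapping time points.

    A spread of 0 means b is an exact copy of a, shifted in time by `delay`
    and in expression level by a constant offset (assumed coherence notion).
    """
    diffs = [b[t + delay] - a[t] for t in range(len(a) - delay)]
    return max(diffs) - min(diffs)


def best_delay(
    a: List[float], b: List[float], max_delay: int, min_overlap: int = 3
) -> Optional[Tuple[int, float]]:
    """Return (delay, spread) with the smallest spread, or None if no delay
    leaves at least `min_overlap` overlapping time points.

    Assumes a and b have the same length and are sampled at the same times.
    """
    best: Optional[Tuple[int, float]] = None
    for delay in range(max_delay + 1):
        if len(a) - delay < min_overlap:
            break  # too few overlapping points to judge coherence
        spread = offset_spread(a, b, delay)
        if best is None or spread < best[1]:
            best = (delay, spread)
    return best


# Toy profiles: gene_b repeats gene_a two time points later, shifted up by 10.
gene_a = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
gene_b = [12.0, 14.0, 11.0, 13.0, 12.0, 15.0]
result = best_delay(gene_a, gene_b, max_delay=3)  # (2, 0.0)
```

In this toy case the two profiles are not coherent over the same time points (delay 0 gives a spread of 3.0), but they match exactly once a delay of two time points is allowed, which is the kind of relationship a same-dimension biclustering model would miss.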
