Paper information - Mining Positive and Negative Co-regulation Patterns from Microarray Data

Pattern-based and tendency-based models are currently popular for clustering co-regulated genes. In this paper, we propose a novel model, g-Cluster, with the following advantages: (1) it finds positively and negatively co-regulated genes at once; (2) it is not restricted to magnitude transformation relationships among genes; and (3) it guarantees the quality of clusters and the significance of regulations by means of a novel similarity measurement, gCode, and two user-specified thresholds, the wave constraint threshold and the regulation threshold. We also design a novel tree-based clustering algorithm, FBTD, which is combined with efficient pruning rules to identify all maximal g-Clusters. Extensive experiments on real and synthetic datasets show that (1) our algorithm effectively and efficiently finds a number of co-regulated gene clusters that previous models miss and that are potentially of high biological significance, and (2) our algorithm is superior to existing approaches.
