Discovering non-exclusive functional modules from gene expression data

Biological processes are not independent of one another, because a gene can participate in several processes; a gene should therefore be allowed to belong to more than one bicluster. Similarly, a particular type of disease usually involves more than one gene. Biclustering can associate clusters with gene arrangement patterns while preserving genomic information, and the ability to overlap is desirable for discovering multiple conserved patterns within a single genome. In strict (crisp) partition-based biclustering, each gene or condition belongs to exactly one functional module, whereas some biological questions require partitioning methods that yield non-exclusive functional modules. The proposed method introduces a novel strategy for discovering such non-exclusive, pattern-based biclusters using a fuzzy set approach. We evaluated the proposed model against a few existing methods, and the results suggest that it is suitable for genomes with high genetic exchange and various conserved gene arrangements in gene regulatory networks.
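The difference between crisp and non-exclusive module assignment can be illustrated with a small sketch. It is not the method proposed here: it assumes fuzzy c-means style memberships computed from a hypothetical gene-to-module distance matrix, and the function names, the fuzzifier `m`, and the membership threshold of 0.3 are all illustrative choices.

```python
import numpy as np

def fuzzy_memberships(distances, m=2.0):
    """Fuzzy c-means style memberships from a (genes x modules) distance matrix.

    Assumption: memberships are proportional to d^(-2/(m-1)) and each row sums to 1.
    """
    d = np.maximum(distances, 1e-12)          # guard against division by zero
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def crisp_assignment(u):
    """Crisp partition: each gene belongs to exactly one module (highest membership)."""
    return [{int(np.argmax(row))} for row in u]

def overlapping_assignment(u, threshold=0.3):
    """Non-exclusive assignment: a gene joins every module whose membership
    reaches the (hypothetical) threshold."""
    return [set(np.flatnonzero(row >= threshold).tolist()) for row in u]

# Toy data (made up): distances of 3 genes to 2 module prototypes.
distances = np.array([
    [0.1, 2.0],   # gene 0 is close to module 0
    [1.0, 1.0],   # gene 1 is equally close to both modules
    [2.0, 0.1],   # gene 2 is close to module 1
])

u = fuzzy_memberships(distances)
crisp = crisp_assignment(u)
overlap = overlapping_assignment(u)

for gene, (c, o) in enumerate(zip(crisp, overlap)):
    print(f"gene {gene}: crisp={sorted(c)} overlapping={sorted(o)}")
```

In this toy example, gene 1 is forced into a single module under the crisp rule but is kept in both modules under the thresholded fuzzy rule, which is the kind of non-exclusive membership the abstract argues for.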
