Two-way Clustering using Fuzzy ASI for Knowledge Discovery in Microarrays

This paper presents two-way clustering of microarray data using a fuzzy adaptive subspace iteration (ASI) based algorithm for knowledge discovery. It is widely believed that each gene is involved in more than one cellular function or biological process. Accordingly, the proposed fuzzy ASI assigns each gene a relevance value for each cluster, rather than a single hard assignment. The resulting clusters, interpreted as functional categories, are ranked by their potential to provide maximal separation between the two tissue classes, which is an indication of differentially expressed genes (DEGs). Empirical analyses on 100 simulated microarray datasets are used to quantify the results obtained with the fuzzy ASI algorithm. Further analyses on different microarray cancer datasets revealed several important genes that are relevant to various cancers.
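The abstract does not give the fuzzy ASI update rules, so the following is only a minimal sketch of the two ideas it names: per-cluster relevance values for each gene, and ranking clusters by how well they separate two tissue classes. It assumes fuzzy c-means style memberships as a stand-in for the paper's relevance values and a simple mean-difference score as a stand-in for its ranking criterion. The function names (`fuzzy_memberships`, `rank_clusters`), the fuzzifier `m`, and the toy data are all hypothetical and are not taken from the paper.

```python
import numpy as np

def fuzzy_memberships(X, centroids, m=2.0, eps=1e-12):
    """Fuzzy c-means style memberships (assumed stand-in for relevance values).

    X: (n_genes, n_samples) expression matrix.
    centroids: (n_clusters, n_samples) cluster prototypes.
    Returns U of shape (n_genes, n_clusters); each row sums to 1.
    """
    # Squared Euclidean distance from every gene to every centroid.
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2) + eps
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def rank_clusters(X, U, labels):
    """Rank clusters by separation of two tissue classes (assumed criterion).

    labels: (n_samples,) array of 0/1 tissue class labels.
    Returns (order, score), with order[0] the best-separating cluster.
    """
    # Membership-weighted mean expression profile of each cluster.
    profiles = (U.T @ X) / U.sum(axis=0)[:, None]
    a = profiles[:, labels == 0]
    b = profiles[:, labels == 1]
    # Absolute mean difference scaled by the within-class spread.
    score = np.abs(a.mean(axis=1) - b.mean(axis=1)) / (
        a.std(axis=1) + b.std(axis=1) + 1e-12
    )
    return np.argsort(-score), score

# Toy data: 40 genes, 6 samples (3 per tissue class).
rng = np.random.default_rng(0)
labels = np.array([0, 0, 0, 1, 1, 1])
X = rng.normal(0.0, 0.1, size=(40, 6))
X[:20, labels == 1] += 3.0  # first 20 genes are differentially expressed

# Prototypes are fixed here for brevity; a real method would estimate them.
centroids = np.stack([X[:20].mean(axis=0), X[20:].mean(axis=0)])

U = fuzzy_memberships(X, centroids)
order, score = rank_clusters(X, U, labels)
print("cluster ranking:", order)
```

Because every gene carries a membership in every cluster, a gene that participates in several processes can contribute to several cluster profiles, which is the motivation the abstract gives for a fuzzy assignment.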
