A Novel Clustering and Verification Based Microarray Data Bi-clustering Method

Biclustering of microarray data is important for research on gene regulatory mechanisms, because genes that exhibit similar patterns are often functionally related. This paper proposes a novel bicluster detection method. It uses an existing traditional clustering algorithm, such as K-means, as an intermediate tool to cluster submatrices created from the original data matrix. To reduce memory requirements, avoid useless clustering passes, and accelerate bicluster detection, a combined clustering and verification algorithm is applied. The clustering step identifies the rows in which possible biclusters may lie, while the verification step efficiently speeds up detection. Based on a characteristic of biclusters, the biclusters are detected one by one. Experiments on simulated data are presented at the end of the paper.
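The abstract does not give the algorithm's details, so the following is only an illustrative sketch of the general clustering-then-verification idea, not the authors' method. It assumes a constant-value bicluster model, a submatrix formed from two hand-picked columns, and a tolerance-based column check; the function names (`kmeans`, `verify_columns`, `find_biclusters`) and the parameters `tol` and `min_cols` are hypothetical.

```python
def kmeans(points, k, iters=20):
    """Tiny deterministic K-means (assumption: centroids start at
    evenly spaced points of the sorted input, for reproducibility)."""
    ordered = sorted(points)
    step = max(1, len(ordered) // k)
    centroids = [ordered[min(i * step, len(ordered) - 1)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def verify_columns(data, rows, tol):
    """Verification step (assumed form): keep the columns on which the
    candidate rows agree to within `tol`."""
    n_cols = len(data[0])
    return [
        c for c in range(n_cols)
        if max(data[r][c] for r in rows) - min(data[r][c] for r in rows) <= tol
    ]


def find_biclusters(data, cols, k, tol, min_cols):
    """Cluster rows on the submatrix data[:, cols], then verify each
    candidate row group against every column of the full matrix."""
    sub = [[row[c] for c in cols] for row in data]
    labels = kmeans(sub, k)
    groups = {}
    for i, lab in enumerate(labels):
        groups.setdefault(lab, []).append(i)
    found = []
    for rows in groups.values():
        good_cols = verify_columns(data, rows, tol)
        if len(rows) > 1 and len(good_cols) >= min_cols:
            found.append((rows, good_cols))
    return found


# Simulated 6x4 matrix with a constant bicluster on rows 0, 2, 4 and columns 0, 1, 3.
data = [
    [5.0, 5.0, 0.3, 5.0],
    [1.0, 9.0, 2.1, 7.5],
    [5.0, 5.0, 8.7, 5.0],
    [0.5, 8.0, 4.4, 1.2],
    [5.0, 5.0, 6.0, 5.0],
    [1.5, 8.5, 3.3, 9.9],
]
result = find_biclusters(data, cols=[0, 1], k=2, tol=0.1, min_cols=2)
print(result)
```

On this toy matrix the sketch recovers the embedded bicluster (rows 0, 2, 4 with columns 0, 1, 3) and rejects the other row group, since none of its columns pass the tolerance check. Clustering only a narrow submatrix keeps the clustering input small, and the cheap verification pass filters candidates before any further work, which is the kind of saving the abstract alludes to.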
