A novel one-way clustering based gene expression data biclustering method

Biclustering of gene expression data is important for research on gene regulatory mechanisms, and it has also proved useful for analyzing data matrices other than gene expression data. Bicluster detection differs substantially from traditional clustering because the elements of a single bicluster may be widely scattered across the original data matrix. In this paper, a novel one-way bicluster detection method is proposed. It uses existing traditional clustering algorithms, such as K-means, as an intermediate tool to cluster the data. Based on the clustering results and a characteristic property of biclusters, the biclusters are then detected one by one. Furthermore, an efficient method for creating submatrices and tables is proposed to reduce memory usage and speed up processing. At the end of the paper, an experiment on simulated data is presented.
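The general idea of using one-way clustering as an intermediate step can be illustrated with a minimal sketch. It is not the method proposed in the paper, whose details the abstract does not give. The sketch assumes that K-means is run separately on each column, that the bicluster characteristic is a near-constant value down each column, and that the bicluster rows form exactly the same cluster in every bicluster column. The names `kmeans_1d`, `find_biclusters`, `tol`, `min_rows`, `min_cols`, and the matrix `data` are hypothetical.

```python
from collections import defaultdict


def kmeans_1d(values, k=2, iters=20):
    """Tiny 1-D K-means with deterministic, evenly spaced initial centers (k >= 2)."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each value to its nearest center (ties go to the lower index).
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels


def find_biclusters(matrix, k=2, tol=0.5, min_rows=2, min_cols=2):
    """Group columns whose one-way clusters contain exactly the same rows."""
    n_cols = len(matrix[0])
    groups = defaultdict(list)  # row set -> columns in which those rows form a cluster
    for j in range(n_cols):
        labels = kmeans_1d([row[j] for row in matrix], k)
        for c in range(k):
            rows = frozenset(i for i, lab in enumerate(labels) if lab == c)
            if len(rows) >= min_rows:
                groups[rows].append(j)

    found = []
    for rows, cols in groups.items():
        # Assumed bicluster characteristic: values are near-constant down each column.
        tight = [
            j for j in cols
            if max(matrix[i][j] for i in rows) - min(matrix[i][j] for i in rows) <= tol
        ]
        if len(tight) >= min_cols:
            found.append((sorted(rows), tight))
    return found


# Hand-made matrix with one planted bicluster: rows 0, 2, 5 and columns 1, 3.
# The bicluster rows are not adjacent, mirroring the scattered layout described above.
data = [
    [1.0, 9.0, 4.0, 7.0, 0.5],
    [5.0, 1.0, 0.0, 2.5, 6.0],
    [3.0, 9.1, 8.0, 7.1, 2.0],
    [0.0, 3.0, 2.0, 0.0, 8.0],
    [6.0, 2.0, 6.5, 1.0, 3.5],
    [2.0, 8.9, 1.0, 6.9, 9.0],
]

print(find_biclusters(data))
```

Requiring identical row sets across columns keeps the sketch short, but it is fragile: on noisy data a single misassigned row would split a bicluster, so a practical implementation would need approximate matching of row sets.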
