Robust biclustering algorithm (ROBA) for DNA microarray data analysis

Recently, biclustering algorithms have been used to extract useful information from large sets of DNA microarray experimental data. They form a distinct class of clustering algorithms that cluster rows and columns simultaneously. The goal is to find submatrices, that is, subgroups of genes and subgroups of conditions, in which the genes exhibit highly correlated activities across every condition. Almost all of the methods proposed in the literature search for only one or two of the four types of bicluster. Also, most of the proposed methods rely on solving an optimization problem. Such methods therefore depend on the optimality criterion, which most of the time is likely to miss some significant biclusters. In this study, we develop a robust biclustering algorithm (ROBA) to address some of the issues mentioned above. Our algorithm is simple because it uses basic linear algebra and arithmetic tools, and there is no need to solve an optimization problem. Our algorithm is robust because it can be used to search for any type of bicluster defined by the user in a timely manner, and it is also shown to be more efficient than the algorithms proposed in the literature.
