A comparison of biclustering algorithms

In recent years, microarray technologies have become a central tool in biological research, and various techniques have been used to extract useful biological information from microarray data. Identifying groups of genes with similar expression patterns plays an important role in gene analysis. The primary techniques are clustering and biclustering. Biclustering is increasingly preferred over classical clustering for analyzing biological datasets because it groups genes and conditions simultaneously. It is used in a number of applications to group genes across specified conditions, mainly for identifying sets of coregulated genes, tissue classification, and similar tasks. Gene Ontology is another important area of application, where biclusters are used to infer the class of non-annotated genes. Gene Ontology is a standard way of representing genes and their product attributes across different species and databases, and the Gene Ontology database can be used to annotate and analyze a large number of genes. Typical annotations for an analyzed list of genes can be explored using the BicAT and BiVisu toolboxes, which provide a graphical platform for comparing different biclustering algorithms. This paper compares different biclustering approaches used to analyze carcinoma and DLBCL (diffuse large B-cell lymphoma) microarray datasets. The algorithms are compared on the basis of enrichment values, supported by runtime analysis. The paper explains in detail the biclusters associated with each algorithm and the factors affecting the enrichment values, and identifies the best-performing biclustering technique for these datasets.
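The abstract does not state how enrichment values are computed. As a minimal sketch, assuming enrichment is scored with a one-sided hypergeometric test (a common choice for Gene Ontology over-representation, though not confirmed as the method used here), the p-value for a single bicluster and a single GO term could be computed as follows. The function name `enrichment_p_value` and all parameter names are hypothetical, introduced only for illustration.

```python
from math import comb


def enrichment_p_value(population: int, annotated: int,
                       bicluster: int, hits: int) -> float:
    """One-sided hypergeometric tail P(X >= hits).

    Assumed setup (not taken from the paper):
      population: number of genes on the array
      annotated:  genes in the population carrying the GO term
      bicluster:  number of genes in the bicluster
      hits:       bicluster genes carrying the GO term
    """
    # The overlap cannot exceed either the bicluster or the annotated set.
    upper = min(annotated, bicluster)
    # Number of ways to draw `bicluster` genes from the population.
    total = comb(population, bicluster)
    # Sum the probability mass for every overlap of at least `hits` genes.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, bicluster - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Illustrative numbers only; they are not results from the datasets above.
p = enrichment_p_value(population=6000, annotated=120, bicluster=40, hits=8)
print(f"enrichment p-value: {p:.3e}")
```

Under this assumption, a smaller p-value means the GO term is more over-represented in the bicluster than chance would predict, so biclusters from different algorithms can be ranked by their best term or by the fraction of biclusters falling below a chosen significance threshold.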
