A visual analytics approach for understanding biclustering results from microarray data

Background

Microarray analysis is an important area of bioinformatics. In the last few years, biclustering has become one of the most popular methods for classifying data from microarrays. Although biclustering can be applied to any kind of classification problem, it is currently used mostly for microarray data classification. A large number of biclustering algorithms have been developed over the years; however, little effort has been devoted to the representation of the results.

Results

We present an interactive framework that helps to infer differences or similarities between biclustering results, to unravel trends, and to highlight robust groupings of genes and conditions. These linked representations of biclusters can complement biological analysis and reduce the time specialists spend interpreting the results. Within the framework, besides other standard representations, we present a visualization technique based on a force-directed graph in which biclusters are represented as flexible, overlapping groups of genes and conditions. This microarray analysis framework (BicOverlapper) is available at http://vis.usal.es/bicoverlapper.

Conclusion

The main visualization technique, tested with different biclustering results on a real dataset, allows researchers to extract interesting features of the biclustering results, especially by highlighting overlapping zones, which usually represent robust groups of genes and/or conditions. The visual analytics methodology will permit biology experts to study biclustering results without inspecting an overwhelming number of biclusters individually.
