Biclustering of DNA Microarray Data: Theory, Evaluation, and Applications

This chapter discusses and compares biclustering algorithms for DNA microarray data analysis developed in recent years, together with their applications. Identifying biologically significant clusters of genes from microarray experimental data is a daunting task that emerged largely with the development of high-throughput technologies. Various computational and evaluation methods, based on diverse principles, have been introduced to identify new similarities among genes. The chapter highlights the mathematical aspects of the models and discusses their applications to biological problems.

Panayiotis V. Benos, University of Pittsburgh, USA
