On Biclustering of Gene Expression Data

Microarray technology enables the expression patterns of a very large number of genes to be monitored simultaneously across different experimental conditions or time points. Biclustering of microarray data is an important technique for discovering groups of genes that are co-regulated in a subset of experimental conditions. Traditional clustering algorithms group genes or conditions over the complete feature space, so they may fail to discover local patterns in which a subset of genes behaves similarly over only a subset of conditions. Biclustering algorithms aim to discover such local patterns in the gene expression matrix and can thus be thought of as clustering genes and conditions simultaneously. In recent years, a large number of biclustering algorithms have been proposed in the literature. This article studies various issues related to the biclustering problem and presents a comprehensive survey of available biclustering algorithms, together with a survey of freely available biclustering software.
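To make the idea of a local pattern concrete, the following minimal sketch plants a coherent submatrix in a synthetic expression matrix and scores it with the mean squared residue, a coherence measure introduced by Cheng and Church and widely used in biclustering. The matrix size, the chosen gene and condition indices, and the helper name `mean_squared_residue` are all assumptions made for illustration; none of them is taken from this article.

```python
import numpy as np

def mean_squared_residue(sub):
    """Mean squared residue of a submatrix (rows = genes, columns = conditions).

    A value near zero means every entry is explained by a row effect plus a
    column effect, i.e. the genes behave coherently over these conditions.
    """
    row_means = sub.mean(axis=1, keepdims=True)
    col_means = sub.mean(axis=0, keepdims=True)
    overall = sub.mean()
    residue = sub - row_means - col_means + overall
    return float((residue ** 2).mean())

# Hypothetical data: 20 genes x 10 conditions of pure noise.
rng = np.random.default_rng(0)
expr = rng.normal(size=(20, 10))

# Plant an additive (shifting) pattern on 4 genes and 3 conditions only.
genes = [2, 5, 7, 11]
conds = [1, 4, 8]
gene_shift = np.array([0.0, 1.0, 2.5, -1.0])
cond_profile = np.array([3.0, -2.0, 1.0])
expr[np.ix_(genes, conds)] = gene_shift[:, None] + cond_profile[None, :]

# Score the planted bicluster, then the same genes over all conditions.
bicluster_score = mean_squared_residue(expr[np.ix_(genes, conds)])
full_rows_score = mean_squared_residue(expr[genes, :])

print(f"bicluster (4 genes x 3 conditions): {bicluster_score:.3g}")
print(f"same genes, all 10 conditions:      {full_rows_score:.3g}")
```

The planted submatrix scores essentially zero, whereas the same genes scored over every condition do not. A method that compares genes across the complete feature space sees only the second, noisier picture, which is why such local patterns can go undetected.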
