A probabilistic coevolutionary biclustering algorithm for discovering coherent patterns in gene expression dataset

Background: Biclustering has been used to find functionally important patterns in biological problems. Here, a bicluster is a submatrix, consisting of a subset of rows and a subset of columns of a matrix, that contains homogeneous patterns. Finding biclusters remains challenging because of the computational complexity of capturing patterns across two dimensions at once.

Results: We propose a Probabilistic COevolutionary Biclustering Algorithm (PCOBA) that clusters the rows and columns of a matrix simultaneously by using a dynamic adaptation of multiple species and adopting probabilistic learning. A coevolutionary search suits biclustering problems because it can optimize interdependent subcomponents formed of rows and columns. Furthermore, acquiring statistical information on the two populations through probabilistic learning can improve the ability of the search to move toward the optimum. We evaluated the performance of PCOBA on a synthetic dataset and on yeast expression profiles. The results demonstrated that PCOBA outperformed previous evolutionary computation methods as well as other biclustering methods.

Conclusions: Our approach to searching for particular biological patterns could be valuable for systematically understanding functional relationships between genes and other biological components at a genome-wide level.
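The idea of coevolving a row population and a column population under a probabilistic model can be illustrated with a small sketch. This is not the authors' PCOBA implementation: it assumes a PBIL-style probability-vector update, the Cheng and Church mean squared residue as the homogeneity measure, and a hypothetical `size_weight` term that rewards larger biclusters. All function and parameter names are illustrative.

```python
import random

def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of a submatrix (lower means more coherent)."""
    row_mean = {i: sum(matrix[i][j] for j in cols) / len(cols) for i in rows}
    col_mean = {j: sum(matrix[i][j] for i in rows) / len(rows) for j in cols}
    total_mean = sum(row_mean.values()) / len(rows)
    residue = sum(
        (matrix[i][j] - row_mean[i] - col_mean[j] + total_mean) ** 2
        for i in rows for j in cols
    )
    return residue / (len(rows) * len(cols))

def sample(probs, rng, min_size=2):
    """Draw an index subset from a probability vector.

    Assumes len(probs) >= min_size; random indices are added until
    the subset reaches the minimum size.
    """
    chosen = [k for k, p in enumerate(probs) if rng.random() < p]
    while len(chosen) < min_size:
        k = rng.randrange(len(probs))
        if k not in chosen:
            chosen.append(k)
    return sorted(chosen)

def update(probs, best, rate):
    """PBIL-style update: shift each probability toward the best sample."""
    members = set(best)
    return [(1 - rate) * p + rate * (1.0 if k in members else 0.0)
            for k, p in enumerate(probs)]

def cost(matrix, rows, cols, size_weight):
    """Assumed objective (lower is better): low residue, large submatrix."""
    return (mean_squared_residue(matrix, rows, cols)
            - size_weight * len(rows) * len(cols))

def coevolve(matrix, generations=50, pop_size=20, rate=0.1,
             size_weight=0.01, seed=0):
    """Cooperatively evolve a row subset and a column subset."""
    rng = random.Random(seed)
    row_probs = [0.5] * len(matrix)
    col_probs = [0.5] * len(matrix[0])
    best_rows = sample(row_probs, rng)
    best_cols = sample(col_probs, rng)
    best_cost = cost(matrix, best_rows, best_cols, size_weight)
    for _ in range(generations):
        # Row species: each candidate is scored with the best column set.
        for _ in range(pop_size):
            cand = sample(row_probs, rng)
            c = cost(matrix, cand, best_cols, size_weight)
            if c < best_cost:
                best_rows, best_cost = cand, c
        # Column species: each candidate is scored with the best row set.
        for _ in range(pop_size):
            cand = sample(col_probs, rng)
            c = cost(matrix, best_rows, cand, size_weight)
            if c < best_cost:
                best_cols, best_cost = cand, c
        # Probabilistic learning: both models move toward the current best.
        row_probs = update(row_probs, best_rows, rate)
        col_probs = update(col_probs, best_cols, rate)
    return best_rows, best_cols, best_cost
```

Calling `coevolve(matrix)` on a list-of-lists expression matrix returns the selected row indices, the selected column indices, and the cost of that bicluster. Each species is evaluated only in combination with the other species' best representative, which is how the interdependence between rows and columns enters this sketch.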
