A novel evolutionary algorithm for bi-clustering of gene expression data based on the Order Preserving Sub-Matrix (OPSM) constraint

Biclustering is a popular method for revealing unknown genetic pathways. Although many biclustering algorithms have been proposed, none has emerged as clearly superior, largely because of the very large search space. Several evolutionary algorithms have tried to address this problem by using the search capability of Evolutionary Computation (EC). However, most of them focus on the Mean Squared Residue (MSR) measure proposed by Cheng and Church (CC). The Order Preserving Sub-Matrix (OPSM) constraint has rarely been considered, even though it promises more biologically relevant biclusters than the MSR measure. The goal of this paper is to design an EC algorithm that uses the OPSM constraint to obtain biologically significant biclusters and that finds better biclusters than the original OPSM algorithm. We designed a novel encoding method and evolutionary operators suited to the OPSM constraint. To explore the search space efficiently, we modularized our evolutionary algorithm and applied the concept of co-evolution. A set of experiments confirmed that our algorithm outperformed both a representative EC biclustering algorithm based on CC and the original OPSM algorithm.
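To make the two criteria named above concrete, the following minimal sketch contrasts them on a toy matrix. It is an illustration only and not the algorithm proposed in this paper: the function names `mean_squared_residue` and `is_order_preserving`, and the matrix `expr`, are hypothetical and chosen for the example. The MSR follows the standard Cheng and Church definition (the average squared residue of each entry against its row mean, column mean, and overall mean). The OPSM check assumes the usual formulation, in which a submatrix is order-preserving if some ordering of its columns makes every row strictly increasing.

```python
def mean_squared_residue(matrix, rows, cols):
    """Cheng-and-Church MSR of the submatrix given by row and column indices."""
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(sub), len(sub[0])
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


def is_order_preserving(matrix, rows, col_order):
    """True if every selected row is strictly increasing along col_order."""
    for i in rows:
        values = [matrix[i][j] for j in col_order]
        if any(a >= b for a, b in zip(values, values[1:])):
            return False
    return True


# Toy expression matrix (made-up values): rows are genes, columns are conditions.
expr = [
    [1.0, 2.0, 3.0],     # gene 0
    [10.0, 20.0, 90.0],  # gene 1: same trend as gene 0, very different scale
    [5.0, 4.0, 3.0],     # gene 2: opposite trend
    [4.0, 5.0, 6.0],     # gene 3: gene 0 shifted by a constant
]

# Genes 0 and 3 differ by a constant shift, so their MSR is zero.
shifted_msr = mean_squared_residue(expr, [0, 3], [0, 1, 2])

# Genes 0 and 1 share the same ordering of conditions but not the same scale:
# the OPSM check accepts them, while their MSR is non-zero.
scaled_msr = mean_squared_residue(expr, [0, 1], [0, 1, 2])
same_trend = is_order_preserving(expr, [0, 1], [0, 1, 2])

# Genes 0 and 2 have opposite trends, so this column ordering fails.
opposite_trend = is_order_preserving(expr, [0, 2], [0, 1, 2])
```

In this toy case, genes 0 and 1 rise across the same conditions at different magnitudes, which is the kind of pattern an order-based constraint captures and a residue-based score penalizes.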
