A multi-objective approach to discover biclusters in microarray data

We use a multi-objective evolutionary algorithm to find biclusters in gene expression data because, when searching for biclusters in a gene expression matrix, several objectives have to be optimized simultaneously, and these objectives often conflict with each other. Evolutionary computation is further justified by the high dimensionality of the search space, since evolutionary algorithms are known for their strong exploration capability. We focus on finding biclusters of high quality with large variation. In expression data analysis, the most important goal may not be to find biclusters containing many genes and conditions; it can be more interesting to find a set of genes that show similar behavior under a set of conditions. Experimental results confirm the validity of the proposed technique.
