Evolutionary metaheuristic for biclustering based on linear correlations among genes

This paper proposes a new measure for evaluating the quality of a bicluster, based on linear correlations among genes. A new evolutionary metaheuristic based on Scatter Search, which uses this measure as its fitness function, is presented to obtain biclusters that contain groups of highly correlated genes. The correlation matrix of each resulting bicluster is then analyzed to select those groups of genes that define new biclusters with shifting and scaling patterns. Experimental results on human B-cell lymphoma data are presented.
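As a rough illustration of why a correlation-based score suits shifting and scaling patterns, the sketch below computes the mean absolute Pearson correlation over all gene pairs of a bicluster. This is an assumed stand-in, not the measure defined in the paper; the function name `mean_abs_correlation` and the toy data are hypothetical. Genes that are scaled and shifted copies of one profile are perfectly linearly correlated, so they score 1, while unrelated genes score lower.

```python
import numpy as np


def mean_abs_correlation(bicluster):
    """Average |Pearson correlation| over all distinct gene pairs.

    Assumption: rows are genes and columns are conditions. This is an
    illustrative score, not the exact measure proposed in the paper.
    """
    corr = np.corrcoef(bicluster)              # gene-by-gene correlation matrix
    upper = np.triu_indices_from(corr, k=1)    # each pair once, diagonal excluded
    return float(np.mean(np.abs(corr[upper])))


# Hypothetical data: each gene is a scaled and shifted copy of one base profile.
base = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
scales = np.array([1.0, 2.0, -0.5])            # a negative scale gives correlation -1
shifts = np.array([0.0, 1.0, 3.0])
pattern = scales[:, None] * base[None, :] + shifts[:, None]

# Hypothetical unrelated genes for comparison.
rng = np.random.default_rng(0)
noise = rng.normal(size=(3, 5))

score_pattern = mean_abs_correlation(pattern)
score_noise = mean_abs_correlation(noise)
print(score_pattern, score_noise)
```

Taking the absolute value is a design choice in this sketch: it treats negatively scaled genes as part of the same pattern. A fitness function in an evolutionary search would typically combine such a score with a term rewarding bicluster size, which is omitted here.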
