A novel probabilistic encoding for EAs applied to biclustering of microarray data

In this paper we propose a novel representation scheme, called probabilistic encoding. In this representation, each gene of an individual represents the probability that a certain trait of the given problem belongs to the solution. This makes it possible to deal with the uncertainty that can be present in an optimization problem, and grants more exploration capability to an evolutionary algorithm. With this encoding, the search is not restricted to single points of the search space. Instead, whole regions are searched, with the aim of identifying a promising region, i.e., a region that contains the optimal solution. Consequently, a strategy for searching the identified region has to be adopted. We incorporate the probabilistic encoding into a multi-objective and multi-modal evolutionary algorithm. The algorithm returns a promising region, which is then searched by means of simulated annealing. We apply our proposal to the problem of discovering biclusters in microarray data. The results confirm the validity of our proposal.
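The following is a minimal Python sketch of the idea, not the authors' implementation. It assumes that an individual is a flat list of probabilities, one per row and one per column of the expression matrix, and that a concrete bicluster is obtained by sampling each trait with its probability. The simulated annealing step stays inside the region by resampling one trait at a time from its own probability, so traits with probability 0 or 1 remain fixed. All function names, the size-penalty term in `energy`, and the cooling parameters are illustrative assumptions. The coherence measure is the mean squared residue of Cheng and Church.

```python
import math
import random


def sample_solution(probabilities, rng):
    """Draw one bit string: trait i is included with probability probabilities[i]."""
    return [1 if rng.random() < p else 0 for p in probabilities]


def decode(bits, n_rows):
    """Split a bit string into selected row indices and selected column indices."""
    rows = [i for i in range(n_rows) if bits[i]]
    cols = [j for j, b in enumerate(bits[n_rows:]) if b]
    return rows, cols


def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix (0 for a perfectly additive pattern)."""
    if not rows or not cols:
        return float("inf")
    row_mean = {i: sum(matrix[i][j] for j in cols) / len(cols) for i in rows}
    col_mean = {j: sum(matrix[i][j] for i in rows) / len(rows) for j in cols}
    total_mean = sum(row_mean.values()) / len(rows)
    squared = sum(
        (matrix[i][j] - row_mean[i] - col_mean[j] + total_mean) ** 2
        for i in rows
        for j in cols
    )
    return squared / (len(rows) * len(cols))


def energy(matrix, bits, n_rows, size_weight=1.0):
    """Assumed objective: residue plus a penalty that discourages tiny biclusters."""
    rows, cols = decode(bits, n_rows)
    if not rows or not cols:
        return float("inf")
    residue = mean_squared_residue(matrix, rows, cols)
    return residue + size_weight / (len(rows) * len(cols))


def anneal_in_region(matrix, probabilities, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Simulated annealing restricted to the region described by `probabilities`."""
    rng = random.Random(seed)
    n_rows = len(matrix)
    current = sample_solution(probabilities, rng)
    current_e = energy(matrix, current, n_rows)
    best, best_e = current[:], current_e
    temp = t0
    for _ in range(steps):
        # Propose a neighbour by resampling one trait from its own probability.
        i = rng.randrange(len(probabilities))
        candidate = current[:]
        candidate[i] = 1 if rng.random() < probabilities[i] else 0
        cand_e = energy(matrix, candidate, n_rows)
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if cand_e <= current_e or rng.random() < math.exp((current_e - cand_e) / temp):
            current, current_e = candidate, cand_e
            if current_e < best_e:
                best, best_e = current[:], current_e
        temp *= cooling
    return best, best_e


# Toy data: rows 0-2 follow an additive pattern, row 3 is noisy.
matrix = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 3.0, 4.0, 5.0],
    [4.0, 5.0, 6.0, 7.0],
    [9.0, 0.0, 7.0, 1.0],
]
# First four genes are row probabilities, last four are column probabilities.
region = [1.0, 1.0, 0.5, 0.0, 1.0, 1.0, 0.5, 0.5]
best_bits, best_energy = anneal_in_region(matrix, region)
print(decode(best_bits, len(matrix)), best_energy)
```

In this sketch, genes equal to 0 or 1 express certainty about a trait, while intermediate values mark the part of the region that the local search is still free to explore.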
