Convergence analysis of some multiobjective evolutionary algorithms when discovering motifs

An important issue in multiobjective optimization is the study of the convergence speed of algorithms. An optimization problem must be defined as simply as possible to minimize the computational cost required to solve it. In this work, we study the convergence speed of seven multiobjective evolutionary algorithms (DEPT, MO-VNS, MOABC, MO-GSA, MO-FA, NSGA-II, and SPEA2) when solving an important biological problem, the motif discovery problem. We use twelve instances from four different organisms as benchmarks and analyze the number of fitness function evaluations each algorithm requires to reach solutions of reasonable quality. We evaluate the solutions discovered by each algorithm with the hypervolume indicator, measuring their quality every 100 evaluations. This methodology also allows us to study the hit rates of the algorithms over 30 independent runs. Moreover, we carry out a deeper study on the most complex instance of each organism, in which we observe the growth of the archive (the number of non-dominated solutions) and the spread of the Pareto fronts obtained by each algorithm in its median execution. Our study reveals that DEPT, MOABC, and MO-FA provide the best convergence speeds and the highest hit rates.
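The measurement procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes only two objectives, both maximized, and a reference point at the origin, and the names `hypervolume_2d`, `evaluations_to_target`, and `hit_rate`, as well as the target threshold, are hypothetical choices made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def nondominated(points: Sequence[Point]) -> List[Point]:
    """Keep the points not dominated by any other (both objectives maximized)."""
    front: List[Point] = []
    for p in points:
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and q != p for q in points
        )
        if not dominated and p not in front:
            front.append(p)
    return front


def hypervolume_2d(points: Sequence[Point], ref: Point = (0.0, 0.0)) -> float:
    """Area dominated by the front and bounded by ref (assumed 2-D maximization)."""
    # Sort by first objective, descending; on a front the second then ascends.
    front = sorted(nondominated(points), key=lambda p: p[0], reverse=True)
    area = 0.0
    prev_y = ref[1]
    for x, y in front:
        if x > ref[0] and y > prev_y:
            # Add the horizontal strip between prev_y and y, of width x - ref_x.
            area += (x - ref[0]) * (y - prev_y)
            prev_y = y
    return area


def evaluations_to_target(
    trace: Sequence[float], target: float, step: int = 100
) -> Optional[int]:
    """trace[i] is the hypervolume after (i + 1) * step evaluations.

    Returns the evaluation count at which the target is first reached,
    or None if the run never reaches it.
    """
    for i, hv in enumerate(trace):
        if hv >= target:
            return (i + 1) * step
    return None


def hit_rate(traces: Sequence[Sequence[float]], target: float) -> float:
    """Fraction of independent runs whose hypervolume reaches the target."""
    hits = sum(1 for t in traces if evaluations_to_target(t, target) is not None)
    return hits / len(traces)
```

In such a setup, each independent run would record one hypervolume value every 100 fitness evaluations, and the resulting traces (30 per algorithm and instance in this study) would be passed to `hit_rate` together with a quality threshold chosen by the experimenter.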
