Using MOEA with Redistribution and Consensus Branches to Infer Phylogenies

In recent years, a growing body of research has applied metaheuristics to phylogenetic inference, an NP-hard problem. Maximum Parsimony and Maximum Likelihood are two effective inference methods. Because they also serve as optimality criteria for phylogenies, various multi-objective metaheuristics have used them together to reconstruct phylogenies. However, evaluating both of these time-consuming criteria makes multi-objective metaheuristics slower than single-objective ones. We therefore propose a novel multi-objective optimization algorithm, MOEA-RC, which accelerates phylogeny reconstruction by using structural information from the elites of the current population. We compare MOEA-RC with two representative multi-objective algorithms, MOEA/D and NSGA-II, and with a non-consensus version of MOEA-RC on three real-world datasets. Within a given number of iterations, MOEA-RC achieves better solutions than the other algorithms.
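To make the two-criterion setting concrete, the following is a minimal sketch of Pareto dominance over candidate trees scored by parsimony (lower is better) and log-likelihood (higher is better). It is not MOEA-RC itself: the names `Score`, `dominates`, and `pareto_front`, and the numeric scores, are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical score pair for one candidate tree:
# (parsimony_score, log_likelihood).
# Parsimony is minimized; log-likelihood is maximized.
Score = Tuple[float, float]


def dominates(a: Score, b: Score) -> bool:
    """Return True if a is at least as good as b on both criteria
    and strictly better on at least one."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(scores: List[Score]) -> List[Score]:
    """Keep every score that no other score dominates (input order preserved)."""
    return [s for s in scores if not any(dominates(t, s) for t in scores)]


# Illustrative values only; these are not results from the paper.
candidates: List[Score] = [
    (120, -2500.0),
    (118, -2510.0),
    (125, -2490.0),
    (121, -2505.0),  # worse than (120, -2500.0) on both criteria
]
front = pareto_front(candidates)
print(front)
```

In this toy input, the fourth candidate is dropped because the first beats it on both criteria, while the remaining three trade parsimony against likelihood and so all stay on the front. Algorithms such as NSGA-II and MOEA/D maintain and improve a set of such non-dominated solutions.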
