Adaptive memory programming: local search parallel algorithms for phylogenetic tree construction

One of the most important aspects of molecular and computational biology is the reconstruction of evolutionary relationships. Although the area has been explored through decades of intensive research, there remains a need for efficient algorithms capable of reconstructing evolutionary relationships in reasonable time. Since the problem is computationally intractable, exact algorithms are used only for small groups of species. In the Maximum Parsimony approach, computation time grows so quickly with the number of sequences that in practice the optimal solution can be found only for instances of about 20 sequences. For this reason, heuristic methods are used in practical applications. In this paper, parallel adaptive memory programming algorithms for phylogenetic tree construction, based on Maximum Parsimony and several known neighborhood search methods, are proposed, and the results of computational experiments are presented. The proposed algorithms achieve superlinear speedup and find solutions of good quality.
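One way to get a feel for this growth is to count candidate trees. The sketch below assumes that an exact method would, in the worst case, have to consider every unrooted binary tree topology on the input sequences; the abstract does not say which exact method is meant, and the function name is illustrative only. It relies on the standard result that n labeled leaves admit (2n-5)!! such topologies for n >= 3.

```python
def count_unrooted_binary_trees(n_taxa: int) -> int:
    """Number of distinct unrooted binary tree topologies on n_taxa labeled leaves.

    Uses the standard formula (2n - 5)!! = 3 * 5 * ... * (2n - 5) for n >= 3.
    For 1, 2 or 3 leaves there is exactly one topology.
    """
    if n_taxa < 1:
        raise ValueError("n_taxa must be a positive integer")
    count = 1
    # Multiply the odd factors 3, 5, ..., 2n - 5 (empty product when n <= 3).
    for factor in range(3, 2 * n_taxa - 4, 2):
        count *= factor
    return count


if __name__ == "__main__":
    # Illustrative sizes only; not instance sizes taken from the paper.
    for n in (5, 10, 15, 20):
        print(n, count_unrooted_binary_trees(n))
```

Each additional sequence multiplies the count by the next odd number, so the search space grows faster than exponentially, which is consistent with exact search becoming impractical at around 20 sequences.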
