Binary Particle Swarm Optimization Versus Hybrid Genetic Algorithm for Inferring Well Supported Phylogenetic Trees

The number of completely sequenced chloroplast genomes increases rapidly every day, making it possible to build large-scale phylogenetic trees of plant species. Considering a subset of closely related plant species defined according to their chloroplasts, the phylogenetic tree that can be inferred from their core genes is not necessarily well supported, due to the possible occurrence of problematic genes (affected by, e.g., homoplasy, incomplete lineage sorting, or horizontal gene transfer) which may blur the phylogenetic signal. However, a trustworthy phylogenetic tree can still be obtained provided the number of such blurring genes is reduced. The problem is thus to determine the largest subset of core genes that produces the best-supported tree. Because the number of possible gene combinations is overwhelming, this article focuses on how to search for that subset, discarding problematic genes so as to obtain the most supported species tree. Due to the computational complexity, a Binary Particle Swarm Optimization (BPSO) is proposed in both sequential and distributed versions. Results obtained from both versions of the BPSO are compared with those computed using a hybrid approach embedding both genetic algorithms and statistical tests. The proposal has been applied to several plant families, leading to encouraging results.
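To make the search problem concrete, the following is a minimal sketch of a sequential binary PSO over gene subsets. It uses the classical sigmoid-based binary update, which is not necessarily the variant used by the authors. Each particle is a bit mask over the core genes (1 = keep the gene). The function `toy_fitness`, the set `noisy`, and all parameter values are hypothetical stand-ins: in a real run, scoring a subset would presumably require aligning the kept genes, inferring a tree, and measuring its support, which is the expensive step that motivates a distributed version.

```python
import math
import random


def sigmoid(x):
    """Map a velocity to the probability that the corresponding bit is 1."""
    return 1.0 / (1.0 + math.exp(-x))


def toy_fitness(mask, noisy):
    """Hypothetical score: reward kept genes, penalize kept 'blurring' genes.

    A real fitness would evaluate the support of the tree inferred
    from the selected genes; this toy version only mimics its shape.
    """
    kept = sum(mask)
    penalty = sum(1 for i in noisy if mask[i])
    return kept - 3 * penalty


def bpso(n_genes, fitness, n_particles=20, n_iter=60,
         w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Sigmoid-based binary PSO that maximizes fitness(mask)."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_genes)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_genes for _ in range(n_particles)]

    # Personal bests and the global best seen so far.
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda k: pbest_fit[k])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iter):
        for k in range(n_particles):
            for j in range(n_genes):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[k][j]
                     + c1 * r1 * (pbest[k][j] - pos[k][j])
                     + c2 * r2 * (gbest[j] - pos[k][j]))
                # Clamp the velocity so the sigmoid never saturates fully.
                vel[k][j] = max(-v_max, min(v_max, v))
                # Resample the bit with probability sigmoid(velocity).
                pos[k][j] = 1 if rng.random() < sigmoid(vel[k][j]) else 0
            f = fitness(pos[k])
            if f > pbest_fit[k]:
                pbest[k], pbest_fit[k] = pos[k][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[k][:], f
    return gbest, gbest_fit


if __name__ == "__main__":
    noisy = {2, 5, 11}  # hypothetical indices of problematic genes
    best, score = bpso(16, lambda m: toy_fitness(m, noisy))
    print(best, score)
```

Because particles only interact through the shared global best, the fitness evaluations inside one iteration are independent of each other, so in a distributed setting they could be farmed out to separate workers.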
