DeCoSTAR: Reconstructing the Ancestral Organization of Genes or Genomes Using Reconciled Phylogenies

DeCoSTAR is software for reconstructing the organization of ancestral genes or genomes as sets of neighborhood relations (adjacencies) between pairs of ancestral genes or gene domains. It can also improve the assembly of fragmented genomes by proposing evolutionarily induced adjacencies between scaffolding fragments. Ancestral genes or domains are deduced from reconciled phylogenetic trees under an evolutionary model that considers gains, losses, speciations, duplications, and transfers as possible events in gene evolution. Reconciliations are either given as input or computed with the ecceTERA package, into which DeCoSTAR is integrated. DeCoSTAR computes adjacency evolutionary scenarios using a scoring scheme based on a weighted sum of adjacency gains and breakages. Both optimal and near-optimal solutions are sampled from a Boltzmann–Gibbs distribution centered on the parsimonious solutions, and statistical supports for ancestral and extant adjacencies are provided. DeCoSTAR supports the features of previously contributed tools that reconstruct ancestral adjacencies, namely DeCo, DeCoLT, ART-DeCo, and DeClone. In a few minutes, DeCoSTAR can reconstruct the evolutionary history of domains inside genes, of gene fusion and fission events, or of gene order along chromosomes, for large data sets including dozens of whole genomes from all kingdoms of life. We illustrate the potential of DeCoSTAR with several applications: ancestral reconstruction of gene orders for Anopheles mosquito genomes, multidomain proteins in Drosophila, and gene fusion and fission detection in Actinobacteria. Availability: http://pbil.univ-lyon1.fr/software/DeCoSTAR (Last accessed April 24, 2017).
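
The scoring and sampling idea described above can be illustrated with a small sketch. It is not DeCoSTAR's code or interface: the function names, the weights `gain_weight` and `break_weight`, the temperature parameter `kT`, and the three toy scenarios are all assumptions made for illustration. Each scenario is reduced to a count of adjacency gains and breakages, scored by a weighted sum, and given a Boltzmann–Gibbs probability proportional to exp(-cost / kT), so that the most parsimonious scenario is the most probable and near-optimal ones keep a nonzero weight.

```python
import math


def scenario_cost(gains, breaks, gain_weight=1.0, break_weight=1.0):
    """Weighted sum of adjacency gains and breakages (hypothetical weights)."""
    return gain_weight * gains + break_weight * breaks


def boltzmann_probabilities(costs, kT=1.0):
    """Probability of each scenario, proportional to exp(-cost / kT).

    Costs are shifted by the minimum before exponentiation; this leaves the
    normalized probabilities unchanged and avoids numerical underflow.
    """
    min_cost = min(costs)
    weights = [math.exp(-(c - min_cost) / kT) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]


# Toy scenarios as (gains, breaks) pairs; these values are invented.
scenarios = [(1, 0), (1, 1), (2, 2)]
costs = [scenario_cost(g, b) for g, b in scenarios]
probs = boltzmann_probabilities(costs, kT=0.5)

for (g, b), c, p in zip(scenarios, costs, probs):
    print(f"gains={g} breaks={b} cost={c:.1f} probability={p:.4f}")
```

In this sketch, lowering `kT` concentrates the probability mass on the minimum-cost scenario, while raising it spreads the mass over costlier alternatives.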
