Bayesian Inference of Reticulate Phylogenies under the Multispecies Network Coalescent

The multispecies coalescent (MSC) is a statistical framework that models how gene genealogies grow within the branches of a species tree. Computational phylogenetics has seen rapid development of methods for species tree inference under the MSC, owing mainly to the accumulating evidence of incomplete lineage sorting in phylogenomic analyses. However, the evolutionary history of a set of genomes, or species, can be reticulate because of evolutionary processes such as hybridization or horizontal gene transfer. We report on a novel method for Bayesian inference of genome and species phylogenies under the multispecies network coalescent (MSNC). This framework models gene evolution within the branches of a phylogenetic network, and so accounts for reticulate evolutionary processes, such as hybridization, in addition to incomplete lineage sorting. Because phylogenetic networks with different numbers of reticulation events correspond to points of different dimensions in the space of models, we devise a reversible-jump Markov chain Monte Carlo (RJMCMC) technique for sampling the posterior distribution of phylogenetic networks under the MSNC. We implemented the method in the publicly available, open-source software package PhyloNet and studied its performance on simulated and biological data. The work extends the reach of Bayesian inference to phylogenetic networks and enables new evolutionary analyses that account for reticulation.
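
The need for a reversible-jump sampler can be illustrated with the generic acceptance rule for a move between models of different dimension. The sketch below is only that general rule, written in Python; it is not the proposal set implemented in PhyloNet, and the function and argument names (`rj_log_acceptance`, `accept`, `log_abs_jacobian`, and so on) are hypothetical. It assumes the caller supplies the unnormalized log posterior of the current and proposed networks, the log densities of the forward and reverse proposals (including any newly drawn parameters), and the log absolute Jacobian of the dimension-matching map.

```python
import math
import random


def rj_log_acceptance(log_post_new, log_post_old,
                      log_q_reverse, log_q_forward,
                      log_abs_jacobian=0.0):
    """Log of the reversible-jump acceptance probability min(1, ratio).

    ratio = posterior ratio * (reverse / forward proposal density) * |Jacobian|.
    All inputs are on the log scale; the Jacobian term defaults to 0 (|J| = 1).
    """
    log_ratio = ((log_post_new - log_post_old)
                 + (log_q_reverse - log_q_forward)
                 + log_abs_jacobian)
    return min(0.0, log_ratio)


def accept(log_alpha, rng=random):
    """Accept with probability exp(log_alpha).

    rng.random() lies in [0, 1), so 1 - U lies in (0, 1] and the log is finite.
    """
    return math.log(1.0 - rng.random()) <= log_alpha


# Illustrative numbers only: a proposal that adds one reticulation and
# raises the unnormalized log posterior by 3, with a forward proposal
# density that is slightly higher than the reverse one.
log_alpha = rj_log_acceptance(log_post_new=-100.0, log_post_old=-103.0,
                              log_q_reverse=-2.0, log_q_forward=-1.5)
```

Working on the log scale avoids underflow when likelihoods are very small. A move that adds a reticulation would put the density of the newly drawn parameters into `log_q_forward`, while the matching move that removes it would put that density into `log_q_reverse`.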
