Integrated likelihood for phylogenomics under a no-common-mechanism model

The availability of genome-wide sequence data from many species, as well as from multiple individuals within a species, has ushered in the era of phylogenomics. In this era, species phylogeny inference is based on models of sequence evolution on gene trees together with models of gene tree evolution within the branches of species phylogenies. Parsimony, likelihood, Bayesian, and distance methods have been introduced for species phylogeny inference based on such models. All of these methods except the parsimony ones assume a common mechanism across loci: each branch of the species phylogeny has a single length that is shared by all loci. In this paper, we propose a “no common mechanism” (NCM) model, in which every gene tree evolves under its own values of the species phylogeny's parameters. An analogous model was proposed and explored, both mathematically and experimentally, for sites (characters) in a sequence alignment in the context of the classical phylogeny problem. For example, Tuffley and Steel established a well-known equivalence between the maximum parsimony and maximum likelihood phylogeny estimates under certain NCM models. Here we derive an analytically integrated likelihood of both species trees and networks given the gene trees of multiple loci under an NCM model. We demonstrate the performance of inference under this integrated likelihood on both simulated and biological data. The model presented here affords opportunities for exploring connections among the various methods for estimating species phylogenies from multiple, independent loci.
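
As a rough illustration of what integrating the likelihood over per-locus parameters means, the sketch below works through the three-taxon species tree ((A,B),C). It is not the derivation from the paper: the exponential prior with rate \lambda on each locus's internal branch length t_i (in coalescent units) is an assumption made here only so that the integral has a closed form, and the symbols \Psi, g_i, k, and n are likewise introduced for illustration. The only ingredient taken as given is the standard multispecies coalescent result that a gene tree matches the species tree topology with probability 1 - (2/3)e^{-t}, and matches each of the two discordant topologies with probability (1/3)e^{-t}.

```latex
% General form under NCM: each locus i has its own parameters \theta_i,
% which are integrated out independently of the other loci.
L(\Psi \mid g_1,\dots,g_n) \;=\; \prod_{i=1}^{n} \int P(g_i \mid \Psi, \theta_i)\, p(\theta_i)\, d\theta_i

% Three-taxon example, assuming t_i \sim \mathrm{Exp}(\lambda) (an illustrative prior):
\int_0^\infty \Bigl(1 - \tfrac{2}{3} e^{-t}\Bigr)\, \lambda e^{-\lambda t}\, dt
  \;=\; 1 - \frac{2\lambda}{3(\lambda+1)} \;=\; \frac{\lambda+3}{3(\lambda+1)}
  \qquad \text{(concordant topology)}

\int_0^\infty \tfrac{1}{3} e^{-t}\, \lambda e^{-\lambda t}\, dt
  \;=\; \frac{\lambda}{3(\lambda+1)}
  \qquad \text{(each discordant topology)}

% With k of the n gene trees concordant with \Psi:
L(\Psi \mid g_1,\dots,g_n) \;=\;
  \Bigl[\frac{\lambda+3}{3(\lambda+1)}\Bigr]^{k}
  \Bigl[\frac{\lambda}{3(\lambda+1)}\Bigr]^{\,n-k}
```

The three integrated probabilities sum to one, as they should. Under these assumptions the integrated likelihood depends on the gene trees only through the count k, so the species tree topology matched by the most gene trees scores highest, which hints at the kind of connection between likelihood-based and count-based methods that the abstract alludes to.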
