Statistical inference of allopolyploid species networks in the presence of incomplete lineage sorting.

Polyploidy is an important speciation mechanism, particularly in land plants. Allopolyploid species are formed after hybridization between otherwise intersterile parental species. Recent theoretical progress has led to successful implementation of species tree models that take population genetic parameters into account. However, these models have not included allopolyploid hybridization or the special problems that arise when species trees of allopolyploids are inferred. Here, 2 new models for the statistical inference of the evolutionary history of allopolyploids are evaluated using simulations and demonstrated on 2 empirical data sets. It is assumed that there has been a single hybridization event between 2 diploid species resulting in a genomic allotetraploid. The evolutionary history can be represented as a species network or as a multilabeled species tree, in which some pairs of tips are labeled with the same species. In one of the models (AlloppMUL), the multilabeled species tree is inferred directly. This is the simplest model and the most widely applicable, since it makes fewer assumptions. The second model (AlloppNET) incorporates the hybridization event explicitly, which means that fewer parameters need to be estimated. Both models are implemented in the BEAST framework. Simulations show that both models are useful and that AlloppNET is more accurate when the assumptions it is based on are valid. The models are demonstrated on previously analyzed data from the genera Pachycladon (Brassicaceae) and Silene (Caryophyllaceae).
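To make the idea of a multilabeled species tree concrete, the following is a minimal sketch, not the AlloppMUL or AlloppNET implementation. It assumes a hypothetical toy history with three diploid species (A, B, C) and one allotetraploid (T) whose two subgenomes sit next to A and next to B. The nested-tuple representation and the helper names are illustrative assumptions, not part of BEAST.

```python
from collections import Counter

# Hypothetical multilabeled (MUL) species tree as nested tuples.
# A, B, C are diploid species; T is an allotetraploid whose two
# subgenomes attach next to A and next to B (assumed toy example).
mul_tree = ((("A", "T"), ("B", "T")), "C")


def tip_labels(node):
    """Return the tip labels of a nested-tuple tree, left to right."""
    if isinstance(node, str):
        return [node]
    labels = []
    for child in node:
        labels.extend(tip_labels(child))
    return labels


def multilabeled_species(tree):
    """Return the sorted labels that occur on more than one tip."""
    counts = Counter(tip_labels(tree))
    return sorted(label for label, n in counts.items() if n > 1)


tips = tip_labels(mul_tree)
duplicated = multilabeled_species(mul_tree)
```

In this toy tree the only label carried by a pair of tips is T, which is how a single allotetraploid appears in the multilabeled representation; the network representation would instead join those two tips into one hybrid node.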
