The Challenges of Resolving a Rapid, Recent Radiation: Empirical and Simulated Phylogenomics of Philippine Shrews.

Phylogenetic relationships in recent, rapid radiations can be difficult to resolve due to incomplete lineage sorting and reliance on genetic markers that evolve slowly relative to the rate of speciation. By incorporating hundreds to thousands of unlinked loci, phylogenomic analyses have the potential to mitigate these difficulties. Here, we attempt to resolve phylogenetic relationships among eight shrew species (genus Crocidura) from the Philippines, a phylogenetic problem that has proven intractable with small (< 10 loci) data sets. We sequenced hundreds of ultraconserved elements and whole mitochondrial genomes in these species and estimated phylogenies using concatenation, summary coalescent, and hierarchical coalescent methods. The concatenated approach recovered a maximally supported and fully resolved tree. In contrast, the coalescent-based approaches produced similar topologies, but each had several poorly supported nodes. Using simulations, we demonstrate that the concatenated tree could be positively misleading. Our simulations also show that the tree shape we tend to infer, which involves a series of short internal branches, is difficult to resolve, even if substitution models are known and multiple individuals per species are sampled. As such, the low support we obtained for backbone relationships in our coalescent-based inferences reflects a real and appropriate lack of certainty. Our results illuminate the challenges of estimating a bifurcating tree in a rapid and recent radiation, providing a rare empirical example of a nearly simultaneous series of speciation events in a terrestrial animal lineage as it spreads across an oceanic archipelago.
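To illustrate why a series of short internal branches is so hard to resolve, the following minimal sketch applies the standard three-species multispecies coalescent result: with an internal branch of length t (in coalescent units), a gene tree matches the species tree with probability 1 - (2/3)e^(-t), and each of the two discordant topologies occurs with probability (1/3)e^(-t). This is not the simulation design used in the study. The function name and the example branch lengths are assumptions chosen for illustration only.

```python
import math


def gene_tree_probs(t):
    """Return (P(concordant), P(each discordant topology)) for a
    three-species tree whose internal branch has length t, measured in
    coalescent units (generations / 2Ne). Hypothetical helper name;
    the formula is the standard three-taxon coalescent result."""
    if t < 0:
        raise ValueError("branch length must be non-negative")
    p_each_discordant = math.exp(-t) / 3.0
    p_concordant = 1.0 - 2.0 * p_each_discordant
    return p_concordant, p_each_discordant


if __name__ == "__main__":
    # Illustrative branch lengths only; not estimates from the shrew data.
    for t in (0.01, 0.1, 0.5, 1.0, 2.0):
        p_conc, p_disc = gene_tree_probs(t)
        print(f"t = {t:4.2f}  concordant = {p_conc:.3f}  "
              f"each discordant = {p_disc:.3f}")
```

As t approaches zero, all three topologies approach a probability of one third, so individual loci carry almost no signal about the true branching order. That is consistent with the low backbone support reported for the coalescent-based analyses.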
