Measuring community similarity with phylogenetic networks.

Environmental drivers of biodiversity can be identified by relating patterns of community similarity to ecological factors. Community variation has traditionally been assessed by considering changes in species composition and, more recently, by incorporating phylogenetic information to account for the relative similarity of taxa. Here, we describe how an important class of measures, including Bray-Curtis, Canberra, and UniFrac, can be extended to allow community variation to be computed on a phylogenetic network. We focus on phylogenetic split systems, the networks produced by the widely used median network and neighbor-net methods; these networks can represent incongruence in the evolutionary history of a set of taxa. Calculating β diversity over a split system provides a measure of community similarity averaged over uncertainty or conflict in the available phylogenetic signal. Our freely available software, Network Diversity, provides 11 qualitative (presence-absence, unweighted) and 14 quantitative (weighted) network-based measures of community similarity that model different aspects of community richness and evenness. We demonstrate the broad applicability of network-based diversity approaches by applying them to three data sets: pneumococcal isolates from distinct geographic regions, human mitochondrial DNA data from the Indonesian island of Nias, and proteorhodopsin sequences from the Sargasso and Mediterranean Seas. Our results show that the major expected patterns of variation for these data sets are recovered using network-based measures, which indicates that these patterns are robust to phylogenetic uncertainty and conflict. Nonetheless, network-based measures of community similarity can differ substantially from measures that ignore phylogenetic relationships, or from tree-based measures, when incongruent signals are present in the underlying data. Network-based measures provide a methodology for assessing the robustness of β-diversity results in light of incongruent phylogenetic signal and allow β diversity to be calculated over widely used network structures such as median networks.
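
To make the idea of computing a dissimilarity over a split system concrete, the following is a minimal sketch, not the Network Diversity implementation. It assumes a split system given as a list of (weight, taxa on one side) pairs and communities given as taxon counts, and it computes a split-weighted Manhattan-style distance on relative abundances. The function name, the data layout, and the choice of measure are illustrative assumptions; the normalizations used by the published measures are not reproduced here.

```python
def split_weighted_distance(splits, comm_a, comm_b):
    """Split-weighted Manhattan-style distance between two communities.

    splits : list of (weight, set_of_taxa) pairs; each set holds the taxa
             on one side of a split (hypothetical layout, assumed here).
    comm_a, comm_b : dicts mapping taxon name -> observed count.
    """
    total_a = sum(comm_a.values())
    total_b = sum(comm_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("each community needs at least one observation")

    distance = 0.0
    for weight, side in splits:
        # Fraction of each community that falls on this side of the split.
        p_a = sum(comm_a.get(taxon, 0) for taxon in side) / total_a
        p_b = sum(comm_b.get(taxon, 0) for taxon in side) / total_b
        # With relative abundances, |p_a - p_b| is the same whichever side
        # of the split is listed, so no rooting is needed.
        distance += weight * abs(p_a - p_b)
    return distance


# Four taxa with two incompatible internal splits, so no single tree
# displays every split in this system.
example_splits = [
    (1.0, {"w"}), (1.0, {"x"}), (1.0, {"y"}), (1.0, {"z"}),
    (2.0, {"w", "x"}),   # split wx | yz
    (1.0, {"w", "y"}),   # conflicting split wy | xz
]
community_1 = {"w": 1, "x": 1}
community_2 = {"y": 1, "z": 1}
print(split_weighted_distance(example_splits, community_1, community_2))
```

In this toy example the conflicting split wy | xz contributes nothing, because both communities place half of their members on each of its sides, while the split wx | yz separates the two communities completely and contributes its full weight.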
