Identifying Eukaryotes and Factors Influencing Their Biogeography in Drinking Water Metagenomes

The biogeography of eukaryotes in drinking water systems is poorly understood relative to that of prokaryotes or viruses, which limits understanding of their role and management. One challenge in studying complex eukaryotic communities is that metagenomic analysis workflows for eukaryotes are currently less mature than those for prokaryotes or viruses. In this study, we benchmarked different strategies to recover eukaryotic sequences and genomes from metagenomic data and applied the best-performing workflow to explore the factors affecting the relative abundance and diversity of eukaryotic communities in drinking water distribution systems (DWDSs). We developed an ensemble approach that combines k-mer- and reference-based strategies to improve eukaryotic sequence identification, and we identified MetaBAT2 as the best-performing binning approach for clustering these sequences. Applying this workflow to the DWDS metagenomes showed that eukaryotic sequences typically constituted small proportions (i.e., <1%) of the overall metagenomic data, with higher relative abundances in surface water-fed or chlorinated systems with high disinfectant residuals. The α and β diversities of eukaryotes were correlated with those of prokaryotic and viral communities, pointing to the common role of environmental/management factors. Finally, a co-occurrence analysis highlighted clusters of eukaryotes whose members' presence and abundance in DWDSs were affected by disinfection strategies, climate conditions, and source water types.
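The idea of combining several classifiers into one ensemble call can be illustrated with a minimal sketch. The abstract does not state how the individual calls are combined, so the vote-threshold rule below is an assumption for illustration only, and the classifier names (`kmer_classifier_a`, `kmer_classifier_b`, `reference_classifier`) and the `min_votes` parameter are hypothetical placeholders rather than the tools or settings used in the study.

```python
from collections import Counter

def ensemble_label(calls, min_votes=2):
    """Combine per-classifier calls for one contig into a single label.

    `calls` maps a (hypothetical) classifier name to its call for the contig.
    The contig is labelled "eukaryote" when at least `min_votes` classifiers
    agree; this threshold rule is an illustrative assumption.
    """
    votes = Counter(calls.values())
    return "eukaryote" if votes["eukaryote"] >= min_votes else "other"

# Hypothetical calls for three contigs from two k-mer-based classifiers
# and one reference-based classifier.
contig_calls = {
    "contig_1": {"kmer_classifier_a": "eukaryote",
                 "kmer_classifier_b": "eukaryote",
                 "reference_classifier": "eukaryote"},
    "contig_2": {"kmer_classifier_a": "eukaryote",
                 "kmer_classifier_b": "prokaryote",
                 "reference_classifier": "eukaryote"},
    "contig_3": {"kmer_classifier_a": "prokaryote",
                 "kmer_classifier_b": "prokaryote",
                 "reference_classifier": "eukaryote"},
}

# Apply the ensemble rule to every contig.
labels = {name: ensemble_label(calls) for name, calls in contig_calls.items()}
for name, label in labels.items():
    print(name, label)
```

Raising `min_votes` trades sensitivity for precision: requiring all three classifiers to agree would keep only `contig_1` in this toy input.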
