Genomes from uncultivated prokaryotes: a comparison of metagenome-assembled and single-amplified genomes

Background: Prokaryotes dominate the biosphere and regulate biogeochemical processes essential to all life. Yet our knowledge of their biology is largely limited to the minority that has been successfully cultured. Molecular techniques now make it possible to obtain genome sequences of uncultivated prokaryotic taxa, facilitating in-depth analyses that may ultimately improve our understanding of these key organisms.

Results: We compared results from two culture-independent strategies for recovering bacterial genomes: single-amplified genomes and metagenome-assembled genomes. Single-amplified genomes were obtained from samples collected at an offshore station in the Baltic Sea Proper and compared to previously obtained metagenome-assembled genomes from a time series at the same station. Among 16 single-amplified genomes analyzed, seven were found to match metagenome-assembled genomes, affiliated with a diverse set of taxa. Notably, matching genome pairs from the two approaches were nearly identical (average 99.51% sequence identity; range 98.77–99.84%) across overlapping regions (30–80% of each genome). Within matching pairs, the single-amplified genomes were consistently smaller and less complete, whereas the genetic functional profiles were maintained. For the metagenome-assembled genomes, on average only 3.6% of the bases were estimated to be missing from the genomes due to wrongly binned contigs.

Conclusions: The strong agreement between the single-amplified and metagenome-assembled genomes emphasizes that both methods generate accurate genome information from uncultivated bacteria. Importantly, this implies that the choice of genomics approach for microbiome studies can be guided by the research questions and the available resources.
