MuDoGeR: Multi-Domain Genome Recovery from metagenomes made easy

Several computational frameworks and workflows exist that recover genomes of prokaryotes, eukaryotes, and viruses from metagenomes. However, it is difficult for scientists with little bioinformatics experience to evaluate quality, annotate genes, dereplicate, assign taxonomy, and calculate relative abundance and coverage of genomes belonging to different domains. MuDoGeR is a user-friendly tool, accessible to non-bioinformaticians, that makes it easy to recover genomes of prokaryotes, eukaryotes, and viruses from metagenomes, either alone or in combination. We tested MuDoGeR using 24 individually isolated genomes and 574 metagenomes, demonstrating its applicability both to small sample sets and to high-throughput analyses. MuDoGeR is open-source software available at https://github.com/mdsufz/MuDoGeR.
