AMBER: Assessment of Metagenome BinnERs

Abstract Reconstructing the genomes of microbial community members is key to the interpretation of shotgun metagenome samples. Genome binning programs deconvolute reads or assembled contigs of such samples into individual bins. However, assessing their quality is difficult due to the lack of evaluation software and standardized metrics. Here, we present Assessment of Metagenome BinnERs (AMBER), an evaluation package for the comparative assessment of genome reconstructions from metagenome benchmark datasets. It calculates the performance metrics and comparative visualizations used in the first benchmarking challenge of the initiative for the Critical Assessment of Metagenome Interpretation (CAMI). As an application, we show the outputs of AMBER for 11 binning programs on two CAMI benchmark datasets. AMBER is implemented in Python and available under the Apache 2.0 license on GitHub.

[1]  Anders F. Andersson,et al.  Binning metagenomic contigs by coverage and composition , 2014, Nature Methods.

[2]  Ting Chen,et al.  COCACOLA: binning metagenomic contigs using sequence COmposition, read CoverAge, CO‐alignment and paired‐end read LinkAge , 2016, Bioinform..

[3]  Philip D. Blood,et al.  Critical Assessment of Metagenome Interpretation—a benchmark of metagenomics software , 2017, Nature Methods.

[4]  Grigorios Tsoumakas,et al.  Mining Multi-label Data , 2010, Data Mining and Knowledge Discovery Handbook.

[5]  Cole Trapnell,et al.  Ultrafast and memory-efficient alignment of short DNA sequences to the human genome , 2009, Genome Biology.

[6]  Elaina D. Graham,et al.  BinSanity: unsupervised clustering of environmental microbial assemblies using coverage and affinity propagation , 2016, bioRxiv.

[7]  Alexander Sczyrba,et al.  CAMISIM: simulating metagenomes and microbial communities , 2018, bioRxiv.

[8]  Yu-Chieh Liao,et al.  Accurate binning of metagenomic contigs via automated clustering sequences using information of genomic signatures and marker genes , 2016, Scientific Reports.

[9]  Alexander Sczyrba,et al.  Bioboxes: standardised containers for interchangeable bioinformatics software , 2015, GigaScience.

[10]  William M. Rand,et al.  Objective Criteria for the Evaluation of Clustering Methods , 1971 .

[11]  Connor T. Skennerton,et al.  CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes , 2015, Genome research.

[12]  Dongwan D. Kang,et al.  MetaBAT, an efficient tool for accurately reconstructing single genomes from complex microbial communities , 2015, PeerJ.

[13]  Camille Roth,et al.  Natural Scales in Geographical Patterns , 2017, Scientific Reports.

[14]  Alexey A. Gurevich,et al.  MetaQUAST: evaluation of metagenome assemblies , 2016, Bioinform..

[15]  Blake A. Simmons,et al.  MaxBin 2.0: an automated binning algorithm to recover genomes from multiple metagenomic datasets , 2016, Bioinform..

[16]  Pierre Baldi,et al.  Assessing the accuracy of prediction algorithms for classification: an overview , 2000, Bioinform..

[17]  Donovan H. Parks,et al.  Recovery of nearly 8,000 metagenome-assembled genomes substantially expands the tree of life , 2017, Nature Microbiology.

[18]  M. Strous,et al.  The Binning of Metagenomic Contigs for Microbial Physiology of Mixed Cultures , 2012, Front. Microbio..

[19]  Richard Durbin,et al.  Sequence analysis Fast and accurate short read alignment with Burrows – Wheeler transform , 2009 .

[20]  Alexander J Probst,et al.  Recovery of genomes from metagenomes via a dereplication, aggregation and scoring strategy , 2017, Nature Microbiology.