Spherical: an iterative workflow for assembling metagenomic datasets

The consensus emerging from microbiome studies is that microbiomes are far more complex than previously thought and require deep sequencing to characterise. Although deeply sequenced datasets provide greater coverage than earlier datasets, incorporating a high proportion of reads into the assembly remains a challenge. To tackle this issue, we set out to determine whether multiple iterations of assembly would allow otherwise lost contigs to be formed and studied and, if so, how much this approach improves on current methodology. A simulated metagenomic dataset was first used to determine whether multiple iterations of assembly produce usable contigs or mis-assembled artefacts. Once we had confirmed that the secondary iterations produced accurate contigs without a reduction in contig quality, we applied this methodology, in the form of Spherical, to three metagenomic studies. The additional contigs produced by Spherical increased the number of reads aligning to an identified gene by 11–109% compared with the assembly from the initial iteration alone. As the size of the dataset increased, so did the amount of data that multiple iterations were able to add.

Availability: Spherical is implemented in Python 2.7 and available under an MIT licence at: https://github.com/thh32/Spherical
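The idea of assembling in multiple iterations can be illustrated with a minimal sketch. It is not Spherical's actual implementation: it assumes that each iteration re-assembles only the reads that failed to align to the contigs produced so far, and every name in it (`iterative_assembly`, `toy_assemble`, `toy_align`, `min_aligned_fraction`) is hypothetical. In practice the two callables would wrap a real assembler and a real read aligner; here they are toy stand-ins so the loop can be run as written.

```python
from collections import Counter
from typing import Callable, List, Set, Tuple

# Hypothetical interfaces: an assembler maps reads to contigs, and an aligner
# returns the indices of the reads that align to at least one contig.
Assembler = Callable[[List[str]], List[str]]
Aligner = Callable[[List[str], List[str]], Set[int]]


def iterative_assembly(
    reads: List[str],
    assemble: Assembler,
    align: Aligner,
    max_iterations: int = 5,
    min_aligned_fraction: float = 0.9,
) -> Tuple[List[str], List[str]]:
    """Re-assemble unaligned reads until enough reads are incorporated.

    Assumption: the stopping rule (iteration cap or aligned fraction) is
    illustrative and not taken from the Spherical source code.
    """
    contigs: List[str] = []
    remaining = list(reads)
    total = len(remaining)
    for _ in range(max_iterations):
        if not remaining:
            break
        new_contigs = assemble(remaining)
        if not new_contigs:
            break  # the assembler cannot build anything more
        contigs.extend(new_contigs)
        aligned = align(remaining, new_contigs)
        # Keep only the reads that did not align; they feed the next iteration.
        remaining = [r for i, r in enumerate(remaining) if i not in aligned]
        if 1 - len(remaining) / total >= min_aligned_fraction:
            break
    return contigs, remaining


def toy_assemble(reads: List[str]) -> List[str]:
    """Toy stand-in: only the most abundant sequence yields a contig."""
    return [Counter(reads).most_common(1)[0][0]] if reads else []


def toy_align(reads: List[str], contigs: List[str]) -> Set[int]:
    """Toy stand-in: a read aligns if it is a substring of some contig."""
    return {i for i, r in enumerate(reads) if any(r in c for c in contigs)}


if __name__ == "__main__":
    sample = ["AAAA"] * 3 + ["CCCC"] * 2 + ["GGGG"]
    found, left = iterative_assembly(
        sample, toy_assemble, toy_align, max_iterations=2, min_aligned_fraction=1.0
    )
    print(found, left)
```

In this toy setting, a single iteration recovers a contig only for the dominant sequence, while a second iteration recovers the next most abundant one from the leftover reads, which mirrors the kind of gain the abstract attributes to additional iterations.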
