WebCARMA: a web application for the functional and taxonomic classification of unassembled metagenomic reads

Background: Metagenomics is a new field of research on natural microbial communities. High-throughput sequencing techniques such as 454 or Solexa-Illumina open up new possibilities because they produce huge amounts of data in much less time and with less effort and cost than the traditional Sanger technique. However, the data they produce come in shorter reads than Sanger sequencing delivers (35-100 base pairs with Illumina, 100-500 base pairs with 454 sequencing). CARMA is a new software pipeline for characterising the species composition and the genetic potential of microbial samples using short, unassembled reads.

Results: In this paper, we introduce WebCARMA, a refined version of CARMA available as a web application for the taxonomic and functional classification of unassembled (ultra-)short reads from metagenomic communities. In addition, we have analysed the applicability of ultra-short reads in metagenomics.

Conclusions: We show that unassembled reads as short as 35 bp can be used for the taxonomic classification of a metagenome. The web application is freely available at http://webcarma.cebitec.uni-bielefeld.de.
