dbCAN: a web resource for automated carbohydrate-active enzyme annotation

Carbohydrate-active enzymes (CAZymes) are important to the biotechnology industry, particularly the emerging biofuel industry, because they are responsible for the synthesis, degradation and modification of all carbohydrates on Earth. We have developed a web resource, dbCAN (http://csbl.bmb.uga.edu/dbCAN/annotate.php), that provides automated, signature domain-based CAZyme annotation for any protein data set (e.g. proteins from a newly sequenced genome) submitted to our server. To accomplish this, we have explicitly defined a signature domain for every CAZyme family, derived from CDD (Conserved Domain Database) searches and literature curation. We have also constructed a hidden Markov model (HMM) to represent the signature domain of each CAZyme family. These CAZyme family-specific HMMs are our key contribution and the foundation of the automated CAZyme annotation.
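To illustrate how family-specific HMMs can drive annotation, here is a minimal sketch of the post-processing step. It assumes the protein set has already been searched against the family HMMs with HMMER's hmmscan and the per-domain table (--domtblout) was saved. The abstract does not name the search tool or any cutoffs, so the use of HMMER, the function names, the E-value and coverage thresholds, and the example hits are all illustrative assumptions, not the dbCAN server's actual settings or results.

```python
from collections import defaultdict
from typing import NamedTuple


class DomainHit(NamedTuple):
    family: str          # name of the family HMM (hmmscan target)
    protein: str         # query protein identifier
    evalue: float        # independent (per-domain) E-value
    hmm_coverage: float  # fraction of the HMM covered by the alignment
    ali_from: int        # alignment start on the protein


def parse_domtblout(text):
    """Parse hmmscan --domtblout text into a list of DomainHit records."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        f = line.split()
        if len(f) < 22:
            continue  # not a complete per-domain record
        hmm_len = int(f[2])
        hmm_from, hmm_to = int(f[15]), int(f[16])
        hits.append(DomainHit(
            family=f[0],
            protein=f[3],
            evalue=float(f[12]),
            hmm_coverage=(hmm_to - hmm_from + 1) / hmm_len,
            ali_from=int(f[17]),
        ))
    return hits


def annotate(hits, max_evalue=1e-5, min_coverage=0.3):
    """Map each protein to its CAZyme families, ordered along the sequence.

    The two thresholds are hypothetical defaults chosen for this sketch.
    """
    families = defaultdict(list)
    for h in sorted(hits, key=lambda h: (h.protein, h.ali_from)):
        if h.evalue <= max_evalue and h.hmm_coverage >= min_coverage:
            families[h.protein].append(h.family)
    return dict(families)


# Made-up hits for demonstration only (not real search results).
EXAMPLE = """\
# target acc tlen query acc qlen E-value score bias # of c-Evalue i-Evalue score bias hmm_from hmm_to ali_from ali_to env_from env_to acc description
GH5  - 300 protA - 500 1e-50 170.0 0.1 1 1 1e-52 1e-50 169.5 0.1 10 290 40 330 35 335 0.95 -
CBM1 - 30  protA - 500 1e-8  30.0  0.0 1 1 1e-10 1e-8  29.0  0.0 1  29  450 480 449 481 0.90 -
GT2  - 200 protB - 400 0.5   5.0   0.0 1 1 0.3   0.5   4.0   0.0 1  50  10  60  9   61  0.70 -
"""

annotations = annotate(parse_domtblout(EXAMPLE))
```

In this sketch a protein may receive several family labels, which reflects the modular, multi-domain nature of many CAZymes; hits that fail either threshold are simply dropped.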
