Metagenomics and CAZyme Discovery.

Microorganisms play a primary role in regulating biogeochemical cycles and are a valuable source of enzymes with biotechnological applications, such as carbohydrate-active enzymes (CAZymes). However, most microorganisms in natural ecosystems cannot be cultured with common culture-dependent techniques, which restricts access to potentially novel cellulolytic bacteria and beneficial enzymes. Molecular, culture-independent methods such as metagenomics enable researchers to study microbial communities directly from environmental samples and provide a platform from which enzymes of interest can be sourced. We outline the key methodological stages required and describe specific protocols currently used in metagenomic projects dedicated to CAZyme discovery.
