The MACADAM database: a MetAboliC pAthways DAtabase for Microbial taxonomic groups for mining potential metabolic capacities of archaeal and bacterial taxonomic groups

Abstract Progress in genome sequencing and bioinformatics opens up new possibilities, including correlating genome annotations with functional information such as metabolic pathways. Functional annotation databases now allow scientists to link genome annotations with functional annotations. Here we present the MetAboliC pAthways DAtabase for Microbial taxonomic groups (MACADAM), a user-friendly database that provides presence/absence/completeness statistics for metabolic pathways at a given microbial taxonomic position. For each prokaryotic ‘RefSeq complete genome’, MACADAM builds a pathway genome database (PGDB) using the Pathway Tools software and MetaCyc data; each PGDB includes metabolic pathways together with their associated metabolites, reactions and enzymes. To ensure the highest quality of genome functional annotation data, MACADAM also contains MicroCyc, a manually curated collection of PGDBs; Functional Annotation of Prokaryotic Taxa (FAPROTAX), a manually curated functional annotation database; and the IJSEM phenotypic database. The MACADAM database contains 13 509 PGDBs (13 195 bacterial and 314 archaeal) and 1260 unique metabolic pathways, supplemented with 82 functional annotations from FAPROTAX and 16 from the IJSEM phenotypic database. In total, MACADAM contains 7921 metabolites, 592 enzymatic reactions, 2134 EC numbers and 7440 enzymes. MACADAM can be queried at any rank of the NCBI taxonomy (from phylum to species), and the functional information can be explored together with the associated metabolites, enzymes, enzymatic reactions and EC numbers. A query returns a tabulated file listing the pathways present in the queried taxa, each with two scores (pathway score and pathway frequency score). The file also contains the names of the organisms in which the pathways are found and the metabolic hierarchy associated with the pathways. Finally, MACADAM can be downloaded as a single file and queried with SQLite or Python command lines, or explored through a web interface.
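
To illustrate what querying such a database at a taxonomic rank could look like, the following minimal Python sketch builds a toy in-memory SQLite database and aggregates pathway information for one genus. Everything in it is an assumption made for illustration: the table names (`organism`, `pathway_hit`), the column names, the sample rows and the function names are hypothetical and do not reflect the actual MACADAM schema or command-line tools. The abstract does not define the two scores, so the sketch simply reports the mean per-organism score and the fraction of organisms in the taxon that carry each pathway.

```python
import sqlite3


def build_toy_db():
    """Create a toy in-memory database (hypothetical schema, NOT MACADAM's)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE organism (id INTEGER PRIMARY KEY, name TEXT, genus TEXT);
        CREATE TABLE pathway_hit (organism_id INTEGER, pathway TEXT, score REAL);
        """
    )
    # Made-up organisms and pathway scores, for illustration only.
    conn.executemany(
        "INSERT INTO organism VALUES (?, ?, ?)",
        [
            (1, "Organism A", "GenusX"),
            (2, "Organism B", "GenusX"),
            (3, "Organism C", "GenusY"),
        ],
    )
    conn.executemany(
        "INSERT INTO pathway_hit VALUES (?, ?, ?)",
        [
            (1, "pathway-1", 1.0),
            (2, "pathway-1", 0.5),
            (1, "pathway-2", 0.8),
            (3, "pathway-1", 1.0),
        ],
    )
    return conn


def pathways_for_genus(conn, genus):
    """Return (pathway, mean score, fraction of organisms) for one genus.

    Both aggregates are assumptions of this sketch; they are not the
    published definitions of MACADAM's two scores.
    """
    total = conn.execute(
        "SELECT COUNT(*) FROM organism WHERE genus = ?", (genus,)
    ).fetchone()[0]
    rows = conn.execute(
        """
        SELECT h.pathway, AVG(h.score), COUNT(DISTINCT h.organism_id)
        FROM pathway_hit AS h
        JOIN organism AS o ON o.id = h.organism_id
        WHERE o.genus = ?
        GROUP BY h.pathway
        ORDER BY h.pathway
        """,
        (genus,),
    ).fetchall()
    # If the genus is unknown, rows is empty and no division takes place.
    return [(pathway, mean, n / total) for pathway, mean, n in rows]


if __name__ == "__main__":
    conn = build_toy_db()
    for pathway, mean_score, fraction in pathways_for_genus(conn, "GenusX"):
        # Tab-separated output, mimicking a tabulated result file.
        print(f"{pathway}\t{mean_score:.2f}\t{fraction:.2f}")
```

Keeping one row per organism and pathway lets the same query be repeated at another rank by filtering on a different taxonomy column. The downloadable MACADAM file should be inspected for its real table layout before adapting anything like this.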

[17]  George M. Garrity,et al.  The Archaea and the deeply branching and phototrophic bacteria , 2001 .

[18]  Evan Bolton,et al.  Database resources of the National Center for Biotechnology Information , 2017, Nucleic Acids Res..

[19]  K. Schleifer,et al.  Bergey's manual of systematic bacteriology Volume Four, The bacteroidetes, spirochaetes, tenericutes (mollicutes), acidobacteria, fibrobacteres, fusobacteria, dictyoglomi, gemmatimonadetes, lentisphaerae, verrucomicrobia, chlamudiae, and planctomycetes / , 2010 .

[20]  Tatiana A. Tatusova,et al.  RefSeq microbial genomes database: new representation and annotation strategy , 2013, Nucleic Acids Res..

[21]  Markus Krummenacker,et al.  The MetaCyc database of metabolic pathways and enzymes , 2017, Nucleic acids research.

[22]  P. Karp,et al.  Computational prediction of human metabolic pathways from the complete human genome , 2004, Genome Biology.

[23]  M. Doebeli,et al.  Decoupling function and taxonomy in the global ocean microbiome , 2016, Science.

[24]  D. Huson,et al.  SILVA, RDP, Greengenes, NCBI and OTT — how do these taxonomies compare? , 2017, BMC Genomics.

[25]  G. Kowalchuk,et al.  The Ecology of Acidobacteria: Moving beyond Genes and Genomes , 2016, Front. Microbiol..

[26]  Minoru Kanehisa,et al.  KEGG: new perspectives on genomes, pathways, diseases and drugs , 2016, Nucleic Acids Res..

[27]  Peter D. Karp,et al.  Pathway Tools version 13.0: integrated software for pathway/genome informatics and systems biology , 2015, Briefings Bioinform..

[28]  Gregory D. Schuler,et al.  Database resources of the National Center for Biotechnology Information: update , 2004, Nucleic acids research.

[29]  David L. Wheeler,et al.  GenBank , 2015, Nucleic Acids Res..

[30]  Scott Federhen,et al.  The NCBI Taxonomy database , 2011, Nucleic Acids Res..

[31]  G. Garrity Bergey’s Manual® of Systematic Bacteriology , 2012, Springer New York.

[32]  Lincoln D. Stein,et al.  Impact of outdated gene annotations on pathway enrichment analysis , 2016, Nature Methods.

[33]  David S. Wishart,et al.  HMDB 4.0: the human metabolome database for 2018 , 2017, Nucleic Acids Res..

[34]  Mahendra Mariadassou,et al.  FROGS: Find, Rapidly, OTUs with Galaxy Solution , 2018, Bioinform..

[35]  T. Hansen Bergey's Manual of Systematic Bacteriology , 2005 .

[36]  Suzanne M. Paley,et al.  The BioCyc collection of microbial genomes and metabolic pathways , 2019, Briefings Bioinform..

[37]  N. Fierer,et al.  Hiding in Plain Sight: Mining Bacterial Species Records for Phenotypic Trait Information , 2017, mSphere.