BiG-FAM: the biosynthetic gene cluster families database

Abstract

Computational analysis of biosynthetic gene clusters (BGCs) has revolutionized natural product discovery by enabling the rapid investigation of secondary metabolic potential within microbial genome sequences. Grouping homologous BGCs into Gene Cluster Families (GCFs) facilitates mapping their architectural and taxonomic diversity and provides insights into the novelty of putative BGCs through dereplication against BGCs of known function. While multiple databases exist for exploring BGCs from publicly available data, no public resource focuses on GCF relationships. Here, we present BiG-FAM, a database of 29,955 GCFs capturing the global diversity of 1,225,071 BGCs predicted from 209,206 publicly available microbial genomes and metagenome-assembled genomes (MAGs). The database offers rich functionality, including multi-criterion GCF searches, direct links to BGC databases such as antiSMASH-DB, and rapid GCF annotation of user-supplied BGCs from antiSMASH results. BiG-FAM can be accessed online at https://bigfam.bioinformatics.nl.
