antiSMASH 4.0—improvements in chemistry prediction and gene cluster boundary identification

Abstract Many antibiotics, chemotherapeutics, crop protection agents and food preservatives originate from molecules produced by bacteria, fungi or plants. In recent years, genome mining methodologies have been widely adopted to identify and characterize the biosynthetic gene clusters encoding the production of such compounds. Since 2011, the ‘antibiotics and secondary metabolite analysis shell’ (antiSMASH) has helped researchers perform this task efficiently, both as a web server and as a standalone tool. Here, we present the thoroughly updated antiSMASH version 4, which adds several novel features, including prediction of gene cluster boundaries using the ClusterFinder method or the newly integrated CASSIS algorithm; improved substrate specificity prediction for non-ribosomal peptide synthetase adenylation domains based on the new SANDPUMA algorithm; improved predictions for terpene and ribosomally synthesized and post-translationally modified peptide (RiPP) cluster products; per-protein reporting of sequence similarity to proteins encoded in experimentally characterized gene clusters; and a domain-level alignment tool for comparative analysis of trans-AT polyketide synthase assembly line architectures. Additionally, several usability features have been updated and improved. Together, these improvements bring antiSMASH up to date with the latest developments in natural product research and will further facilitate computational genome mining for the discovery of novel bioactive molecules.
