SeMPI: a genome-based secondary metabolite prediction and identification web server

Abstract The secondary metabolism of bacteria, fungi and plants yields a vast number of bioactive substances. The constantly increasing amount of published genomic data makes it possible to identify biosynthetic gene clusters efficiently by genome mining. Conversely, for many natural products with resolved structures, the encoding gene clusters have not yet been identified. Even though genome mining tools have become significantly more efficient at identifying biosynthetic gene clusters, structural elucidation of the actual secondary metabolite remains challenging, especially because tailoring modifications cannot yet be predicted. Here, we introduce SeMPI, a web server providing a prediction and identification pipeline for natural products synthesized by modular type I polyketide synthases (PKS). To limit the possible structures of PKS products and to include putative tailoring reactions, a structural comparison with annotated natural products was introduced. Furthermore, a benchmark was designed based on 40 gene clusters with annotated PKS products. The SeMPI web server is freely available at: http://www.pharmaceutical-bioinformatics.de/sempi.
