Automated genome mining for natural products

Background: Discovery of new medicinal agents from natural sources has largely been an adventitious process based on screening of plant and microbial extracts combined with bioassay-guided identification and natural product structure elucidation. Increasingly rapid and more cost-effective genome sequencing technologies coupled with advanced computational power have converged to transform this trend toward a more rational and predictive pursuit.

Results: We have developed a rapid method of scanning genome sequences for multiple polyketide, nonribosomal peptide, and mixed combination natural products with output in a text format that can be readily converted to two- and three-dimensional structures using conventional software. Our open-source and web-based program can assemble various small molecules composed of twenty standard amino acids and twenty-two other chain-elongation intermediates used in nonribosomal peptide systems, and four acyl-CoA extender units incorporated into polyketides by reading a hidden Markov model of DNA. This process evaluates and selects the substrate specificities along the assembly line of nonribosomal peptide synthetases and modular polyketide synthases.

Conclusion: Using this approach we have predicted the structures of natural products from a diverse range of bacteria based on a limited number of signature sequences. In accelerating direct DNA to metabolomic analysis, this method bridges the interface between chemists and biologists and enables rapid scanning for compounds with potential therapeutic value.
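The assembly step described in the Results can be illustrated with a minimal sketch. It is not the authors' program. It assumes the text output is a SMILES-like string, that each module's substrate has already been predicted, and that the product is linear and unmodified. The monomer table, the short names, and the function `assemble_linear_smiles` are hypothetical, and stereochemistry, ketoreduction, and cyclization are ignored.

```python
# Hypothetical monomer table (an assumption, not taken from the paper).
# Each fragment is written from the atom that attacks the upstream
# carbonyl through to the monomer's own carbonyl, so fragments can be
# concatenated in assembly-line order.
MONOMER_FRAGMENTS = {
    "gly": "NCC(=O)",         # glycine
    "ala": "NC(C)C(=O)",      # alanine (stereochemistry omitted)
    "ser": "NC(CO)C(=O)",     # serine
    "val": "NC(C(C)C)C(=O)",  # valine
    "mal": "CC(=O)",          # malonyl-CoA extender, keto group left unreduced
    "mmal": "C(C)C(=O)",      # methylmalonyl-CoA extender
}


def assemble_linear_smiles(monomers):
    """Join predicted monomers, in module order, into one linear SMILES.

    The chain is released as a free acid, so a terminal hydroxyl oxygen
    is appended to the last carbonyl.
    """
    if not monomers:
        raise ValueError("at least one monomer is required")
    parts = []
    for name in monomers:
        if name not in MONOMER_FRAGMENTS:
            raise ValueError(f"unknown monomer: {name!r}")
        parts.append(MONOMER_FRAGMENTS[name])
    return "".join(parts) + "O"


if __name__ == "__main__":
    # A dipeptide, then a mixed peptide/polyketide chain.
    print(assemble_linear_smiles(["gly", "ala"]))
    print(assemble_linear_smiles(["ala", "mal", "mmal"]))
```

A string built this way can be passed to conventional cheminformatics software to generate two- or three-dimensional coordinates. A real predictor would also have to account for tailoring domains and release mechanisms, which this sketch leaves out.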
