Multiplexed metagenome mining using short DNA sequence tags facilitates targeted discovery of epoxyketone proteasome inhibitors

Significance

Here we use an informatics-based approach to natural product discovery that is broadly applicable to the isolation of medicinally relevant metabolites from environmental microbiomes. Combining metagenome sequencing and bioinformatics approaches with a defined set of metagenomic tools provides a template for the targeted discovery of compounds from the global metagenome. The power of this approach is demonstrated by surveying ketosynthase domain amplicon sequencing data from 185 soil microbiomes for biosynthetic gene clusters encoding epoxyketone proteasome inhibitors, leading to the isolation and characterization of seven epoxyketone natural products, including compounds with unique warhead structures. We believe this approach is applicable to any conserved biosynthetic gene and provides a higher-throughput, cost-effective alternative to whole genome sequencing discovery methods.

In molecular evolutionary analyses, short DNA sequences are used to infer phylogenetic relationships among species. Here we apply this principle to the study of bacterial biosynthesis, enabling the targeted isolation of previously unidentified natural products directly from complex metagenomes. Our approach uses short natural product sequence tags derived from conserved biosynthetic motifs to profile biosynthetic diversity in the environment and then guide the recovery of gene clusters from metagenomic libraries. The methodology is conceptually simple, requires only a small investment in sequencing, and is not computationally demanding. To demonstrate the power of this approach to natural product discovery, we conducted a computational search for epoxyketone proteasome inhibitors within 185 globally distributed soil metagenomes. This led to the identification of 99 unique epoxyketone sequence tags, falling into 6 phylogenetically distinct clades. Complete gene clusters associated with nine unique tags were recovered from four saturating soil metagenomic libraries. Using heterologous expression methodologies, seven potent epoxyketone proteasome inhibitors (clarepoxcins A–E and landepoxcins A and B) were produced from these pathways, including compounds with different warhead structures and a naturally occurring halohydrin prodrug. This study provides a template for the targeted expansion of bacterially derived natural products using the global metagenome.
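
The tag-based survey described above can be illustrated with a toy sketch. This is not the authors' pipeline: the reference sequence, sample names, reads, the ungapped identity metric, and the 0.85 threshold are all hypothetical, chosen only to show the idea of flagging the metagenome samples whose amplicon reads resemble a reference biosynthetic tag.

```python
from collections import defaultdict


def percent_identity(a: str, b: str) -> float:
    """Fraction of matching positions over the shorter sequence.

    Toy, ungapped comparison; a real survey would use an alignment tool.
    """
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / n


def assign_tags(reads, references, threshold=0.85):
    """Map each sample to the reference tags that its reads resemble.

    reads: iterable of (sample_name, sequence) pairs.
    references: dict of tag name -> reference sequence.
    threshold: hypothetical identity cutoff, not taken from the paper.
    """
    hits = defaultdict(set)
    for sample, seq in reads:
        for name, ref in references.items():
            if percent_identity(seq, ref) >= threshold:
                hits[sample].add(name)
    return dict(hits)


# Hypothetical 20-nt reference tag and amplicon reads from three samples.
references = {"epoxyketone_KS": "ATGGCCGTTACCGGTATCGA"}
reads = [
    ("soil_01", "ATGGCCGTTACCGGTATCGA"),  # identical to the reference
    ("soil_02", "ATGGCCGTAACCGGTATCGT"),  # two mismatches
    ("soil_03", "TTTTTTTTTTTTTTTTTTTT"),  # unrelated
]

result = assign_tags(reads, references)
print(result)
```

In this sketch, samples that yield a hit would be the ones prioritized for library construction and gene cluster recovery; the clustering of tags into clades would require a separate phylogenetic step that is not shown here.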
