RRE-Finder: a Genome-Mining Tool for Class-Independent RiPP Discovery

ABSTRACT Many classes of ribosomally synthesized and posttranslationally modified peptides (RiPPs) rely on a domain called the RiPP recognition element (RRE). The RRE binds specifically to a precursor peptide and directs the posttranslational modification enzymes to their substrates. Given its prevalence across various types of RiPP biosynthetic gene clusters (BGCs), the RRE could theoretically be used as a bioinformatic handle to identify novel classes of RiPPs. In addition, due to the high affinity and specificity of most RRE-precursor peptide complexes, a thorough understanding of the RRE domain could be exploited for biotechnological applications. However, sequence divergence of RREs across RiPP classes has precluded automated identification based solely on sequence similarity. Here, we introduce RRE-Finder, a new tool for identifying RRE domains with high sensitivity. RRE-Finder can be used in precision mode to confidently identify RREs in a class-specific manner or in exploratory mode to assist in the discovery of novel RiPP classes. RRE-Finder operating in precision mode on the UniProtKB protein database retrieved ∼25,000 high-confidence RREs spanning all characterized RRE-dependent RiPP classes, as well as several yet-uncharacterized RiPP classes that require future experimental confirmation. Finally, RRE-Finder was used in precision mode to explore a possible evolutionary origin of the RRE domain. The results suggest RREs originated from a co-opted DNA-binding transcriptional regulator domain. Altogether, RRE-Finder provides a powerful new method to probe RiPP biosynthetic diversity and delivers a rich data set of RRE sequences that will provide a foundation for deeper biochemical studies into this intriguing and versatile protein domain.

IMPORTANCE Bioinformatics-powered discovery of novel ribosomal natural products (RiPPs) has historically been hindered by the lack of a common genetic feature across RiPP classes. Herein, we introduce RRE-Finder, a method for identifying RRE domains, which are present in a majority of prokaryotic RiPP biosynthetic gene clusters (BGCs). RRE-Finder identifies RRE domains 3,000 times faster than current methods, which rely on time-consuming secondary structure prediction. Depending on user goals, RRE-Finder can operate in precision mode to accurately identify RREs present in known RiPP classes or in exploratory mode to assist with novel RiPP discovery. Employing RRE-Finder on the UniProtKB database revealed several high-confidence RREs in novel RiPP-like clusters, suggesting that many new RiPP classes remain to be discovered.
