Automated Genome Mining of Ribosomal Peptide Natural Products

Ribosomally synthesized and posttranslationally modified peptides (RiPPs), especially those from microbial sources, are a large group of bioactive natural products and a promising source of new (bio)chemistry and bioactivity.1 As microbial genome databases grow exponentially and mass spectrometry (MS)-based metabolomic platforms improve, there is a need for computational tools that connect natural product genotypes predicted from microbial genome sequences with their corresponding chemotypes in metabolomic data sets. Here, we introduce RiPPquest, a tandem mass spectrometry database search tool for the identification of microbial RiPPs, and apply it to lanthipeptide discovery. RiPPquest uses genomics to limit the search space to the vicinity of RiPP biosynthetic genes, and proteomics to analyze extensive peptide modifications and compute p-values of peptide-spectrum matches (PSMs). We highlight RiPPquest by connecting multiple RiPPs from extracts of Streptomyces to their gene clusters and by discovering a new class III lanthipeptide from Streptomyces viridochromogenes DSM 40736. We name it informatipeptin to reflect that it is a natural product discovered by mass spectrometry-based genome mining using algorithmic tools rather than by manual inspection of mass spectrometry data and genetic information. The tool is available at cyclo.ucsd.edu.
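To illustrate what analyzing peptide modifications involves at the simplest level, the following minimal sketch enumerates the precursor masses a candidate lanthipeptide core peptide could have after 0 to n dehydrations of its Ser/Thr residues (each dehydration removes one water; the later thioether ring formation with Cys does not change the mass) and checks which dehydration counts agree with an observed mass. This is not the RiPPquest implementation: the function names, the example sequence, and the 0.02 Da tolerance are assumptions made for illustration, and the actual tool scores MS/MS fragment spectra and computes PSM p-values, which this sketch does not attempt.

```python
# Illustrative sketch only; names and tolerance are hypothetical, not RiPPquest's API.

WATER = 18.010565  # monoisotopic mass of H2O in Da

# Standard monoisotopic residue masses (Da) of the 20 proteinogenic amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def peptide_mass(seq):
    """Monoisotopic mass of the unmodified linear peptide (residues + one water)."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def dehydration_variants(seq):
    """Return (number of dehydrations, mass) for 0..n dehydrated Ser/Thr residues."""
    base = peptide_mass(seq)
    n_sites = sum(seq.count(aa) for aa in "ST")
    return [(k, base - k * WATER) for k in range(n_sites + 1)]


def match_precursor(seq, observed_mass, tol_da=0.02):
    """Dehydration counts whose predicted mass lies within tol_da of the observed mass."""
    return [k for k, mass in dehydration_variants(seq)
            if abs(mass - observed_mass) <= tol_da]


if __name__ == "__main__":
    candidate = "SATC"  # hypothetical core peptide, not taken from the paper
    observed = peptide_mass(candidate) - 2 * WATER  # simulate a doubly dehydrated product
    print(match_precursor(candidate, observed))
```

In a genome mining setting, a filter like this would only be applied to candidate peptides encoded near predicted RiPP biosynthetic genes, which is what keeps the number of sequence and modification combinations manageable.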

[1]  G. Sheldrick, et al. Labyrinthopeptins: a new class of carbacyclic lantibiotics, 2010, Angewandte Chemie.

[2]  Ronald J Moore, et al. Fully automated four-column capillary LC-MS system for maximizing throughput in proteomic analyses, 2008, Analytical chemistry.

[3]  Paul D. Cotter, et al. Identification of a Novel Two-Peptide Lantibiotic, Lichenicidin, following Rational Genome Mining for LanM Proteins, 2009, Applied and Environmental Microbiology.

[4]  Pavel A. Pevzner, et al. Mutation-tolerant protein identification by mass-spectrometry, 2000, RECOMB '00.

[5]  Pavel A. Pevzner, et al. A new approach to evaluating statistical significance of spectral identifications, 2013, Journal of proteome research.

[6]  Wu-chun Feng, et al. Missing genes in the annotation of prokaryotic genomes, 2010, BMC Bioinformatics.

[7]  Ruedi Aebersold, et al. The pros and cons of peptide-centric proteomics, 2010, Nature Biotechnology.

[8]  Pavel A. Pevzner, et al. Protein identification by spectral networks analysis, 2007, Proceedings of the National Academy of Sciences.

[9]  P. Pevzner, et al. Spectral probabilities and generating functions of tandem mass spectra: a strike against decoy databases, 2008, Journal of proteome research.

[10]  Neil L. Kelleher, et al. Discovery and in vitro biosynthesis of haloduracin, a two-component lantibiotic, 2006, Proceedings of the National Academy of Sciences.

[11]  R. Süssmuth, et al. Characterization of New Class III Lantibiotics—Erythreapeptin, Avermipeptin and Griseopeptin from Saccharopolyspora erythraea, Streptomyces avermitilis and Streptomyces griseus Demonstrates Stepwise N‐Terminal Leader Processing, 2012, Chembiochem : a European journal of chemical biology.

[12]  Dekel Tsur, et al. Identification of post-translational modifications by blind search of mass spectra, 2005, Nature Biotechnology.

[13]  C. Hertweck, et al. Genomics-inspired discovery of natural products, 2011, Current opinion in chemical biology.

[14]  Ronald J Moore, et al. Chemically etched open tubular and monolithic emitters for nanoelectrospray ionization mass spectrometry, 2006, Analytical chemistry.

[15]  M. Hudson, et al. The SapB morphogen is a lantibiotic-like peptide derived from the product of the developmental gene ramS in Streptomyces coelicolor, 2004, Proceedings of the National Academy of Sciences of the United States of America.

[16]  Hosein Mohimani, et al. Cycloquest: identification of cyclopeptides via database search of their mass spectra against genome databases, 2011, Journal of proteome research.

[17]  P. Pevzner, et al. Interpreting top-down mass spectra using spectral alignment, 2008, Analytical chemistry.

[18]  Teruhiko Beppu, et al. AmfS, an Extracellular Peptidic Morphogen in Streptomyces griseus, 2002, Journal of bacteriology.

[19]  P. Pevzner, et al. PepNovo: de novo peptide sequencing via probabilistic network modeling, 2005, Analytical chemistry.

[20]  Nuno Bandeira, et al. MS/MS networking guided analysis of molecule and gene cluster families, 2013, Proceedings of the National Academy of Sciences.

[21]  D. N. Perkins, et al. Probability‐based protein identification by searching sequence databases using mass spectrometry data, 1999, Electrophoresis.

[22]  J. Yates, et al. An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database, 1994, Journal of the American Society for Mass Spectrometry.

[23]  P. G. Arnison, et al. Ribosomally synthesized and post-translationally modified peptide natural products: overview and recommendations for a universal nomenclature, 2013, Natural product reports.

[24]  Oscar P. Kuipers, et al. BAGEL3: automated identification of genes encoding bacteriocins and (non-)bactericidal posttranslationally modified peptides, 2013, Nucleic Acids Res.

[25]  Scott A. McLuckey, et al. The American Society for Mass Spectrometry, 1996.

[26]  R. Süssmuth, et al. Involvement and unusual substrate specificity of a prolyl oligopeptidase in class III lanthipeptide maturation, 2013, Journal of the American Chemical Society.

[27]  Kai Blin, et al. antiSMASH: rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences, 2011, Nucleic Acids Res.

[28]  Pieter C. Dorrestein, et al. A mass spectrometry-guided genome mining approach for natural product peptidogenomics, 2011, Nature chemical biology.

[29]  G. Challis, et al. Strategies for the Discovery of New Natural Products by Genome Mining, 2009, Chembiochem : a European journal of chemical biology.

[30]  Nuno Bandeira, et al. Mass spectral molecular networking of living microbial colonies, 2012, Proceedings of the National Academy of Sciences.

[31]  Chris L. Tang, et al. Efficiency of database search for identification of mutated and modified proteins via mass spectrometry, 2001, Genome research.

[32]  W. A. van der Donk, et al. Genome mining for ribosomally synthesized natural products, 2011, Current opinion in chemical biology.

[33]  J. Willey, et al. Lantibiotics: peptides of diverse structure and function, 2007, Annual review of microbiology.