Output ordering and prioritisation system (OOPS): ranking biosynthetic gene clusters to enhance bioactive metabolite discovery

The rapid increase in publicly available microbial genome sequences has revealed hundreds of thousands of biosynthetic gene clusters (BGCs) encoding valuable secondary metabolites. Experimental characterisation of new BGCs is extremely laborious and struggles to keep pace with the in silico identification of potential BGCs, so prioritising promising candidates among computationally predicted BGCs is a pressing need. Here, we propose an output ordering and prioritisation system (OOPS), which helps sort identified BGCs by a wide variety of custom-weighted biological and biochemical criteria through a flexible and user-friendly interface. OOPS supports judicious prioritisation of BGCs using G+C content, coding sequence length, gene number, cluster self-similarity and codon bias parameters, and enables the user to rank BGCs by BGC type, novelty, and taxonomic distribution. Effective prioritisation of BGCs will help to reduce experimental attrition rates and broaden the range of bioactive metabolites characterised.
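To make the idea of custom-weighted ranking concrete, the following is a minimal Python sketch of one way such an ordering could work. It is not the OOPS implementation: the `BGC` record, its field names, the `rank_bgcs` function, the min-max normalisation, and the example values are all assumptions introduced for illustration. Each numeric criterion is rescaled to the range 0 to 1 across the candidate set, multiplied by a user-chosen weight (a negative weight favours low values), and the weighted sum is used to sort the clusters.

```python
from dataclasses import dataclass


@dataclass
class BGC:
    """Hypothetical record for one predicted cluster (illustrative fields only)."""
    name: str
    gc_content: float  # G+C fraction, 0 to 1
    gene_count: int    # number of genes in the cluster
    cds_length: int    # total coding sequence length in bp


def rank_bgcs(bgcs, weights):
    """Sort BGCs by a weighted sum of min-max normalised features.

    `weights` maps a BGC attribute name to a weight. Positive weights
    favour high values of that attribute; negative weights favour low ones.
    """
    if not bgcs:
        return []
    # Per-feature minimum and maximum across all candidates.
    lo = {f: min(getattr(b, f) for b in bgcs) for f in weights}
    hi = {f: max(getattr(b, f) for b in bgcs) for f in weights}

    def score(bgc):
        total = 0.0
        for feature, weight in weights.items():
            span = hi[feature] - lo[feature]
            # A feature with no spread cannot discriminate, so it adds nothing.
            norm = (getattr(bgc, feature) - lo[feature]) / span if span else 0.0
            total += weight * norm
        return total

    return sorted(bgcs, key=score, reverse=True)


# Made-up example values, not data from the paper.
clusters = [
    BGC("A", gc_content=0.72, gene_count=30, cds_length=45000),
    BGC("B", gc_content=0.55, gene_count=12, cds_length=15000),
    BGC("C", gc_content=0.65, gene_count=20, cds_length=60000),
]

by_size = rank_bgcs(clusters, {"gene_count": 1.0, "cds_length": 2.0})
print([b.name for b in by_size])
```

Changing the weights changes the ordering without touching the data, which is the property the abstract describes: for example, `{"gc_content": -1.0}` would place the cluster with the lowest G+C content first. Categorical criteria such as BGC type or taxonomic distribution would need their own scoring scheme and are left out of this sketch.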
