Enabling the hypothesis-driven prioritization of ligand candidates in big databases: Screenlamp and its application to GPCR inhibitor discovery for invasive species control

Although screening vast molecular databases is often said to cover greater molecular diversity, only a few published studies demonstrate inhibitor discovery by screening more than a million compounds for features that mimic a known three-dimensional ligand. Two factors contribute: the general difficulty of discovering potent inhibitors, and the lack of free, user-friendly software for incorporating project-specific knowledge and user hypotheses into 3D ligand-based screening. The Screenlamp modular toolkit presented here was developed with these needs in mind. We show Screenlamp’s ability to screen more than 12 million commercially available molecules and identify potent in vivo inhibitors of a G protein-coupled bile acid receptor within the first year of a discovery project. This pheromone receptor governs sea lamprey reproductive behavior, and to our knowledge, this project is the first to establish the efficacy of computational screening in discovering lead compounds for aquatic invasive species control. Activity was significantly enhanced by selecting compounds according to one hypothesis: that matching two distal oxygen groups in the three-dimensional structure of the pheromone is crucial for activity. Six of the 15 most active compounds met these criteria. A second hypothesis, that the presence of an alkyl sulfate side chain results in high activity, identified another six compounds in the top 10, demonstrating the significant benefits of hypothesis-driven screening.
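The first hypothesis (matching two distal oxygen groups of the pheromone in 3D) can be illustrated with a minimal sketch. This is not Screenlamp's actual interface: the function name `matches_distal_oxygens`, the 1.3 Å tolerance, and all coordinates below are hypothetical, and the candidates are assumed to have already been overlaid onto the reference ligand.

```python
from math import dist

def matches_distal_oxygens(candidate_atoms, ref_oxygens, tol=1.3):
    """Return True if each reference oxygen has a candidate oxygen within `tol` angstroms.

    candidate_atoms: list of (element, (x, y, z)) for a candidate that is
        assumed to be already overlaid on the reference ligand.
    ref_oxygens: list of (x, y, z) positions of the reference oxygens to match.
    tol: distance cutoff in angstroms (an assumed value, not taken from the paper).
    """
    oxygens = [xyz for element, xyz in candidate_atoms if element == "O"]
    return all(
        any(dist(ref, xyz) <= tol for xyz in oxygens)
        for ref in ref_oxygens
    )

# Hypothetical reference: two oxygens at opposite ends of the ligand.
ref_oxygens = [(0.0, 0.0, 0.0), (12.0, 0.0, 0.0)]

# Hypothetical overlaid candidates.
candidates = {
    "mol_a": [("O", (0.4, 0.3, 0.0)), ("C", (6.0, 0.0, 0.0)), ("O", (11.5, 0.5, 0.2))],
    "mol_b": [("O", (0.2, 0.1, 0.0)), ("C", (6.0, 0.0, 0.0)), ("N", (12.0, 0.0, 0.0))],
}

# Keep only candidates that satisfy the hypothesis.
selected = [
    name for name, atoms in candidates.items()
    if matches_distal_oxygens(atoms, ref_oxygens)
]
print(selected)
```

In this toy data, `mol_b` is rejected because the atom nearest the second reference oxygen is a nitrogen. A filter of this kind would be applied after 3D overlay, so that only candidates consistent with the hypothesis are ranked further.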
