An open-source drug discovery platform enables ultra-large virtual screens

On average, an approved drug currently costs US$2–3 billion and takes more than 10 years to develop¹. In part, this is due to expensive and time-consuming wet-laboratory experiments, poor initial hit compounds and the high attrition rates in the (pre-)clinical phases. Structure-based virtual screening has the potential to mitigate these problems. With structure-based virtual screening, the quality of the hits improves with the number of compounds screened². However, although large databases of compounds exist, carrying out large-scale structure-based virtual screening on computer clusters in an accessible, efficient and flexible manner has remained difficult. Here we describe VirtualFlow, a highly automated and versatile open-source platform with perfect scaling behaviour that is able to prepare and efficiently screen ultra-large libraries of compounds. VirtualFlow is able to use a variety of the most powerful docking programs. Using VirtualFlow, we prepared one of the largest and freely available ready-to-dock ligand libraries, with more than 1.4 billion commercially available molecules. To demonstrate the power of VirtualFlow, we screened more than 1 billion compounds and identified a set of structurally diverse molecules that bind to KEAP1 with submicromolar affinity. One of the lead inhibitors (iKeap1) engages KEAP1 with nanomolar affinity (dissociation constant (Kd) = 114 nM) and disrupts the interaction between KEAP1 and the transcription factor NRF2. This illustrates the potential of VirtualFlow to access vast regions of the chemical space and identify molecules that bind with high affinity to target proteins. VirtualFlow, an open-source drug discovery platform, enables the efficient preparation and virtual screening of ultra-large ligand libraries to identify molecules that bind with high affinity to target proteins.
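To put the reported affinity on the energy scale that docking programs typically report (kcal/mol), the dissociation constant can be converted to a standard binding free energy with the textbook relation ΔG° = RT ln(Kd / 1 M). The following is a minimal sketch of that arithmetic, not part of VirtualFlow; the function name `kd_to_delta_g` is hypothetical, and the temperature of 298.15 K is an assumption, since the abstract does not state the assay temperature.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL_PER_MOL_K = 1.987204e-3


def kd_to_delta_g(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Return the standard binding free energy in kcal/mol.

    kd_molar is the dissociation constant in mol/L (1 M standard state).
    temperature_k defaults to 298.15 K, which is an assumption here.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar)


if __name__ == "__main__":
    # Kd = 114 nM, the value reported for iKeap1 in the abstract.
    print(f"{kd_to_delta_g(114e-9):.2f} kcal/mol")
```

Under these assumptions, 114 nM corresponds to roughly -9.5 kcal/mol. Docking scores are empirical estimates, so this number is only a point of reference for the scale, not a value a docking program is expected to reproduce.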

[1]  J. Reymond.  The chemical space project, 2015, Accounts of chemical research.

[2]  Suman Sirimulla, et al.  AutoDock VinaXB: implementation of XBSF, new empirical halogen bond scoring function, into AutoDock Vina, 2016, Journal of Cheminformatics.

[3]  John J. Irwin, et al.  ZINC 15 – Ligand Discovery for Everyone, 2015, J. Chem. Inf. Model.

[4]  J. Baell, et al.  New substructure filters for removal of pan assay interference compounds (PAINS) from screening libraries and for their exclusion in bioassays, 2010, Journal of medicinal chemistry.

[5]  E. Hulme, et al.  Receptor-ligand interactions: a practical approach, 1992.

[6]  Magnus Björsne, et al.  Label-Free Primary Screening and Affinity Ranking of Fragment Libraries Using Parallel Analysis of Protein Panels, 2008, Journal of biomolecular screening.

[7]  J Willem M Nissink, et al.  Seven Year Itch: Pan-Assay Interference Compounds (PAINS) in 2017—Utility and Limitations, 2017, ACS chemical biology.

[8]  David Ryan Koes, et al.  Protein-Ligand Scoring with Convolutional Neural Networks, 2016, Journal of chemical information and modeling.

[9]  I. Ayala, et al.  Stereospecific isotopic labeling of methyl groups for NMR spectroscopic studies of high-molecular-weight proteins, 2010, Angewandte Chemie.

[10]  P. Bonneau, et al.  Compound aggregation in drug discovery: implementing a practical NMR assay for medicinal chemists, 2013, Journal of medicinal chemistry.

[11]  Chee Keong Kwoh, et al.  Fast, accurate, and reliable molecular docking with QuickVina 2, 2015, Bioinform.

[12]  H. Willems, et al.  Monoacidic Inhibitors of the Kelch-like ECH-Associated Protein 1: Nuclear Factor Erythroid 2-Related Factor 2 (KEAP1:NRF2) Protein-Protein Interaction with High Cell Potency Identified by Fragment-Based Discovery, 2016, Journal of medicinal chemistry.

[13]  Chris Morley, et al.  Open Babel: An open chemical toolbox, 2011, J. Cheminformatics.

[14]  David S. Goodsell, et al.  AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility, 2009, J. Comput. Chem.

[15]  Joachim Kraemer, et al.  Small molecules inhibit the interaction of Nrf2 and the Keap1 Kelch domain through a non-covalent mechanism, 2013, Bioorganic & medicinal chemistry.

[16]  Jaques Reifman, et al.  DOVIS: an implementation for high-throughput virtual screening using AutoDock, 2008, BMC Bioinformatics.

[17]  Douglas R. Houston, et al.  Consensus Docking: Improving the Reliability of Docking in a Virtual Screening Context, 2013, J. Chem. Inf. Model.

[18]  Rommie E. Amaro, et al.  Ensemble Docking in Drug Discovery, 2018, Biophysical journal.

[19]  David S. Goodsell, et al.  AutoDockFR: Advances in Protein-Ligand Docking with Explicitly Specified Binding Site Flexibility, 2015, PLoS Comput. Biol.

[20]  David Ryan Koes, et al.  Lessons Learned in Empirical Scoring with smina from the CSAR 2011 Benchmarking Exercise, 2013, J. Chem. Inf. Model.

[21]  Hong Nie, et al.  Characterization of the Potent, Selective Nrf2 Activator, 3-(Pyridin-3-Ylsulfonyl)-5-(Trifluoromethyl)-2H-Chromen-2-One, in Cellular and In Vivo Models of Pulmonary Oxidative Stress, 2017, The Journal of Pharmacology and Experimental Therapeutics.

[22]  Chee-Keong Kwoh, et al.  Protein-Ligand Blind Docking Using QuickVina-W With Inter-Process Spatio-Temporal Integration, 2017, Scientific Reports.

[23]  D. Siderovski, et al.  High-affinity immobilization of proteins using biotin- and GST-based coupling strategies, 2010, Methods in molecular biology.

[24]  Arthur J. Olson, et al.  AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading, 2009, J. Comput. Chem.

[25]  R. W. Hansen, et al.  Journal of Health Economics, 2022.

[26]  Jaques Reifman, et al.  DOVIS 2.0: an efficient and easy to use parallel virtual screening tool based on AutoDock 4.0, 2008, Chemistry Central journal.

[27]  R. Woods, et al.  Vina-Carb: Improving Glycosidic Angles during Carbohydrate Docking, 2016, Journal of chemical theory and computation.

[28]  Alexander Tropsha, et al.  Phantom PAINS: Problems with the Utility of Alerts for Pan-Assay INterference CompoundS, 2017, J. Chem. Inf. Model.

[29]  Yurii S. Moroz, et al.  Ultra-large library docking for discovering new chemotypes, 2019, Nature.

[30]  A. Bach, et al.  Non-covalent Small-Molecule Kelch-like ECH-Associated Protein 1-Nuclear Factor Erythroid 2-Related Factor 2 (Keap1-Nrf2) Inhibitors and Their Potential for Targeting Central Nervous System Diseases, 2018, Journal of medicinal chemistry.

[31]  Qidong You, et al.  Discovery of a Keap1-dependent peptide PROTAC to knockdown Tau by ubiquitination-proteasome degradation pathway, 2018, European journal of medicinal chemistry.

[32]  J. Irwin, et al.  An Aggregation Advisor for Ligand Discovery, 2015, Journal of medicinal chemistry.

[33]  Michael Hann, et al.  Stabilization of protein-protein interactions in drug discovery, 2017, Expert opinion on drug discovery.

[34]  Antonio Cuadrado, et al.  Therapeutic targeting of the NRF2 and KEAP1 partnership in chronic diseases, 2019, Nature Reviews Drug Discovery.

[35]  W. Guida, et al.  The art and practice of structure‐based drug design: A molecular modeling perspective, 1996, Medicinal research reviews.