Mining for Potent Inhibitors through Artificial Intelligence and Physics: A Unified Methodology for Ligand Based and Structure Based Drug Design.

Determining the viability of a new drug molecule is time- and resource-intensive, which makes computer-aided assessment vital for rapid drug discovery. Here we develop a machine learning algorithm, iMiner, that generates novel inhibitor molecules for target proteins by combining deep reinforcement learning with real-time 3D molecular docking using AutoDock Vina, thereby creating chemical novelty while simultaneously constraining molecules for shape and molecular compatibility with target active sites. Moreover, by using different types of reward functions, we introduce new generative tasks: generating molecules that are chemically similar to a target ligand, growing molecules from known protein-bound fragments, and creating molecules that enforce interactions with target residues in the protein active site. The iMiner algorithm is embedded in a composite workflow that filters out pan-assay interference compounds (PAINS), Lipinski rule violations, structures uncommon in medicinal chemistry, and molecules with poor synthetic accessibility, with options for cross-validation against other docking scoring functions and for automated molecular dynamics simulation to measure pose stability. Users can also define a set of rules for structures they would like to exclude during training and postfiltering. Because our approach relies only on the structure of the target protein, iMiner can be easily adapted to develop inhibitors or other small-molecule therapeutics for any target protein.
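
As an illustration of how a docking-based reward might be combined with a ligand-similarity term and a drug-likeness filter, the following is a minimal Python sketch. It is not the iMiner implementation: the `Candidate` container, the `reward` function, the weighting scheme, and the tolerance of one Lipinski violation are all assumptions made for this example, and the docking score and descriptors are taken as precomputed inputs rather than obtained from AutoDock Vina or a cheminformatics toolkit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical container for precomputed properties of a generated molecule."""
    docking_score: float    # kcal/mol; more negative means stronger predicted binding
    fingerprint: frozenset  # indices of the "on" bits of some molecular fingerprint
    mol_weight: float       # Da
    logp: float
    h_donors: int
    h_acceptors: int


def lipinski_violations(c: Candidate) -> int:
    """Count violations of Lipinski's rule of five."""
    return sum([
        c.mol_weight > 500,
        c.logp > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity between two sets of fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def reward(c: Candidate, reference_fp: frozenset,
           similarity_weight: float = 0.5, max_violations: int = 1) -> float:
    """Assumed composite reward: docking term plus weighted similarity term.

    Molecules exceeding the allowed number of Lipinski violations get zero reward.
    """
    if lipinski_violations(c) > max_violations:
        return 0.0
    docking_term = max(0.0, -c.docking_score)  # reward favourable (negative) scores only
    return docking_term + similarity_weight * tanimoto(c.fingerprint, reference_fp)


reference = frozenset({2, 3, 4})
good = Candidate(-8.0, frozenset({1, 2, 3}), 350.0, 2.5, 2, 5)
bad = Candidate(-9.0, frozenset({2, 3, 4}), 650.0, 6.2, 2, 5)
print(reward(good, reference))  # 8.25
print(reward(bad, reference))   # 0.0
```

In this sketch the second molecule docks better but is rejected by the filter, which mirrors the idea of constraining generation with both physics-based and medicinal-chemistry criteria. In a real reinforcement learning loop such a reward would be evaluated on each generated molecule and fed back to the generative model.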
