Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning

Over the last decade, machine learning for de novo drug design has progressed significantly, particularly through deep generative models. However, current generative approaches face a significant challenge: they neither ensure that the proposed molecular structures can be feasibly synthesized nor provide synthesis routes for the proposed small molecules, which seriously limits their practical applicability. In this work, we propose Policy Gradient for Forward Synthesis (PGFS), a novel forward synthesis framework powered by reinforcement learning (RL) for de novo drug design, which addresses this challenge by embedding the concept of synthetic accessibility directly into the de novo drug design system. In this setup, the agent learns to navigate the immense synthetically accessible chemical space by subjecting commercially available small-molecule building blocks to valid chemical reactions at every time step of an iterative virtual multi-step synthesis process. The proposed environment for drug discovery provides a highly challenging test bed for RL algorithms owing to its large state space and high-dimensional continuous action space with hierarchical actions. PGFS achieves state-of-the-art performance in generating structures with high QED and penalized clogP. Moreover, we validate PGFS in an in silico proof of concept associated with three HIV targets. Finally, we describe how the end-to-end training conceptualized in this study represents an important paradigm for radically expanding the synthesizable chemical space and automating the drug discovery process.
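To make the forward-synthesis loop concrete, the following toy sketch shows one way a two-level (hierarchical) action could be structured: the agent first picks a reaction template that is valid for the current molecule, then emits a continuous vector that is mapped to the closest compatible building block. Everything here is an illustrative assumption rather than the PGFS implementation: the names `Block`, `Template`, `valid_templates`, `nearest_block`, and `rollout` are hypothetical, the groups `p`, `q`, `r` are abstract tags rather than real chemistry, molecules are plain strings, and the nearest-neighbour lookup is just one plausible way to turn a continuous action into a discrete reactant.

```python
import math
import random
from typing import NamedTuple


class Block(NamedTuple):
    name: str
    group: str        # abstract reactive-group tag (toy, not real chemistry)
    features: tuple   # toy continuous descriptor of the block


class Template(NamedTuple):
    name: str
    state_group: str    # tag the current molecule must expose
    block_group: str    # tag the second reactant must expose
    product_group: str  # tag exposed by the product


# Hypothetical building blocks and reaction templates.
BLOCKS = [
    Block("bb_1", "p", (0.1, 0.9)),
    Block("bb_2", "p", (0.8, 0.2)),
    Block("bb_3", "q", (0.5, 0.5)),
]
TEMPLATES = [
    Template("rxn_A", "q", "p", "p"),
    Template("rxn_B", "p", "q", "q"),
    Template("rxn_C", "q", "r", "q"),  # never valid: no block exposes "r"
]


def valid_templates(group):
    """Templates that match the state and have at least one partner block."""
    return [
        t for t in TEMPLATES
        if t.state_group == group
        and any(b.group == t.block_group for b in BLOCKS)
    ]


def nearest_block(template, query):
    """Map a continuous action to the closest compatible building block."""
    candidates = [b for b in BLOCKS if b.group == template.block_group]
    return min(candidates, key=lambda b: math.dist(b.features, query))


def rollout(start, policy, max_steps=3):
    """Run one virtual multi-step synthesis and return (product, route)."""
    name, group = start.name, start.group
    route = []
    for _ in range(max_steps):
        options = valid_templates(group)
        if not options:
            break  # no valid reaction: the episode ends early
        template, query = policy(name, options)   # hierarchical action
        block = nearest_block(template, query)
        name = f"({name}+{block.name})"           # string stands in for the product
        group = template.product_group
        route.append((template.name, block.name))
    return name, route


if __name__ == "__main__":
    rng = random.Random(0)

    def random_policy(state_name, options):
        # Untrained stand-in for the learned policy.
        return rng.choice(options), (rng.random(), rng.random())

    print(rollout(BLOCKS[2], random_policy))
```

Because every step applies a valid template to an available building block, the returned route doubles as a synthesis recipe for the final product, which is the property the framework is built around. A learned policy would replace `random_policy`, and a reward such as QED would be computed on the product.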
