BRADSHAW: a system for automated molecular design

This paper introduces BRADSHAW (Biological Response Analysis and Design System using an Heterogenous, Automated Workflow), a system for automated molecular design that integrates methods for chemical structure generation, experimental design, active learning and cheminformatics tools. The simple user interface is designed to facilitate access to large-scale automated design while minimising the software development required to introduce new algorithms, a critical requirement in a very fast-moving field. The system embodies a philosophy of automation, best practice, experimental design and the use of both traditional cheminformatics and modern machine learning algorithms.
