Enabling rapid COVID-19 small molecule drug design through scalable deep learning of generative models

We improved the quality of machine-learned models for small molecule antiviral design and reduced the time needed to produce them. Our globally asynchronous, multi-level parallel training approach strong scales to all of Sierra with up to 97.7% efficiency. We trained a novel, character-based Wasserstein autoencoder that produces a higher quality model trained on 1.613 billion compounds in 23 minutes, while the previous state of the art takes a day on 1 million compounds. Reducing training time from a day to minutes shifts the model creation bottleneck from computer job turnaround time to human innovation time. Our implementation achieves 318 PFLOPs, which is 17.1% of half-precision peak. We will incorporate this model into our molecular design loop, enabling the generation of more diverse compounds; this improves the search for novel candidate antiviral drugs and reduces the time to synthesize compounds to be tested in the lab.
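The figures quoted above can be put on a common footing with a little arithmetic. The following is a minimal sketch, not the authors' analysis: it uses only the numbers stated in the abstract, assumes "a day" means exactly 24 hours, and uses the standard definition of strong-scaling efficiency (achieved speedup divided by ideal speedup at fixed problem size). All function and variable names are hypothetical.

```python
def strong_scaling_efficiency(t_base, n_base, t_scaled, n_scaled):
    """Achieved speedup divided by ideal speedup for a fixed problem size."""
    speedup = t_base / t_scaled      # how much faster the larger run is
    ideal = n_scaled / n_base        # how much faster it could be at best
    return speedup / ideal


# Training throughput, in compounds per minute (figures from the abstract).
compounds_new = 1.613e9              # compounds in the new training set
minutes_new = 23.0                   # reported training time
compounds_prior = 1.0e6              # compounds for the previous state of the art
minutes_prior = 24.0 * 60.0          # assumption: "a day" taken as exactly 24 hours

throughput_new = compounds_new / minutes_new
throughput_prior = compounds_prior / minutes_prior
throughput_ratio = throughput_new / throughput_prior

# Half-precision peak implied by the reported rate and fraction of peak.
achieved_pflops = 318.0
fraction_of_peak = 0.171
implied_peak_pflops = achieved_pflops / fraction_of_peak

if __name__ == "__main__":
    print(f"throughput ratio: {throughput_ratio:,.0f}x")
    print(f"implied half-precision peak: {implied_peak_pflops:,.0f} PFLOPs")
    # Hypothetical timings, only to show how the efficiency formula is applied.
    print(f"example efficiency: {strong_scaling_efficiency(100.0, 1, 25.6, 4):.3f}")
```

Under these assumptions the throughput gain is on the order of 100,000-fold, which is a per-compound comparison and says nothing about differences in hardware or model quality between the two approaches.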
