Augmenting Genetic Algorithms with Deep Neural Networks for Exploring the Chemical Space

Challenges in the natural sciences can often be phrased as optimization problems, and machine learning techniques have recently been applied to solve them. One example in chemistry is the design of tailor-made organic materials and molecules, which requires efficient methods to explore the chemical space. We present a genetic algorithm (GA) enhanced with a deep neural network (DNN)-based discriminator model that improves the diversity of generated molecules while steering the GA. We show that our algorithm outperforms other generative models in optimization tasks. We furthermore present a way to increase the interpretability of genetic algorithms, which helped us derive design principles.
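The idea of adding a discriminator term to a GA's fitness can be illustrated with a minimal toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: candidates are short strings over a four-letter alphabet rather than molecules, `objective` is a made-up property, the weight `beta` is a hypothetical parameter, and `discriminator_score` is a simple count-based novelty stand-in where the paper uses a trained DNN.

```python
import random

ALPHABET = "CNOF"  # toy symbols standing in for molecular building blocks


def objective(s):
    # Hypothetical property to maximize: fraction of 'C' symbols.
    return s.count("C") / len(s)


def discriminator_score(s, seen_counts):
    # Stand-in for a trained DNN discriminator (assumption): candidates
    # that have been seen more often receive a lower score, so novel
    # candidates are rewarded.
    return 1.0 / (1.0 + seen_counts.get(s, 0))


def mutate(s, rng):
    # Replace one randomly chosen position with a random symbol.
    i = rng.randrange(len(s))
    return s[:i] + rng.choice(ALPHABET) + s[i + 1:]


def run_ga(generations=30, pop_size=20, length=8, beta=0.5, seed=0):
    rng = random.Random(seed)
    population = [
        "".join(rng.choice(ALPHABET) for _ in range(length))
        for _ in range(pop_size)
    ]
    seen_counts = {}
    for _ in range(generations):
        # Record how often each candidate has appeared so far.
        for s in population:
            seen_counts[s] = seen_counts.get(s, 0) + 1

        def fitness(s):
            # Property score plus a weighted discriminator term.
            return objective(s) + beta * discriminator_score(s, seen_counts)

        # Keep the fitter half and refill the rest by mutating survivors.
        ranked = sorted(population, key=fitness, reverse=True)
        survivors = ranked[: pop_size // 2]
        children = [
            mutate(rng.choice(survivors), rng)
            for _ in range(pop_size - len(survivors))
        ]
        population = survivors + children
    return population
```

Calling `run_ga()` returns the final population. Setting `beta=0` removes the discriminator term and reduces the sketch to a plain mutation-only GA, which makes it easy to compare population diversity with and without the extra term.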
