Molecular generation targeting desired electronic properties via deep generative models.

Discovering new functional materials requires ways to explore the vast chemical space of precursor building blocks: not only generating large numbers of candidates to investigate, but also finding non-obvious options that chemical experience alone might not suggest. Artificial intelligence techniques offer one route to generating large numbers of organic building blocks for functional materials, and can do so even from very small initial libraries of known building blocks. Here we demonstrate the application of deep recurrent neural networks to exploring the chemical space of building blocks, using donor-acceptor oligomers with specific electronic properties as a test case. The recurrent neural network learned to produce novel donor-acceptor oligomers by trading off between selected atomic substitutions, such as halogenation or methylation, and molecular features such as the oligomer's size. The electronic and structural properties of the generated oligomers can be tuned by sampling from different subsets of the training database, which enabled us to enrich the library of donor-acceptor oligomers towards desired properties. We generated approximately 1700 new donor-acceptor oligomers with a recurrent neural network tuned to target a HOMO-LUMO gap below 2 eV and a dipole moment below 2 Debye; such oligomers could have potential application in organic photovoltaics.
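The subset-selection step described above can be sketched as a simple property filter. This is a minimal illustration, not the authors' pipeline: the `Oligomer` record, the `select_subset` helper, and the example entries with their property values are all hypothetical placeholders; only the two thresholds (HOMO-LUMO gap below 2 eV, dipole moment below 2 Debye) come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Oligomer:
    """Hypothetical record for one entry of the training database."""
    name: str
    gap_ev: float        # HOMO-LUMO gap in eV
    dipole_debye: float  # dipole moment in Debye


def select_subset(library, max_gap_ev=2.0, max_dipole_debye=2.0):
    """Keep entries whose gap and dipole are both strictly below the thresholds.

    The resulting subset is what a generative model could be tuned on
    to bias its samples towards the target properties.
    """
    return [
        m for m in library
        if m.gap_ev < max_gap_ev and m.dipole_debye < max_dipole_debye
    ]


# Made-up placeholder values, for illustration only.
library = [
    Oligomer("A", gap_ev=1.8, dipole_debye=0.9),
    Oligomer("B", gap_ev=2.6, dipole_debye=0.4),
    Oligomer("C", gap_ev=1.5, dipole_debye=3.1),
    Oligomer("D", gap_ev=1.9, dipole_debye=1.7),
]

subset = select_subset(library)
print([m.name for m in subset])
```

In practice the property values would come from electronic-structure calculations on each oligomer, and the thresholds could be changed to steer generation towards a different target region.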
