Property Prediction of Organic Donor Molecules for Photovoltaic Applications Using Extremely Randomized Trees

Organic solar cells are an inexpensive, flexible alternative to traditional silicon-based solar cells, but they are disadvantaged by low power conversion efficiency, a consequence of empirical design and complex manufacturing processes. Design can be accelerated by generating a comprehensive set of potential candidates; however, this would require laborious trial-and-error modeling of all possible polymer configurations. A machine learning model has the potential to accelerate the screening of potential donor candidates by associating the structural features of each compound, encoded as molecular fingerprints, with its highest occupied molecular orbital (HOMO) energy. In this paper, extremely randomized tree learning models are employed to predict HOMO values for donor compounds, and a web application is developed. The proposed models outperform neural networks trained on molecular fingerprints and on SMILES, as well as other state-of-the-art architectures such as Chemception and Molecular Graph Convolution, on two datasets of varying sizes.
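To make the fingerprint-to-HOMO idea concrete, the following is a minimal pure-Python sketch of an extremely randomized trees regressor on binary fingerprint bits. It is not the authors' implementation: the fingerprints and "HOMO" targets are synthetic stand-ins, and the names build_tree, fit_forest, predict, and synthetic_homo, along with all parameter values, are assumptions made for illustration. In practice a library implementation such as scikit-learn's ExtraTreesRegressor would likely be used on real fingerprints.

```python
import random
from statistics import fmean


def sse(values):
    """Sum of squared errors around the mean of a non-empty list."""
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(X, y, rng, k):
    """Grow one extremely randomized tree on binary fingerprint bits.

    At each node, k non-constant bits are drawn at random and the one giving
    the lowest squared error is kept. For continuous features the algorithm
    also draws a random cut-point; a binary bit has only one possible cut.
    """
    n = len(y)
    usable = [f for f in range(len(X[0])) if 0 < sum(r[f] for r in X) < n]
    if n < 2 or not usable:
        return fmean(y)  # leaf: mean target of the samples that reached it
    best = None
    for f in rng.sample(usable, min(k, len(usable))):
        left = [i for i in range(n) if X[i][f] == 0]
        right = [i for i in range(n) if X[i][f] == 1]
        score = sse([y[i] for i in left]) + sse([y[i] for i in right])
        if best is None or score < best[0]:
            best = (score, f, left, right)
    _, f, left, right = best
    return (
        f,
        build_tree([X[i] for i in left], [y[i] for i in left], rng, k),
        build_tree([X[i] for i in right], [y[i] for i in right], rng, k),
    )


def predict_tree(node, x):
    """Follow the bit tests down to a leaf value."""
    while isinstance(node, tuple):
        f, zero_branch, one_branch = node
        node = one_branch if x[f] else zero_branch
    return node


def fit_forest(X, y, n_trees=50, k=6, seed=0):
    """Every tree sees the full training set (no bootstrap resampling)."""
    rng = random.Random(seed)
    return [build_tree(X, y, rng, k) for _ in range(n_trees)]


def predict(forest, x):
    """Average the per-tree predictions."""
    return fmean([predict_tree(tree, x) for tree in forest])


# Synthetic stand-in data (assumption): 16-bit random "fingerprints" and a
# made-up HOMO-like target in eV. These are NOT real molecules or energies.
data_rng = random.Random(42)


def synthetic_homo(bits):
    signal = -5.0 - 0.4 * bits[0] + 0.3 * bits[3] - 0.2 * bits[7]
    return signal + data_rng.gauss(0.0, 0.02)


X_all = [[data_rng.randint(0, 1) for _ in range(16)] for _ in range(250)]
y_all = [synthetic_homo(bits) for bits in X_all]
X_train, y_train = X_all[:200], y_all[:200]
X_test, y_test = X_all[200:], y_all[200:]

forest = fit_forest(X_train, y_train)
mae = fmean([abs(predict(forest, x) - t) for x, t in zip(X_test, y_test)])
baseline_mae = fmean([abs(fmean(y_train) - t) for t in y_test])
print(f"forest MAE: {mae:.3f} eV, mean-predictor MAE: {baseline_mae:.3f} eV")
```

The sketch draws k smaller than the number of bits on purpose: with binary features, testing every bit at every node would make all trees identical, so the randomization in this toy setting comes entirely from the random choice of candidate bits.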
