wACSF: Weighted atom-centered symmetry functions as descriptors in machine learning potentials

We introduce weighted atom-centered symmetry functions (wACSFs) as descriptors of a chemical system's geometry for use in the prediction of chemical properties such as enthalpies or potential energies via machine learning. The wACSFs are based on conventional atom-centered symmetry functions (ACSFs) but overcome the undesirable scaling of the latter with an increasing number of different elements in a chemical system. The performance of the two descriptors is compared by using them as inputs in high-dimensional neural network potentials (HDNNPs), with the molecular structures and associated enthalpies of the 133 855 molecules in the QM9 database, which contain up to five different elements, serving as reference data. A substantially smaller number of wACSFs than ACSFs is needed to obtain a comparable spatial resolution of the molecular structures. At the same time, this smaller set of wACSFs leads to significantly better generalization performance in the machine learning potential than the large set of conventional ACSFs. Furthermore, we show that the intrinsic parameters of the descriptors can in principle be optimized with a genetic algorithm in a highly automated manner. For the wACSFs employed here, we find, however, that a simple empirical parametrization scheme is sufficient to obtain HDNNPs with high accuracy.
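The scaling argument above can be illustrated with a minimal sketch. The abstract does not state the functional form of the descriptors, so the code assumes a Gaussian radial function with a cosine cutoff and an element weight g(Z) = Z; the function names, the parameters `eta`, `mu`, and `r_c`, and the descriptor counts in the usage note are illustrative assumptions, not the authors' implementation or settings.

```python
import numpy as np


def cosine_cutoff(r, r_c):
    """Cosine cutoff: 1 at r = 0, decaying smoothly to 0 at r = r_c, and 0 beyond."""
    r = np.asarray(r, dtype=float)
    return np.where(r < r_c, 0.5 * (np.cos(np.pi * r / r_c) + 1.0), 0.0)


def radial_wacsf(positions, numbers, i, eta, mu, r_c):
    """Radial weighted symmetry function of atom i (assumed form, weight g(Z) = Z).

    All neighbors contribute to the same function; the element identity enters
    only through the weight, so no per-element copy of the function is needed.
    """
    positions = np.asarray(positions, dtype=float)
    numbers = np.asarray(numbers, dtype=float)
    mask = np.arange(len(numbers)) != i          # exclude the central atom
    r_ij = np.linalg.norm(positions[mask] - positions[i], axis=1)
    gauss = np.exp(-eta * (r_ij - mu) ** 2)
    return float(np.sum(numbers[mask] * gauss * cosine_cutoff(r_ij, r_c)))


def n_element_resolved(n_elements, n_radial, n_angular):
    """Descriptor count when each radial function is duplicated per neighbor
    element and each angular function per unordered pair of neighbor elements."""
    n_pairs = n_elements * (n_elements + 1) // 2
    return n_radial * n_elements + n_angular * n_pairs


def n_weighted(n_radial, n_angular):
    """Descriptor count when element information is carried by weights instead."""
    return n_radial + n_angular
```

With hypothetical settings of 8 radial and 8 angular parameter sets and five elements, `n_element_resolved(5, 8, 8)` gives 160 functions per atom, while `n_weighted(8, 8)` stays at 16 regardless of the number of elements. The trade-off of this design is that element identity is no longer resolved in separate channels but folded into a single weighted sum.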
