Atom-density representations for machine learning.

Applications of machine learning to chemistry and materials science grow more numerous by the day. The main challenge is to devise representations of atomic systems that are both complete and concise, so as to reduce the number of reference calculations needed to predict the properties of different types of materials reliably. This has led to a proliferation of alternative ways to convert an atomic structure into an input for a machine-learning model. We introduce an abstract definition of chemical environments based on a smoothed atomic density, using a bra-ket notation to emphasize basis-set independence and to highlight the connections with some popular representations of atomic systems. The correlations between the spatial distribution of atoms and their chemical identities are computed as inner products between these feature kets. The kets can be given an explicit representation in terms of the expansion of the atom density on orthogonal basis functions, which is equivalent to the smooth overlap of atomic positions (SOAP) power spectrum, but also in real space, where it corresponds to n-body correlations of the atom density. This formalism lays the foundations for a more systematic tuning of the behavior of the representations, by introducing operators that represent the correlations between structure, composition, and the target properties. It provides a unifying picture of recent developments in the field and indicates a way forward toward more effective and computationally affordable machine-learning schemes for molecules and materials.
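
As a rough illustration of the central idea (a smoothed atom density whose inner products measure the similarity of two environments), the sketch below builds a purely radial, single-species toy version. It is not the construction used in the paper: angular information, chemical identities, and the bra-ket machinery are all left out, and the function names, the Gaussian width `sigma`, and the grid are assumptions chosen only for this example.

```python
import math

def smoothed_density(distances, grid, sigma=0.3):
    """Sum of Gaussians centred on the neighbour distances, sampled on a radial grid.

    Hypothetical helper: a radial-only stand-in for a smoothed atom density.
    """
    return [
        sum(math.exp(-((r - d) ** 2) / (2.0 * sigma ** 2)) for d in distances)
        for r in grid
    ]

def overlap(rho_a, rho_b, dr):
    """Discretised inner product of two sampled densities (rectangle rule)."""
    return sum(a * b for a, b in zip(rho_a, rho_b)) * dr

def normalized_kernel(dist_a, dist_b, grid, sigma=0.3):
    """Inner product of two densities, normalised so that k(A, A) = 1."""
    dr = grid[1] - grid[0]
    rho_a = smoothed_density(dist_a, grid, sigma)
    rho_b = smoothed_density(dist_b, grid, sigma)
    return overlap(rho_a, rho_b, dr) / math.sqrt(
        overlap(rho_a, rho_a, dr) * overlap(rho_b, rho_b, dr)
    )

# Radial grid from 0 to 6 in arbitrary length units (assumed values).
grid = [0.05 * i for i in range(121)]

env_a = [1.0, 1.5, 2.4]   # neighbour distances around a central atom
env_b = [2.4, 1.0, 1.5]   # same neighbours listed in a different order
env_c = [1.2, 3.0, 4.1]   # a genuinely different environment

k_ab = normalized_kernel(env_a, env_b, grid)
k_ac = normalized_kernel(env_a, env_c, grid)
print(f"k(A, B) = {k_ab:.6f}")
print(f"k(A, C) = {k_ac:.6f}")
```

Because the density is a sum over neighbours, relabelling the atoms leaves it unchanged, so `k_ab` equals 1 up to rounding, while `k_ac` falls strictly between 0 and 1. A density that also resolves angles and species would be needed to approach the representations discussed in the abstract.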
