An assessment of the structural resolution of various fingerprints commonly used in machine learning

Atomic environment fingerprints are widely used in computational materials science, from machine learning potentials to the quantification of similarities between atomic configurations. Many approaches to the construction of such fingerprints, also called structural descriptors, have been proposed. In this work, we compare from several perspectives the performance of fingerprints based on the overlap matrix, the smooth overlap of atomic positions, Behler–Parrinello atom-centered symmetry functions, the modified Behler–Parrinello symmetry functions used in the ANI-1ccx potential, and the Faber–Christensen–Huang–Lilienfeld fingerprint. We study their ability to resolve differences in local environments and, in particular, examine whether certain atomic movements leave the fingerprints exactly or nearly invariant. For this purpose, we introduce a sensitivity matrix whose eigenvalues quantify the effect of atomic displacement modes on the fingerprint. Further, we check whether these displacements correlate with the variation of localized physical quantities such as forces. Finally, we extend our examination to the correlation between molecular fingerprints obtained from the atomic fingerprints and global quantities of entire molecules.
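The abstract does not give the definition of the sensitivity matrix, so the following is only a minimal sketch under stated assumptions: we take the matrix to be S = JᵀJ, where J is the Jacobian of the fingerprint vector with respect to all Cartesian atomic coordinates, so that eigenvalues of S close to zero mark displacement modes that leave the fingerprint (nearly) unchanged. The toy radial fingerprint, the function names, and the finite-difference step are hypothetical and are not any of the fingerprints compared in this work.

```python
import numpy as np

def radial_fingerprint(positions, center=0, etas=(0.5, 1.0, 2.0)):
    """Toy fingerprint of atom `center` (an assumption for illustration):
    for each width eta, sum Gaussians of the distances to all other atoms."""
    neighbors = np.delete(positions, center, axis=0)
    r = np.linalg.norm(neighbors - positions[center], axis=1)
    return np.array([np.exp(-eta * r**2).sum() for eta in etas])

def sensitivity_matrix(fingerprint, positions, h=1e-5):
    """Assumed definition S = J^T J, with J the Jacobian of the fingerprint
    with respect to all 3N coordinates, estimated by central differences."""
    x0 = positions.ravel()
    columns = []
    for k in range(x0.size):
        dx = np.zeros_like(x0)
        dx[k] = h
        f_plus = fingerprint((x0 + dx).reshape(positions.shape))
        f_minus = fingerprint((x0 - dx).reshape(positions.shape))
        columns.append((f_plus - f_minus) / (2.0 * h))
    J = np.stack(columns, axis=1)  # shape: (n_features, 3N)
    return J.T @ J                 # shape: (3N, 3N), symmetric

# Hypothetical four-atom configuration (arbitrary length units).
positions = np.array([[0.0, 0.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [0.0, 1.2, 0.0],
                      [0.0, 0.0, 1.5]])

S = sensitivity_matrix(radial_fingerprint, positions)
eigvals = np.linalg.eigvalsh(S)  # ascending order
n_sensitive = int(np.sum(eigvals > 1e-8))
print("eigenvalues:", eigvals)
print("displacement modes that change the fingerprint:", n_sensitive)
```

Because this toy fingerprint has only three components, at most three of the twelve displacement modes can change it; the remaining modes, including rigid translations and rotations, have eigenvalues at the level of numerical noise. A richer fingerprint would be expected to leave fewer such modes.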
