MLatom: A program package for quantum chemical research assisted by machine learning

MLatom is a program package designed for computationally efficient simulations of atomistic systems with machine‐learning (ML) algorithms. It can be used out‐of‐the‐box as a stand‐alone program with a user‐friendly online manual. The use of MLatom does not require extensive knowledge of machine learning, programming, or scripting. The user need only prepare input files and choose appropriate options. The program implements kernel ridge regression and supports Gaussian, Laplacian, and Matérn kernels. It can use arbitrary, user‐provided input vectors and can convert molecular geometries into input vectors corresponding to several types of built‐in molecular descriptors. MLatom saves and re‐uses trained ML models as needed, in addition to estimating the generalization error of ML setups. Various sampling procedures are supported and the gradients of output properties can be calculated. The core part of MLatom is written in Fortran, uses standard libraries for linear algebra, and is optimized for shared‐memory parallel computations. © 2019 Wiley Periodicals, Inc.
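The kernel ridge regression workflow named in the abstract can be illustrated with a minimal NumPy sketch. This is not MLatom's Fortran implementation. The function names, the kernel normalizations (Gaussian as exp(-d²/2σ²), Laplacian with an L1 distance), and the hyperparameter names `sigma` and `lam` are all assumptions chosen for illustration.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel exp(-||a - b||_2^2 / (2 sigma^2)); normalization is an assumption."""
    diff = A[:, None, :] - B[None, :, :]
    return np.exp(-np.sum(diff ** 2, axis=2) / (2.0 * sigma ** 2))

def laplacian_kernel(A, B, sigma):
    """Laplacian kernel exp(-||a - b||_1 / sigma); the L1 distance is an assumption."""
    diff = A[:, None, :] - B[None, :, :]
    return np.exp(-np.sum(np.abs(diff), axis=2) / sigma)

def krr_train(X, y, kernel, sigma, lam):
    """Solve (K + lam * I) alpha = y for the regression coefficients alpha."""
    K = kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(X.shape[0]), y)

def krr_predict(X_train, alpha, X_new, kernel, sigma):
    """Predict f(x) = sum_i alpha_i * k(x, x_i) for each row x of X_new."""
    return kernel(X_new, X_train, sigma) @ alpha

if __name__ == "__main__":
    # Toy data: input vectors are 1-D points, the reference property is sin(x).
    X = np.linspace(0.0, 3.0, 10).reshape(-1, 1)
    y = np.sin(X[:, 0])
    alpha = krr_train(X, y, gaussian_kernel, sigma=0.3, lam=1e-10)
    print(krr_predict(X, alpha, np.array([[1.5]]), gaussian_kernel, sigma=0.3))
```

In practice `sigma` and `lam` would be tuned on a validation set rather than fixed by hand, which is one way the generalization error mentioned in the abstract could be estimated.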
