Machine Learning for Quantum Mechanical Properties of Atoms in Molecules

We introduce machine learning models of quantum mechanical observables of atoms in molecules. Instant out-of-sample predictions for proton and carbon nuclear chemical shifts, atomic core level excitations, and forces on atoms reach accuracies on par with the density functional theory reference. Locality is exploited within nonlinear regression via local atom-centered coordinate systems. The approach is validated on a diverse set of 9k small organic molecules. Computational cost is shown to scale linearly with system size for saturated polymers up to submesoscale lengths.
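To make the idea of regression in local atom-centered coordinate systems concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the local frame of an atom is built from its two nearest neighbors and that the nonlinear regression is kernel ridge regression with a Gaussian kernel; both choices, and all function names (`local_frame`, `local_coords`, `krr_fit`, `krr_predict`), are illustrative assumptions that the abstract does not specify.

```python
import numpy as np

def local_frame(center, neighbors):
    """Orthonormal axes (as rows) for one atom, built from its two nearest
    neighbors. Assumes those two neighbors are not collinear with the atom."""
    rel = neighbors - center
    order = np.argsort(np.linalg.norm(rel, axis=1))
    a, b = rel[order[0]], rel[order[1]]
    x = a / np.linalg.norm(a)            # x-axis points at the nearest neighbor
    z = np.cross(a, b)
    z /= np.linalg.norm(z)               # z-axis is normal to the plane of a and b
    y = np.cross(z, x)                   # y-axis completes a right-handed frame
    return np.stack([x, y, z])

def local_coords(center, neighbors):
    """Neighbor positions expressed in the atom's own frame. These are
    unchanged by rigid rotation or translation of the whole molecule."""
    return (neighbors - center) @ local_frame(center, neighbors).T

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma**2))

def krr_fit(X, y, sigma, lam):
    """Solve (K + lam * I) alpha = y for the regression weights alpha."""
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def krr_predict(X_train, alpha, X_new, sigma):
    """Predict a property for new atoms as a weighted sum of kernel values."""
    return gaussian_kernel(X_new, X_train, sigma) @ alpha
```

In such a scheme, each atom would contribute one training row (for example, its flattened `local_coords`) and one target value such as a chemical shift. A vector property such as a force can be projected onto the same frame with `local_frame(center, neighbors) @ force`, so that each component is learned as a rotation-independent scalar.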
