Kohn-Sham equations as regularizer: building prior knowledge into machine-learned physics

Including prior knowledge is important for building effective machine-learning models in physics, and it is usually achieved by adding explicit loss terms or by constraining model architectures. Prior knowledge embedded in the physics computation itself has drawn far less attention. We show that solving the Kohn-Sham equations while training neural networks for the exchange-correlation functional provides an implicit regularization that greatly improves generalization. Training on just two separations suffices to learn the entire one-dimensional H₂ dissociation curve to within chemical accuracy, including the strongly correlated region. Our models also generalize to unseen types of molecules and overcome self-interaction error.
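To make the idea concrete, the sketch below shows what "solving the Kohn-Sham equations while training" can look like in code: a neural exchange-correlation potential sits inside an unrolled self-consistent-field (SCF) loop, and the loss gradient is backpropagated through the entire solve. This is a minimal illustrative sketch, not the paper's implementation; the grid size, soft-Coulomb interactions, single doubly occupied orbital, eigenvalue-sum energy, and made-up reference energies are all simplifying assumptions introduced here.

```python
# Hedged sketch: training a neural XC potential by differentiating
# through an unrolled Kohn-Sham SCF loop (JAX). All numerical choices
# are illustrative assumptions, not the paper's actual setup.
import jax
import jax.numpy as jnp

N, L = 128, 10.0                      # grid points, box length (assumed)
x = jnp.linspace(-L / 2, L / 2, N)
dx = x[1] - x[0]

# Kinetic operator: -(1/2) d^2/dx^2 via second-order finite differences.
T = -0.5 * (jnp.diag(jnp.full(N - 1, 1.0), 1)
            + jnp.diag(jnp.full(N - 1, 1.0), -1)
            - 2.0 * jnp.eye(N)) / dx**2

# Soft-Coulomb external potential for a toy 1D "H2" at separation R.
def v_external(R):
    return (-1.0 / jnp.sqrt((x - R / 2) ** 2 + 1.0)
            - 1.0 / jnp.sqrt((x + R / 2) ** 2 + 1.0))

# Soft-Coulomb Hartree potential by quadrature on the grid.
W = 1.0 / jnp.sqrt((x[:, None] - x[None, :]) ** 2 + 1.0)
def v_hartree(n):
    return W @ n * dx

# Local neural XC potential: a tiny MLP applied pointwise to the density.
def v_xc(params, n):
    (W1, b1), (W2, b2) = params
    h = jnp.tanh(n[:, None] @ W1 + b1)
    return (h @ W2 + b2).squeeze(-1)

def init_params(key, width=16):
    k1, k2 = jax.random.split(key)
    return [(0.1 * jax.random.normal(k1, (1, width)), jnp.zeros(width)),
            (0.1 * jax.random.normal(k2, (width, 1)), jnp.zeros(1))]

# Unrolled SCF: a fixed number of KS iterations with linear density
# mixing, so reverse-mode autodiff flows through the whole solve.
def kohn_sham_energy(params, R, n_iter=15, mix=0.3):
    v_ext = v_external(R)
    n = jnp.full(N, 2.0 / L)                 # two electrons, flat initial guess
    for _ in range(n_iter):
        H = T + jnp.diag(v_ext + v_hartree(n) + v_xc(params, n))
        eps, phi = jnp.linalg.eigh(H)
        n_new = 2.0 * phi[:, 0] ** 2 / dx    # one doubly occupied orbital
        n = (1 - mix) * n + mix * n_new
    # Simplified eigenvalue-sum energy; a real KS code would subtract the
    # Hartree double counting and add E_xc[n] minus the v_xc contribution.
    return 2.0 * eps[0]

# Fit a few (separation, reference energy) pairs; the SCF solve itself
# acts as the implicit regularizer the abstract describes.
def loss(params, Rs, E_refs):
    E = jax.vmap(lambda R: kohn_sham_energy(params, R))(Rs)
    return jnp.mean((E - E_refs) ** 2)

params = init_params(jax.random.PRNGKey(0))
Rs, E_refs = jnp.array([1.5, 3.0]), jnp.array([-2.2, -1.8])  # made-up targets
grad_fn = jax.jit(jax.grad(loss))
for step in range(100):
    grads = grad_fn(params, Rs, E_refs)
    params = jax.tree_util.tree_map(lambda p, g: p - 0.01 * g, params, grads)
```

The key design point, under these assumptions, is that the network is never trained to match densities or potentials directly: it only ever sees energies after a full self-consistent solve, so every gradient step is filtered through the Kohn-Sham map, which is the implicit regularization the abstract refers to.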
