Understanding Kernel Ridge Regression: Common behaviors from simple functions to density functionals

Accurate approximations to density functionals have recently been obtained via machine learning (ML). By applying ML to a simple function of one variable without any random sampling, we extract the qualitative dependence of errors on hyperparameters. We find universal features of the behavior in extreme limits, including both very small and very large length scales, and the noise-free limit. We show how such features arise in ML models of density functionals.
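The kind of experiment the abstract describes can be sketched in a few lines. The following is a minimal illustration, not the authors' code: it assumes a Gaussian kernel with length scale `sigma`, a regularization strength `lam` added to the diagonal of the kernel matrix (playing the role of the noise level), a target function cos(x) on [0, 1] chosen purely for illustration, and evenly spaced training points so that no random sampling is involved. All function and variable names are hypothetical.

```python
import numpy as np


def gaussian_kernel(xa, xb, sigma):
    """Gaussian kernel matrix between two 1-D point sets (assumed kernel form)."""
    diff = xa[:, None] - xb[None, :]
    return np.exp(-diff**2 / (2.0 * sigma**2))


def krr_fit(x_train, y_train, sigma, lam):
    """Solve (K + lam * I) alpha = y for the kernel ridge regression weights."""
    K = gaussian_kernel(x_train, x_train, sigma)
    return np.linalg.solve(K + lam * np.eye(len(x_train)), y_train)


def krr_predict(x_train, alpha, x_new, sigma):
    """Evaluate the fitted model at new points."""
    return gaussian_kernel(x_new, x_train, sigma) @ alpha


def mean_abs_error(sigma, lam, n_train=20, n_test=200):
    """Mean absolute test error for one (sigma, lam) pair.

    Training and test points are evenly spaced grids on [0, 1];
    the target cos(x) is an illustrative assumption.
    """
    x_train = np.linspace(0.0, 1.0, n_train)
    x_test = np.linspace(0.0, 1.0, n_test)
    alpha = krr_fit(x_train, np.cos(x_train), sigma, lam)
    prediction = krr_predict(x_train, alpha, x_test, sigma)
    return np.mean(np.abs(prediction - np.cos(x_test)))


if __name__ == "__main__":
    # Scan the length scale at fixed, small regularization.
    for sigma in [1e-5, 1e-3, 1e-2, 1e-1, 0.3, 1.0]:
        print(f"sigma={sigma:g}  error={mean_abs_error(sigma, lam=1e-8):.3e}")
```

In this sketch, a very small `sigma` makes the kernel matrix approach the identity, so the model predicts values near zero everywhere except at the training points and the error is large. A length scale comparable to the size of the interval gives a much smaller error for this smooth target. Repeating the scan over `lam` as well would map out the error as a function of both hyperparameters.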
