Extrapolation and learning equations

In classical machine learning, regression is treated as a black-box process: a suitable function is identified from a hypothesis set without attempting to gain insight into the mechanism connecting inputs and outputs. In the natural sciences, however, finding an interpretable function for a phenomenon is the prime goal, since it allows one to understand and generalize results. This paper proposes a novel type of function-learning network, called the equation learner (EQL), that can learn analytical expressions and extrapolate to unseen domains. It is implemented as an end-to-end differentiable feed-forward network and therefore admits efficient gradient-based training. Sparsity regularization yields concise, interpretable expressions; often the true underlying source expression is identified.
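
The abstract describes the EQL architecture only at a high level. As a minimal sketch, one EQL-style layer can be pictured as a linear map whose outputs feed elementary base functions and multiplication units, trained with an L1 penalty for sparsity. The following PyTorch code illustrates this under assumptions not stated in the abstract: identity, sine, and cosine as the unary basis, pairwise products as the binary units, and all class names, unit counts, and hyperparameters chosen purely for illustration.

```python
import torch
import torch.nn as nn

class EQLLayer(nn.Module):
    """One EQL-style layer: a linear map whose outputs feed a mix of
    unary base functions (identity, sin, cos) and pairwise
    multiplication units. Unit counts and the function basis are
    illustrative assumptions, not the paper's exact configuration."""

    def __init__(self, in_dim, n_unary=6, n_binary=2):
        super().__init__()
        # One linear output per unary unit, two per multiplication unit.
        self.linear = nn.Linear(in_dim, n_unary + 2 * n_binary)
        self.n_unary, self.n_binary = n_unary, n_binary
        self.out_dim = n_unary + n_binary

    def forward(self, x):
        z = self.linear(x)
        u, b = z[..., :self.n_unary], z[..., self.n_unary:]
        funcs = (lambda t: t, torch.sin, torch.cos)  # cycled over units
        unary = torch.stack(
            [funcs[i % len(funcs)](u[..., i]) for i in range(self.n_unary)],
            dim=-1)
        binary = b[..., 0::2] * b[..., 1::2]  # multiplication units
        return torch.cat([unary, binary], dim=-1)


class EQL(nn.Module):
    """Stack of EQL layers with a purely linear readout, so the whole
    network evaluates an analytical expression in its inputs."""

    def __init__(self, in_dim, out_dim, depth=2):
        super().__init__()
        layers, d = [], in_dim
        for _ in range(depth):
            layers.append(EQLLayer(d))
            d = layers[-1].out_dim
        self.hidden = nn.Sequential(*layers)
        self.readout = nn.Linear(d, out_dim)

    def forward(self, x):
        return self.readout(self.hidden(x))


def l1_penalty(model):
    """L1 sparsity regularizer over all weight matrices; driving most
    entries to zero is what leaves a concise, readable expression."""
    return sum(p.abs().sum()
               for name, p in model.named_parameters() if "weight" in name)


# Hypothetical training step: squared loss plus the sparsity penalty,
# optimized end-to-end with a gradient method such as Adam.
model = EQL(in_dim=2, out_dim=1)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(128, 2), torch.randn(128, 1)  # stand-in data
loss = torch.mean((model(x) - y) ** 2) + 1e-3 * l1_penalty(model)
opt.zero_grad()
loss.backward()
opt.step()
```

Reading off a learned formula then amounts to pruning near-zero weights and writing the remaining units out symbolically; the coefficient on the L1 term trades expression size against fit quality.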
