A statistical physics approach for the analysis of machine learning algorithms on real data

We combine the replica approach of statistical physics with a variational technique so that it can be applied to the analysis of machine learning algorithms on real data. The method is applied to Gaussian process models and their relative, the support vector machine. We discuss the quality of our theoretical results in comparison with experiments. As a key result, we apply our theory to real-world benchmark data and show its potential for practical applications: we derive approximate expressions for data-averaged performance measures that hold for general data distributions and allow us to optimize the performance of the learning algorithm.
