The Mathematics of Learning: Dealing with Data

Abstract: Learning is key to developing systems tailored to a broad range of data analysis and information extraction tasks. We outline the mathematical foundations of learning theory and describe one of its key algorithms.
