Support vector learning

Foreword

The Support Vector Machine has recently been introduced as a new technique for solving various function estimation problems, including the pattern recognition problem. To develop such a technique, it was necessary first to extract the factors responsible for future generalization, then to obtain bounds on generalization that depend on these factors, and lastly to develop a technique that constructively minimizes these bounds. The subject of this book is a set of methods that combine advanced branches of statistics and functional analysis and develop these theories into practical algorithms that perform better than existing heuristic approaches. The book provides a comprehensive analysis of what can be done using Support Vector Machines, achieving record results in real-life pattern recognition problems. In addition, it proposes a new form of nonlinear Principal Component Analysis using Support Vector kernel techniques, which I consider the most natural and elegant generalization of classical Principal Component Analysis. In many ways, the Support Vector Machine became so popular thanks to the work of Bernhard Schölkopf. The work, submitted for the title of Doktor der Naturwissenschaften, is excellent. It is a substantial contribution to Machine Learning technology.
