A Review of Reduced Kernel Trick in Machine Learning

We give a comprehensive introduction to the reduced support vector machine (RSVM), its extensions, and its applications. We describe the original RSVM algorithm and the statistical theory behind it. Three schemes for selecting a representative reduced set are introduced. These schemes lead to a smaller reduced set than the random sampling scheme without sacrificing prediction accuracy. Although a smaller reduced set makes support vector machine training faster, one has to pay extra CPU time to select the reduced set. In addition to classification, applications of the reduced kernel trick to regression and dimension reduction are also included in this survey paper. Finally, we embed RSVMs in the MapReduce framework for extremely large-scale datasets. Preliminary numerical studies show that RSVMs in the MapReduce framework have good potential for solving large-scale nonlinear support vector machines. We believe that the reduced kernel trick will be an important technique in the Big Data era.
