Fisher-regularized support vector machine

This paper proposes the Fisher regularized support vector machine (FisherSVM). FisherSVM is a graph-based supervised learning method. FisherSVM has two regularization terms: the maximum margin regularization and the Fisher regularization. FisherSVM aims to maximize the margin and minimize the within-class scatter.

Support vector machine (SVM) and Fisher discriminant analysis (FDA) are two commonly used methods in machine learning and pattern recognition. A combined method of the linear SVM and FDA, called SVM/LDA (linear discriminant analysis), has been proposed, but only for the linear case. This paper generalizes this combined method to the nonlinear case from the viewpoint of regularization. A Fisher regularization is defined and incorporated into SVM to obtain a Fisher regularized support vector machine (FisherSVM). FisherSVM has two regularization terms, the maximum margin regularization and the Fisher regularization, which allow it to maximize the classification margin and minimize the within-class scatter. Roughly speaking, FisherSVM can approximately fulfill the Fisher criterion and obtain good statistical separability. This paper also discusses the connections of FisherSVM to graph-based regularization methods and to the mathematical programming approach to kernel Fisher discriminant analysis (KFDA). It shows that FisherSVM can be thought of as a graph-based supervised learning method or a robust KFDA. Experimental results on artificial and real-world data show that FisherSVM has a promising generalization performance.
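
To make the idea of two regularization terms concrete, the following is a minimal sketch of a linear objective that adds a within-class scatter penalty to the usual soft-margin SVM objective. It is an illustration under stated assumptions, not the formulation used in the paper: the function names, the trade-off parameters `C` and `lam`, and the exact weighting of the terms are hypothetical, and the nonlinear (kernel) case that the paper addresses is not covered.

```python
import numpy as np


def within_class_scatter(X, y):
    """Sum, over classes, of the scatter of samples around their class mean."""
    d = X.shape[1]
    S_w = np.zeros((d, d))
    for label in np.unique(y):
        Xc = X[y == label]
        centered = Xc - Xc.mean(axis=0)
        S_w += centered.T @ centered
    return S_w


def fisher_regularized_objective(w, b, X, y, C=1.0, lam=1.0):
    """Hypothetical linear objective (assumed form, not taken from the paper):
    0.5*||w||^2 + 0.5*lam*w^T S_w w + C * sum of hinge losses."""
    margin_term = 0.5 * (w @ w)                              # maximum margin regularization
    fisher_term = 0.5 * lam * (w @ within_class_scatter(X, y) @ w)  # within-class scatter penalty
    hinge = np.maximum(0.0, 1.0 - y * (X @ w + b))           # soft-margin loss per sample
    return margin_term + fisher_term + C * hinge.sum()


# Toy data: two classes separated along the first axis, spread along the second.
X = np.array([[0.0, 0.0], [0.0, 2.0], [2.0, 0.0], [2.0, 2.0]])
y = np.array([-1, -1, 1, 1])

# Direction along the first axis: separates the classes, zero within-class scatter.
obj_good = fisher_regularized_objective(np.array([1.0, 0.0]), -1.0, X, y)
# Direction along the second axis: fails to separate and has large within-class scatter.
obj_bad = fisher_regularized_objective(np.array([0.0, 1.0]), -1.0, X, y)
print(obj_good, obj_bad)
```

On this toy data the direction that separates the classes while keeping each class compact receives the lower objective value, which is the behavior the two regularization terms are meant to encourage.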
