Hyperdisk based large margin classifier

We introduce a large margin linear binary classification framework that approximates each class with a hyperdisk (the intersection of the affine support and the bounding hypersphere of its training samples in feature space) and then finds the linear classifier that maximizes the margin separating the two hyperdisks. We contrast this with Support Vector Machines (SVMs), which find the maximum-margin separator of the pointwise convex hulls of the training samples, and argue that replacing convex hulls with looser convex class models such as hyperdisks provides safer margin estimates that improve accuracy on some problems. Both the hyperdisks and their separators are found by solving simple quadratic programs. The method is extended to nonlinear feature spaces using the kernel trick, and multi-class problems are handled by combining binary classifiers in the same ways as for SVMs. Experiments on a range of data sets show that the method compares favourably with other popular large margin classifiers.
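
The geometry described above can be illustrated with a minimal NumPy sketch. It is not the paper's method: the abstract says the hyperdisks and separators come from quadratic programs, whereas this sketch makes two simplifying assumptions. First, the bounding hypersphere is approximated by the sample mean as center and the largest sample distance as radius, which is generally looser than the minimum bounding sphere. Second, the closest points of the two hyperdisks are found by alternating projections rather than a QP, and the separator is taken as the perpendicular bisector of the segment joining them. The function names (`fit_hyperdisk`, `project_to_hyperdisk`, `closest_points`, `separator`) are hypothetical, and the classes are assumed to have disjoint hyperdisks in the input space (no kernel).

```python
import numpy as np

def fit_hyperdisk(X, tol=1e-10):
    """Approximate hyperdisk of the rows of X: (center, basis, radius).

    Assumption: center = sample mean, radius = largest distance to it.
    This stands in for the minimum bounding hypersphere.
    """
    X = np.asarray(X, dtype=float)
    center = X.mean(axis=0)
    # Right singular vectors with nonzero singular values span the
    # directions of the affine hull of the samples.
    _, s, vt = np.linalg.svd(X - center, full_matrices=False)
    basis = vt[s > tol * max(s.max(), 1.0)]
    radius = np.linalg.norm(X - center, axis=1).max()
    return center, basis, radius

def project_to_hyperdisk(x, center, basis, radius):
    """Nearest point of the hyperdisk to x.

    Because the center lies in the affine hull, it suffices to project
    onto the hull and then shrink radially toward the center.
    """
    coords = basis @ (np.asarray(x, dtype=float) - center)
    norm = np.linalg.norm(coords)
    if norm > radius:
        coords = coords * (radius / norm)
    return center + basis.T @ coords

def closest_points(disk_a, disk_b, n_iter=200):
    """Alternating projections between two hyperdisks (compact convex sets)."""
    a = disk_a[0].copy()
    for _ in range(n_iter):
        b = project_to_hyperdisk(a, *disk_b)
        a = project_to_hyperdisk(b, *disk_a)
    return a, b

def separator(disk_a, disk_b):
    """Hyperplane w.x + offset = 0 bisecting the closest points.

    Assumes the hyperdisks are disjoint, so the gap is positive.
    """
    a, b = closest_points(disk_a, disk_b)
    gap = np.linalg.norm(b - a)
    w = (b - a) / gap
    offset = -w @ (a + b) / 2.0
    return w, offset, gap
```

A new point `x` would then be assigned to the second class when `w @ x + offset > 0` and to the first class otherwise. The kernelized variant and the multi-class combinations mentioned in the abstract are outside the scope of this sketch.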
