A Hierarchical Convex Optimization for Multiclass SVM Achieving Maximum Pairwise Margins with Least Empirical Hinge-Loss

In this paper, we newly formulate a hierarchical convex optimization problem for multiclass SVM that achieves maximum pairwise margins with least empirical hinge loss. This optimization problem is a most faithful, as well as robust, multiclass extension of an NP-hard hierarchical optimization problem that first appeared in the seminal paper by C. Cortes and V. Vapnik almost 25 years ago. By extending the very recent fixed point theoretic idea [Yamada-Yamagishi 2019] with the generalized hinge loss function [Crammer-Singer 2001], we show that the hybrid steepest descent method [Yamada 2001] in computational fixed point theory is applicable to this much more complex hierarchical convex optimization problem.
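To make the two ingredients named above concrete, the following is a minimal sketch and not the paper's algorithm. It assumes the usual form of the generalized hinge loss, max_r (s_r + 1 - [r = y]) - s_y for class scores s_r, and the basic hybrid steepest descent iteration u_{n+1} = T(u_n) - λ_{n+1} ∇Θ(T(u_n)) with step sizes λ_n = 1/n. The toy operator T (projection onto the unit ball), the toy objective Θ(u) = ½‖u - a‖², and all function names are illustrative assumptions, chosen only because the minimizer over Fix(T) is known in closed form.

```python
import math


def crammer_singer_hinge(scores, label):
    """Generalized (multiclass) hinge loss for one sample.

    scores[r] is the score of class r; label is the true class index.
    Computes max_r (scores[r] + 1 - [r == label]) - scores[label].
    """
    augmented = [s + (0.0 if r == label else 1.0) for r, s in enumerate(scores)]
    return max(augmented) - scores[label]


def project_unit_ball(u):
    """Toy nonexpansive operator T: projection onto the closed unit ball.

    Its fixed-point set is the ball itself (an assumption for illustration;
    the paper's operator encodes the SVM constraints instead).
    """
    norm = math.sqrt(sum(c * c for c in u))
    if norm <= 1.0:
        return list(u)
    return [c / norm for c in u]


def hybrid_steepest_descent(T, grad, u0, num_iters):
    """Iterate u_{n+1} = T(u_n) - lambda_{n+1} * grad(T(u_n)), lambda_n = 1/n."""
    u = list(u0)
    for n in range(1, num_iters + 1):
        step = 1.0 / n  # slowly vanishing, non-summable step sizes
        y = T(u)
        g = grad(y)
        u = [yi - step * gi for yi, gi in zip(y, g)]
    return u


# Toy objective Theta(u) = 0.5 * ||u - a||^2 with a = (3, 4); its gradient is u - a.
TARGET = [3.0, 4.0]


def grad_theta(u):
    return [ui - ai for ui, ai in zip(u, TARGET)]


if __name__ == "__main__":
    print(crammer_singer_hinge([1.0, 1.5, 0.0], 0))
    print(hybrid_steepest_descent(project_unit_ball, grad_theta, [0.0, 0.0], 10000))
```

In this toy setting the minimizer of Θ over the unit ball is a/‖a‖ = (0.6, 0.8), and the iterates approach it as the step size shrinks. The hierarchical problem in the paper replaces the ball projection with an operator whose fixed-point set characterizes the minimizers of the empirical hinge loss.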

[1]  Ulrich H.-G. Kreßel,et al.  Pairwise classification and support vector machines , 1999 .

[2]  R. Fisher THE USE OF MULTIPLE MEASUREMENTS IN TAXONOMIC PROBLEMS , 1936 .

[3]  Nelly Pustelnik,et al.  A Proximal Approach for Sparse Multiclass SVM , 2015, ArXiv.

[4]  Corinna Cortes,et al.  Support-Vector Networks , 1995, Machine Learning.

[5]  Laurent Condat,et al.  A Primal–Dual Splitting Method for Convex Optimization Involving Lipschitzian, Proximable and Linear Composite Terms , 2012, Journal of Optimization Theory and Applications.

[6]  Yann Guermeur,et al.  Combining Discriminant Models with New Multi-Class SVMs , 2002, Pattern Analysis & Applications.

[7]  Jason Weston,et al.  Multi-Class Support Vector Machines , 1998 .

[8]  Masao Yamagishi,et al.  Hierarchical Convex Optimization by the Hybrid Steepest Descent Method with Proximal Splitting Operators—Enhancements of SVM and Lasso , 2022, Splitting Algorithms, Modern Operator Theory, and Applications.

[9]  Heinz H. Bauschke,et al.  Projection algorithms and monotone operators , 1996 .

[10]  Bernhard Schölkopf,et al.  Comparison of View-Based Object Recognition Algorithms Using Realistic 3D Models , 1996, ICANN.

[11]  Xiaotong Shen,et al.  On L1-Norm Multiclass Support Vector Machines , 2007 .

[12]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[13]  Koby Crammer,et al.  On the Algorithmic Implementation of Multiclass Kernel-based Vector Machines , 2002, J. Mach. Learn. Res.

[14]  I. Yamada The Hybrid Steepest Descent Method for the Variational Inequality Problem over the Intersection of Fixed Point Sets of Nonexpansive Mappings , 2001 .

[15]  Isao Yamada,et al.  Nonstrictly Convex Minimization over the Bounded Fixed Point Set of a Nonexpansive Mapping , 2003 .

[16]  Bang Công Vu,et al.  A splitting algorithm for dual monotone inclusions involving cocoercive operators , 2011, Advances in Computational Mathematics.

[17]  Yufeng Liu,et al.  Multicategory ψ-Learning , 2006 .

[18]  Isao Yamada,et al.  Minimizing the Moreau Envelope of Nonsmooth Convex Functions over the Fixed Point Set of Certain Quasi-Nonexpansive Mappings , 2011, Fixed-Point Algorithms for Inverse Problems in Science and Engineering.

[19]  Dimitri P. Bertsekas,et al.  On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators , 1992, Math. Program.

[20]  Keiji Tatsumi,et al.  Performance evaluation of multiobjective multiclass support vector machines maximizing geometric margins , 2011 .

[21]  Heinz H. Bauschke,et al.  Convex Analysis and Monotone Operator Theory in Hilbert Spaces , 2011, CMS Books in Mathematics.