Decomposition techniques for training linear programming support vector machines

In this paper, we propose three decomposition techniques for linear programming (LP) problems: (1) Method 1, which decomposes the variables into a working set and a fixed set but leaves the constraints intact; (2) Method 2, which decomposes only the constraints; and (3) Method 3, which decomposes both the variables and the constraints. With Method 1, the objective value is proved to be non-decreasing (non-increasing) for a maximization (minimization) problem, and with Method 2 it is non-increasing (non-decreasing) for a maximization (minimization) problem. Consequently, for Method 3, which combines Methods 1 and 2, the objective value is not guaranteed to be monotonic, and infinite loops are possible. We prove that an infinite loop is resolved if the variables involved in it are not released from the working set, and that Method 3 then converges in a finite number of steps. We apply Methods 1 and 3 to LP support vector machines (SVMs) and discuss a more efficient way of accelerating training: detecting an increase in the number of violations and restoring to the working set the variables that were released at the previous iteration. Through computer experiments on microarray data, which have a huge number of input variables and a small number of constraints, we demonstrate the effectiveness of Method 1 for training the primal LP SVM with linear kernels, and the advantage of Method 3 over Method 1 for nonlinear LP SVMs.
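The variable-decomposition idea behind Method 1 can be illustrated on a generic LP. The sketch below is not the paper's algorithm; it is a minimal illustration under assumed choices (a random feasible LP, a hypothetical alternating working-set rule): at each step the variables outside the working set are frozen, a small LP is solved over the working set only, and, because the current point remains feasible for each subproblem, the objective of this minimization problem is non-increasing, matching the monotonicity property stated above.

```python
import numpy as np
from scipy.optimize import linprog

# Minimal sketch of variable decomposition (Method 1) for
#   min c^T x  s.t.  A x >= b,  x >= 0.
# All constraints are kept; only the variables are split into a
# working set W and a fixed set F (hypothetical alternating rule).
rng = np.random.default_rng(0)
n, m = 6, 4
c = rng.random(n) + 0.5
A = rng.random((m, n))
b = 0.5 * (A @ np.ones(n))        # chosen so that x = 1 is feasible

x = np.ones(n)                    # feasible starting point
obj = [c @ x]
for it in range(4):
    W = (np.arange(n) % 2) == (it % 2)   # alternate halves as working set
    F = ~W
    # Subproblem over x_W with x_F frozen: A_W x_W >= b - A_F x_F.
    rhs = b - A[:, F] @ x[F]
    res = linprog(c[W], A_ub=-A[:, W], b_ub=-rhs, bounds=(0, None))
    if res.success:
        x[W] = res.x
    obj.append(c @ x)

# Each subproblem admits the current x_W, so the objective never increases.
assert all(obj[i + 1] <= obj[i] + 1e-9 for i in range(len(obj) - 1))
```

Method 2 would instead solve over all variables with only a subset of the rows of `A`, and Method 3 would combine the two; as noted above, that combination loses this monotonicity guarantee.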
