Imbalanced data classification using second-order cone programming support vector machines

Learning from imbalanced data sets is an important machine learning challenge, especially for Support Vector Machines (SVM), which assume equal misclassification costs and treat each object independently. Second-order cone programming SVM (SOCP-SVM) instead models each class separately, which makes it an appealing formulation for imbalanced classification. This work presents a novel second-order cone programming (SOCP) formulation based on the LP-SVM principle: the bound on the VC dimension is loosened appropriately using the l∞-norm, and the margin is maximized directly via two margin variables, one associated with each class. A regularization parameter C controls the trade-off between maximizing these two margin variables. The proposed method has two advantages: it achieves better results, since it is designed specifically for imbalanced classification, and it reduces computational complexity, since one conic constraint is eliminated. Experiments on benchmark imbalanced data sets show that our approach achieves the best classification performance compared with the traditional SOCP-SVM formulation and with cost-sensitive formulations of linear SVM.

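The abstract does not spell out the full optimization problems, but the structure it describes can be sketched numerically. The following CVXPY sketch is illustrative only: it implements the traditional SOCP-SVM baseline (minimize ||w||_2 subject to one second-order cone constraint per class, each built from that class's empirical mean and covariance) together with a hypothetical reconstruction of the two-margin, l∞-norm variant described above. The function names, the eta and C values, the toy data, and the exact form of the reconstructed constraints are assumptions made for illustration, not the paper's definitive formulation.

# Illustrative sketch (not from the paper): classical SOCP-SVM and a
# hypothetical reconstruction of the two-margin, l-infinity variant.
import numpy as np
import cvxpy as cp


def class_moments(X, ridge=1e-6):
    """Empirical mean and a square-root factor S of the (regularized) class covariance."""
    mu = X.mean(axis=0)
    S = np.linalg.cholesky(np.cov(X, rowvar=False) + ridge * np.eye(X.shape[1]))
    return mu, S


def socp_svm_classical(X_pos, X_neg, eta_pos=0.8, eta_neg=0.8):
    """Traditional SOCP-SVM: minimize ||w||_2 subject to one second-order cone
    constraint per class. eta_i is the worst-case (Chebyshev-bound) probability
    with which class i must be classified correctly; kappa_i = sqrt(eta_i / (1 - eta_i))."""
    (mu_p, S_p), (mu_n, S_n) = class_moments(X_pos), class_moments(X_neg)
    k_p = np.sqrt(eta_pos / (1 - eta_pos))
    k_n = np.sqrt(eta_neg / (1 - eta_neg))
    w, b = cp.Variable(X_pos.shape[1]), cp.Variable()
    constraints = [
        mu_p @ w - b >= 1 + k_p * cp.norm(S_p.T @ w, 2),
        -(mu_n @ w - b) >= 1 + k_n * cp.norm(S_n.T @ w, 2),
    ]
    cp.Problem(cp.Minimize(cp.norm(w, 2)), constraints).solve()
    return w.value, b.value


def socp_svm_two_margin(X_pos, X_neg, C=1.0, eta_pos=0.8, eta_neg=0.8):
    """Hypothetical sketch of the LP-SVM-style variant from the abstract:
    maximize rho_pos + C * rho_neg under the same class-wise cone constraints,
    with the box ||w||_inf <= 1 replacing the norm-minimization cone
    (so one conic restriction is dropped). The paper's exact constraints may differ."""
    (mu_p, S_p), (mu_n, S_n) = class_moments(X_pos), class_moments(X_neg)
    k_p = np.sqrt(eta_pos / (1 - eta_pos))
    k_n = np.sqrt(eta_neg / (1 - eta_neg))
    w, b = cp.Variable(X_pos.shape[1]), cp.Variable()
    rho_p, rho_n = cp.Variable(nonneg=True), cp.Variable(nonneg=True)
    constraints = [
        mu_p @ w - b >= rho_p + k_p * cp.norm(S_p.T @ w, 2),
        -(mu_n @ w - b) >= rho_n + k_n * cp.norm(S_n.T @ w, 2),
        cp.norm(w, "inf") <= 1,
    ]
    cp.Problem(cp.Maximize(rho_p + C * rho_n), constraints).solve()
    return w.value, b.value


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_maj = rng.normal(loc=+2.0, size=(200, 2))  # majority class
    X_min = rng.normal(loc=-2.0, size=(20, 2))   # minority class
    print(socp_svm_classical(X_maj, X_min))
    print(socp_svm_two_margin(X_maj, X_min, C=2.0))

In the reconstructed variant, bounding w with the l∞-norm box takes the place of the norm-minimization cone, which is consistent with the abstract's claim that one conic restriction is eliminated, and the parameter C weights one class's margin variable against the other's.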