Dynamic class imbalance learning for incremental LPSVM

Linear Proximal Support Vector Machines (LPSVMs), like decision trees and classic SVMs, were not originally designed to handle drifting data streams that exhibit high and varying degrees of class imbalance. For online classification of data streams with imbalanced class distributions, we propose a dynamic class imbalance learning (DCIL) approach to incremental LPSVM (IncLPSVM) modeling. To this end, we reduce a weighted LPSVM, which is computationally non-renewable, to several core matrices multiplied by two simple weight coefficients. When data are added and/or retired, the proposed DCIL-IncLPSVM accommodates the newly presented class imbalance through a simple update of the matrices and coefficients, while ensuring that no discriminative information is lost during learning. Experiments on benchmark datasets indicate that DCIL-IncLPSVM outperforms classic IncSVM and IncLPSVM in terms of F-measure and G-mean. Moreover, our application to online face membership authentication shows that DCIL-IncLPSVM remains effective in the presence of highly dynamic class imbalance, which usually poses serious problems for previous approaches.
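The idea of keeping a weighted LPSVM as a few core matrices scaled by two class weights can be illustrated with a minimal sketch. It is not the paper's exact formulation: it assumes the standard proximal SVM normal equations, (I/ν + Eᵀ W E) z = Eᵀ W D e with E = [A, −e] and z = [w; γ], and it assumes inverse class frequency as the two weight coefficients. The class name `WeightedIncLPSVM` and its methods are hypothetical.

```python
import numpy as np


class WeightedIncLPSVM:
    """Sketch: class-weighted linear proximal SVM stored as per-class statistics."""

    def __init__(self, n_features, nu=1.0):
        d = n_features + 1  # augmented dimension: weights w plus offset gamma
        self.nu = nu
        # Per-class "core" statistics, keyed by label +1 / -1.
        self.M = {+1: np.zeros((d, d)), -1: np.zeros((d, d))}  # E_c^T E_c
        self.v = {+1: np.zeros(d), -1: np.zeros(d)}            # E_c^T e (column sums)
        self.n = {+1: 0, -1: 0}                                # class counts

    @staticmethod
    def _augment(X):
        X = np.atleast_2d(np.asarray(X, dtype=float))
        return np.hstack([X, -np.ones((X.shape[0], 1))])  # E = [A, -e]

    def _update(self, X, y, sign):
        # sign = +1 adds samples, sign = -1 retires previously added samples.
        E = self._augment(X)
        y = np.asarray(y)
        for c in (+1, -1):
            Ec = E[y == c]
            self.M[c] += sign * (Ec.T @ Ec)
            self.v[c] += sign * Ec.sum(axis=0)
            self.n[c] += sign * Ec.shape[0]

    def add(self, X, y):
        self._update(X, y, +1)

    def retire(self, X, y):
        self._update(X, y, -1)

    def solve(self):
        # Assumption: the two weight coefficients are inverse class frequencies,
        # recomputed from the current counts so they track the changing imbalance.
        s = {c: 1.0 / max(self.n[c], 1) for c in (+1, -1)}
        d = self.M[+1].shape[0]
        lhs = np.eye(d) / self.nu + s[+1] * self.M[+1] + s[-1] * self.M[-1]
        rhs = s[+1] * self.v[+1] - s[-1] * self.v[-1]
        z = np.linalg.solve(lhs, rhs)
        return z[:-1], z[-1]  # (w, gamma)

    def predict(self, X):
        w, gamma = self.solve()
        X = np.atleast_2d(np.asarray(X, dtype=float))
        return np.where(X @ w - gamma >= 0, 1, -1)
```

Because only the per-class sums are stored, adding or retiring a batch costs one small matrix update, and the class weights are recomputed from the current counts at solve time. In this sketch, retiring a batch yields the same statistics, up to floating-point rounding, as never having added it.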
