Class imbalance robust incremental LPSVM for data streams learning

Linear Proximal Support Vector Machines (LPSVM), like decision trees and the classic SVM, were not originally designed to handle drifting data streams that exhibit high and varying degrees of class imbalance. For online classification of data streams with an imbalanced class distribution, we propose an incremental LPSVM, termed DCIL-IncLPSVM, whose learning performance is robust to class imbalance. To this end, we reformulate the weighted LPSVM, which cannot itself be updated incrementally, as several core matrices multiplied by two simple weight coefficients. When data are added and/or retired, the proposed DCIL-IncLPSVM accommodates the current class imbalance through a simple update of these matrices and coefficients, while ensuring that no discriminative information is lost during the learning process. Experiments on benchmark datasets indicate that the proposed DCIL-IncLPSVM outperforms batch SVM and LPSVM in terms of the F-measure, relative sensitivity, and G-mean metrics. Moreover, our application to online face membership authentication shows that the proposed DCIL-IncLPSVM remains effective in the presence of highly dynamic class imbalance, which usually poses serious problems for the classic incremental SVM (IncSVM) and incremental LPSVM (IncLPSVM).
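The idea of keeping per-class core matrices and recombining them with two weight coefficients can be illustrated with a minimal sketch. It is not the DCIL-IncLPSVM algorithm itself. It assumes the standard linear proximal SVM solution of Fung and Mangasarian, z = (I/nu + E^T N E)^{-1} E^T N D e with E = [A, -e], and restricts the weight matrix N to one coefficient per class. The class name `WeightedIncLPSVM`, the inverse-class-count weights, and the parameter `nu` are assumptions made for illustration only.

```python
import numpy as np


class WeightedIncLPSVM:
    """Illustrative class-weighted incremental linear proximal SVM.

    Hypothetical sketch: the weighting rule (1 / class count) is an
    assumption, not the scheme defined in the paper.
    """

    def __init__(self, dim, nu=1.0):
        self.nu = nu
        self.d = dim + 1  # features plus one bias column
        # Per-class core statistics: E_c^T E_c, E_c^T e, and sample count.
        self.M = {c: np.zeros((self.d, self.d)) for c in (+1, -1)}
        self.v = {c: np.zeros(self.d) for c in (+1, -1)}
        self.n = {+1: 0, -1: 0}

    @staticmethod
    def _augment(X):
        # Build E = [A, -e] as in the proximal SVM formulation.
        X = np.atleast_2d(np.asarray(X, dtype=float))
        return np.hstack([X, -np.ones((X.shape[0], 1))])

    def _update(self, X, y, sign):
        E = self._augment(X)
        y = np.asarray(y)
        for c in (+1, -1):
            Ec = E[y == c]
            self.M[c] += sign * (Ec.T @ Ec)
            self.v[c] += sign * Ec.sum(axis=0)
            self.n[c] += sign * Ec.shape[0]

    def add(self, X, y):
        """Data addition: accumulate the new samples' statistics."""
        self._update(X, y, +1)

    def retire(self, X, y):
        """Data retirement: subtract previously added samples."""
        self._update(X, y, -1)

    def solve(self):
        # Two scalar class weights, recomputed from the current counts.
        s_pos = 1.0 / max(self.n[+1], 1)
        s_neg = 1.0 / max(self.n[-1], 1)
        lhs = np.eye(self.d) / self.nu + s_pos * self.M[+1] + s_neg * self.M[-1]
        rhs = s_pos * self.v[+1] - s_neg * self.v[-1]
        z = np.linalg.solve(lhs, rhs)
        return z[:-1], z[-1]  # weight vector w and offset gamma

    def predict(self, X):
        w, gamma = self.solve()
        return np.where(np.asarray(X, dtype=float) @ w - gamma >= 0, 1, -1)
```

Because the class weights enter only as scalars multiplying the stored matrices, changing the imbalance ratio never requires revisiting old samples: `add` and `retire` touch the core statistics, and `solve` recombines them with the current coefficients. In this sketch, adding the data in chunks yields the same solution as a single batch fit, up to floating-point error.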
