Oversampling the Minority Class in the Feature Space

The imbalanced nature of some real-world data is one of the current challenges for machine learning researchers. One common approach oversamples the minority class through convex combination of its patterns. We explore the general idea of synthetic oversampling in the feature space induced by a kernel function (as opposed to input space). If the kernel function matches the underlying problem, the classes will be linearly separable and synthetically generated patterns will lie on the minority class region. Since the feature space is not directly accessible, we use the empirical feature space (EFS) (a Euclidean space isomorphic to the feature space) for oversampling purposes. The proposed method is framed in the context of support vector machines, where the imbalanced data sets can pose a serious hindrance. The idea is investigated in three scenarios: 1) oversampling in the full and reduced-rank EFSs; 2) a kernel learning technique maximizing the data class separation to study the influence of the feature space structure (implicitly defined by the kernel function); and 3) a unified framework for preferential oversampling that spans some of the previous approaches in the literature. We support our investigation with extensive experiments over 50 imbalanced data sets.

[1]  Stan Matwin,et al.  Addressing the Curse of Imbalanced Training Sets: One-Sided Selection , 1997, ICML.

[2]  Edward Y. Chang,et al.  Adaptive Feature-Space Conformal Transformation for Imbalanced-Data Learning , 2003, ICML.

[3]  V. Milman,et al.  Concentration Property on Probability Spaces , 2000 .

[4]  Bernhard Schölkopf,et al.  The connection between regularization operators and support vector kernels , 1998, Neural Networks.

[5]  Remo Guidieri Res , 1995, RES: Anthropology and Aesthetics.

[6]  Kazuyuki Murase,et al.  A Novel Synthetic Minority Oversampling Technique for Imbalanced Data Set Learning , 2011, ICONIP.

[7]  Xin Yao,et al.  MWMOTE--Majority Weighted Minority Oversampling Technique for Imbalanced Data Set Learning , 2014 .

[8]  Haibo He,et al.  ADASYN: Adaptive synthetic sampling approach for imbalanced learning , 2008, 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence).

[9]  Gaël Richard,et al.  Multiclass Feature Selection With Kernel Gram-Matrix-Based Criteria , 2012, IEEE Transactions on Neural Networks and Learning Systems.

[10]  J. Miller Numerical Analysis , 1966, Nature.

[11]  Sungzoon Cho,et al.  EUS SVMs: Ensemble of Under-Sampled SVMs for Data Imbalance Problems , 2006, ICONIP.

[12]  Janez Demsar,et al.  Statistical Comparisons of Classifiers over Multiple Data Sets , 2006, J. Mach. Learn. Res..

[13]  Bernhard E. Boser,et al.  A training algorithm for optimal margin classifiers , 1992, COLT '92.

[14]  Sheng Chen,et al.  A Kernel-Based Two-Class Classifier for Imbalanced Data Sets , 2007, IEEE Transactions on Neural Networks.

[15]  Yanqing Zhang,et al.  SVMs Modeling for Highly Imbalanced Classification , 2009, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[16]  Xiangji Huang,et al.  Boosting Prediction Accuracy on Imbalanced Datasets with SVM Ensembles , 2006, PAKDD.

[17]  Corinna Cortes,et al.  Support-Vector Networks , 1995, Machine Learning.

[18]  JapkowiczNathalie,et al.  The class imbalance problem: A systematic study , 2002 .

[19]  Ji Gao,et al.  Improving SVM Classification with Imbalance Data Set , 2009, ICONIP.

[20]  Anthony Widjaja,et al.  Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond , 2003, IEEE Transactions on Neural Networks.

[21]  Edward Y. Chang,et al.  KBA: kernel boundary alignment considering imbalanced data distribution , 2005, IEEE Transactions on Knowledge and Data Engineering.

[22]  Nathalie Japkowicz,et al.  The class imbalance problem: A systematic study , 2002, Intell. Data Anal..

[23]  Chumphol Bunkhumpornpat,et al.  Safe-Level-SMOTE: Safe-Level-Synthetic Minority Over-Sampling TEchnique for Handling the Class Imbalanced Problem , 2009, PAKDD.

[24]  O. Chapelle Second order optimization of kernel parameters , 2008 .

[25]  Mehryar Mohri,et al.  Algorithms for Learning Kernels Based on Centered Alignment , 2012, J. Mach. Learn. Res..

[26]  Gunnar Rätsch,et al.  Input space versus feature space in kernel-based methods , 1999, IEEE Trans. Neural Networks.

[27]  Christian Igel,et al.  Empirical evaluation of the improved Rprop learning algorithms , 2003, Neurocomputing.

[28]  Nitesh V. Chawla,et al.  SMOTE: Synthetic Minority Over-sampling Technique , 2002, J. Artif. Intell. Res..

[29]  Bo Zhang,et al.  Learning concepts from large scale imbalanced data sets using support cluster machines , 2006, MM '06.

[30]  Huilin Xiong,et al.  A Unified Framework for Kernelization: The Empirical Kernel Feature Space , 2009, 2009 Chinese Conference on Pattern Recognition.

[31]  Joachim M. Buhmann,et al.  On Relevant Dimensions in Kernel Feature Spaces , 2008, J. Mach. Learn. Res..

[32]  Shai Ben-David,et al.  Learning Bounds for Support Vector Machines with Learned Kernels , 2006, COLT.

[33]  R. Barandelaa,et al.  Strategies for learning in class imbalance problems , 2003, Pattern Recognit..

[34]  N. Cristianini,et al.  On Kernel-Target Alignment , 2001, NIPS.

[35]  Francisco Herrera,et al.  EUSBoost: Enhancing ensembles for highly imbalanced data-sets by evolutionary undersampling , 2013, Pattern Recognit..

[36]  Peter E. Hart,et al.  Nearest neighbor pattern classification , 1967, IEEE Trans. Inf. Theory.

[37]  Seetha Hari,et al.  Learning From Imbalanced Data , 2019, Advances in Computer and Electrical Engineering.

[38]  Francisco Herrera,et al.  A Review on Ensembles for the Class Imbalance Problem: Bagging-, Boosting-, and Hybrid-Based Approaches , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).

[39]  Lars Schmidt-Thieme,et al.  Cost-sensitive learning methods for imbalanced data , 2010, The 2010 International Joint Conference on Neural Networks (IJCNN).

[40]  Stephen Kwek,et al.  Applying Support Vector Machines to Imbalanced Datasets , 2004, ECML.

[41]  M. Omair Ahmad,et al.  Optimizing the kernel in the empirical feature space , 2005, IEEE Transactions on Neural Networks.

[42]  Nello Cristianini,et al.  Learning the Kernel Matrix with Semidefinite Programming , 2002, J. Mach. Learn. Res..

[43]  Ivor W. Tsang,et al.  The pre-image problem in kernel methods , 2003, IEEE Transactions on Neural Networks.

[44]  Hien M. Nguyen,et al.  Borderline over-sampling for imbalanced data classification , 2009, Int. J. Knowl. Eng. Soft Data Paradigms.

[45]  Shigeo Abe,et al.  Sparse Least Squares Support Vector Regressors Trained in the Reduced Empirical Feature Space , 2007, ICANN.