Combination approach of SMOTE and biased-SVM for imbalanced datasets

A new approach to constructing classifiers from imbalanced datasets is proposed by combining SMOTE (synthetic minority over-sampling technique) and biased-SVM (biased support vector machine). A dataset is imbalanced if the classification categories are not approximately equally represented. Real-world datasets are often predominantly composed of "normal" examples with only a small percentage of "abnormal" or "interesting" examples, and the cost of misclassifying an abnormal (interesting) example as normal is often much higher than that of the reverse error. Over-sampling the minority class with SMOTE is a known means of increasing a classifier's sensitivity to that class. This paper instead increases sensitivity by applying SMOTE within the support vectors: two over-sampling algorithms are proposed in which a support vector is over-sampled using its k nearest neighbors, drawn either from the support vectors alone or from the entire minority class. Experimental results confirm that the proposed combination of SMOTE and biased-SVM achieves better classifier performance.
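The core idea described above can be sketched in code. The following is a minimal, hypothetical illustration (not the authors' implementation) using scikit-learn: a class-weighted SVM stands in for the biased-SVM, and SMOTE-style interpolation is then restricted to the minority-class support vectors, with neighbors drawn from among the support vectors themselves. The function name, dataset, and all parameter values are assumptions chosen for demonstration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

def smote_on_support_vectors(X, y, minority=1, k=3, n_new=50, seed=0):
    """Sketch: SMOTE-style over-sampling restricted to minority-class
    support vectors of a class-weighted (biased) SVM."""
    rng = np.random.default_rng(seed)
    # "Biased" SVM approximated via a heavier penalty on the minority class.
    svm = SVC(kernel="linear", class_weight={minority: 5.0}).fit(X, y)
    # Keep only the support vectors that belong to the minority class.
    sv = svm.support_vectors_[y[svm.support_] == minority]
    k_eff = min(k, len(sv) - 1)  # guard against few support vectors
    synth = []
    for _ in range(n_new):
        i = rng.integers(len(sv))
        # k nearest neighbors of this support vector, within the minority SVs.
        d = np.linalg.norm(sv - sv[i], axis=1)
        nbrs = np.argsort(d)[1:k_eff + 1]
        j = rng.choice(nbrs)
        # SMOTE interpolation: a random point on the segment between the two.
        synth.append(sv[i] + rng.random() * (sv[j] - sv[i]))
    X_new = np.vstack([X, synth])
    y_new = np.concatenate([y, np.full(len(synth), minority)])
    return X_new, y_new

# Toy imbalanced dataset (about 10% minority class).
X, y = make_classification(n_samples=300, weights=[0.9, 0.1], random_state=0)
X2, y2 = smote_on_support_vectors(X, y)
```

The augmented set (X2, y2) would then be used to retrain the biased-SVM; the paper's second variant differs only in drawing the k nearest neighbors from the entire minority class rather than from the support vectors.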