Parameter Analysis of Differential Evolution Based Oversampling Approach for Highly Imbalanced Datasets

Nowadays, almost all activities are recorded in databases. Data mining methods such as classifiers use these datasets to discover hidden patterns and rules. Classification methods are generally developed under the assumption that the datasets are approximately balanced. However, imbalanced datasets, whose classes contain unequal numbers of instances, emerge as a common problem in most real-world domains. Many data-level approaches have been proposed to enable better classification of imbalanced datasets. Differential Evolution Based Oversampling Approach for Highly Imbalanced Datasets (DEBOHID) is one of the methods proposed to handle this issue. The DEBOHID approach uses the crossover and mutation processes of differential evolution (DE) to generate new synthetic samples. The parameters used by the crossover and mutation processes affect the solution quality. Therefore, this study investigates the solution quality that DEBOHID achieves on highly imbalanced datasets for different values of the crossover and mutation parameters. Experimental studies are carried out using three classifiers and two evaluation metrics for the different parameter values. The obtained results are compared with those of well-known approaches in the literature.
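To make the role of the two parameters concrete, the following is a minimal sketch of how DE mutation and crossover can produce synthetic minority-class samples. It assumes the classic DE/rand/1 mutation with binomial crossover; the abstract does not specify which DE variant DEBOHID uses, so this should not be read as the authors' exact procedure. The function name `de_oversample` and the parameter names `F` (mutation scale factor) and `CR` (crossover rate) are illustrative choices.

```python
import random


def de_oversample(minority, n_new, F=0.5, CR=0.9, seed=0):
    """Generate n_new synthetic samples from minority-class rows.

    Assumption: DE/rand/1 mutation followed by binomial crossover.
    This is an illustrative sketch, not necessarily DEBOHID's scheme.
    """
    if len(minority) < 4:
        raise ValueError("need at least 4 minority samples")
    rng = random.Random(seed)
    dim = len(minority[0])
    synthetic = []
    for _ in range(n_new):
        # Pick a target sample and three other samples, all distinct.
        i, r1, r2, r3 = rng.sample(range(len(minority)), 4)
        target, a, b, c = minority[i], minority[r1], minority[r2], minority[r3]
        # Mutation: v = a + F * (b - c); F scales the difference vector.
        mutant = [a[d] + F * (b[d] - c[d]) for d in range(dim)]
        # Binomial crossover: take each mutant component with probability CR.
        # j_rand guarantees at least one component comes from the mutant.
        j_rand = rng.randrange(dim)
        trial = [
            mutant[d] if (rng.random() < CR or d == j_rand) else target[d]
            for d in range(dim)
        ]
        synthetic.append(trial)
    return synthetic


if __name__ == "__main__":
    # Toy minority class with two features (made-up values).
    minority = [[1.0, 2.0], [2.0, 3.0], [3.0, 1.0], [4.0, 4.0], [5.0, 2.5]]
    for sample in de_oversample(minority, n_new=3, F=0.5, CR=0.9, seed=1):
        print(sample)
```

In this sketch, a larger `F` pushes synthetic samples farther from existing ones, while a larger `CR` makes each synthetic sample inherit more components from the mutant vector than from the target sample, which is why both values can be expected to influence the quality of the oversampled dataset.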