Kd-Tree Based Efficient Ensemble Classification Algorithm for Imbalanced Learning

Ensemble learning combined with resampling is an effective approach to imbalanced classification problems. However, it is poorly suited to large-scale imbalanced data because of its slow training and high computational cost. In this paper, an efficient kd-tree based ensemble classification method for imbalanced data is proposed. First, a kd-tree is constructed from the training data and used to quickly find the k-nearest neighbors of each instance. Then, under-sampling is applied repeatedly until the difference in intra-class coherence between the classes is zero. During under-sampling, the constructed kd-tree is used to compute the intra-class coherence. Ensemble learning is then performed on the resulting balanced dataset. The experimental results show the effectiveness of the proposed method, especially on large-scale imbalanced data.
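
As a rough illustration of the first two steps, the following sketch builds a kd-tree over the training data and uses it for k-nearest-neighbor queries. The abstract does not define intra-class coherence, so the measure used here (the average fraction of an instance's k nearest neighbors that share its class label) is only an assumption made for illustration, and the function names `build_kdtree`, `knn`, and `intra_class_coherence` are hypothetical. The under-sampling rule and the ensemble step are not shown.

```python
import heapq
from collections import defaultdict


def build_kdtree(points, depth=0):
    """Build a kd-tree from a list of (coords, label, index) tuples."""
    if not points:
        return None
    axis = depth % len(points[0][0])          # cycle through the dimensions
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2                    # split at the median point
    return (points[mid], axis,
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))


def knn(node, query, k, heap=None):
    """Return a heap of (-squared_distance, index, label) for the k nearest points."""
    if heap is None:
        heap = []
    if node is None:
        return heap
    (coords, label, idx), axis, left, right = node
    d2 = sum((a - b) ** 2 for a, b in zip(coords, query))
    item = (-d2, idx, label)
    if len(heap) < k:
        heapq.heappush(heap, item)
    elif item > heap[0]:                      # closer than the current worst
        heapq.heapreplace(heap, item)
    diff = query[axis] - coords[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    knn(near, query, k, heap)
    # Visit the far side only if it could still hold a closer point.
    if len(heap) < k or diff * diff < -heap[0][0]:
        knn(far, query, k, heap)
    return heap


def intra_class_coherence(X, y, k=3):
    """ASSUMED measure: per class, the mean fraction of each instance's
    k nearest neighbors (excluding itself) that carry the same label."""
    tree = build_kdtree([(tuple(x), lab, i)
                         for i, (x, lab) in enumerate(zip(X, y))])
    per_class = defaultdict(list)
    for i, (x, lab) in enumerate(zip(X, y)):
        # Query k + 1 points because the instance itself is in the tree.
        labels = [l for _, j, l in knn(tree, x, k + 1) if j != i][:k]
        per_class[lab].append(sum(l == lab for l in labels) / len(labels))
    return {lab: sum(v) / len(v) for lab, v in per_class.items()}


# Toy imbalanced data: four majority instances (0), two minority instances (1).
X = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11)]
y = [0, 0, 0, 0, 1, 1]
coherence = intra_class_coherence(X, y, k=3)
print(coherence)
```

On this toy data the minority class has the lower coherence under the assumed measure, because each minority instance has only one same-class neighbor. In the proposed method, a gap of this kind is what repeated under-sampling is meant to drive to zero; how instances are selected for removal is not specified in the abstract and is therefore left out of the sketch.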