Classification of Imbalanced Dataset Based on Random Walk Model

Received: 8 January 2019. Accepted: 21 March 2019.

Classifiers often perform poorly on imbalanced datasets. To address this problem, this paper develops a classification method for imbalanced datasets, denoted as the IRWM, based on the random walk model (RWM). Firstly, positive examples and negative examples were designated according to the imbalance of the training data. Then, the training data were mapped separately into one random walk graph (RWG) for the positive examples and another for the negative examples. Each unclassified data point was then walked separately in the two RWGs, yielding two probabilities. After that, the two probabilities were compared against a preset comparison coefficient to determine the final class of the data point. The proposed method was compared with the support vector machine (SVM) algorithm through experiments. The results show that the proposed method can effectively classify data in imbalanced datasets.
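The two-graph decision rule described above can be illustrated with a minimal sketch. The abstract does not specify how the RWGs are built or how the walk probabilities are computed, so everything below is an assumption made for illustration rather than the paper's actual construction: the Gaussian similarity weights, the lazy random walk, the use of the stationary mass on the query node as the "probability", and the names `walk_score`, `classify`, `sigma`, and `coeff` are all hypothetical.

```python
import numpy as np


def walk_score(class_points, query, sigma=1.0, n_iter=200):
    """Hypothetical score: stationary mass a random walk places on the query
    node in a graph built from one class's training points plus the query."""
    nodes = np.vstack([class_points, query])          # query is the last node
    # Pairwise squared Euclidean distances between all nodes.
    sq_dist = ((nodes[:, None, :] - nodes[None, :, :]) ** 2).sum(axis=2)
    # Assumed edge weights: Gaussian similarity (tiny floor avoids empty rows).
    weights = np.exp(-sq_dist / (2.0 * sigma ** 2)) + 1e-12
    np.fill_diagonal(weights, 0.0)                    # no self-loops
    transition = weights / weights.sum(axis=1, keepdims=True)
    # Lazy walk (stay put with probability 0.5) so power iteration converges.
    lazy = 0.5 * np.eye(len(nodes)) + 0.5 * transition
    pi = np.full(len(nodes), 1.0 / len(nodes))
    for _ in range(n_iter):
        pi = pi @ lazy
    # Rescale by graph size so classes of different sizes are comparable.
    return pi[-1] * len(nodes)


def classify(pos_points, neg_points, query, coeff=1.0):
    """Return 1 (positive) if the positive-graph score is at least `coeff`
    times the negative-graph score, else 0. `coeff` stands in for the
    preset comparison coefficient; its real definition is not given here."""
    p_pos = walk_score(pos_points, query)
    p_neg = walk_score(neg_points, query)
    return 1 if p_pos >= coeff * p_neg else 0


# Toy imbalanced data: 3 positive points, 6 negative points.
pos = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3]])
neg = np.array([[5.0, 5.0], [5.2, 5.1], [4.9, 5.3],
                [5.1, 4.8], [4.8, 5.0], [5.3, 5.2]])
label_near_pos = classify(pos, neg, np.array([0.1, 0.1]))
label_near_neg = classify(pos, neg, np.array([5.0, 5.1]))
```

In this sketch, lowering `coeff` below 1 would bias decisions toward the negative class and raising it would favor the positive class, which is one plausible way a comparison coefficient could compensate for class imbalance.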
