Paper Information - Self-paced Learning for Imbalanced Data

Self-paced Learning for Imbalanced Data

In this paper, we propose a novel training paradigm that combines two learning strategies: cost-sensitive learning and self-paced learning. The approach applies to decision problems in which highly imbalanced data is used during the training process. The main idea is to start the learning process with a large number of minority examples and only the easiest majority examples, and then to gradually turn to more difficult cases. We examine the quality of this training paradigm compared to other learning schemes for a neural network model, using a set of highly imbalanced benchmark datasets.
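The selection idea in the abstract can be illustrated with a minimal sketch. This is not the authors' formulation: it assumes the common hard-threshold self-paced rule (an example is used when its current loss is below a threshold) and makes it cost-sensitive by giving each class its own threshold. The function name `select_examples`, the label encoding (1 = minority, 0 = majority), the threshold values, and the multiplicative growth schedule are all hypothetical choices made for illustration.

```python
import numpy as np


def select_examples(losses, labels, lam_majority, lam_minority):
    """Return a 0/1 mask: an example is used when its loss is below its class threshold.

    Assumed encoding: label 1 = minority class, label 0 = majority class.
    """
    losses = np.asarray(losses, dtype=float)
    labels = np.asarray(labels)
    # Pick the per-example threshold according to the example's class.
    thresholds = np.where(labels == 1, lam_minority, lam_majority)
    return (losses < thresholds).astype(int)


# Toy per-example losses: four majority examples followed by two minority examples.
losses = np.array([0.1, 0.4, 0.9, 1.5, 0.8, 2.0])
labels = np.array([0, 0, 0, 0, 1, 1])

# Hypothetical schedule: a generous minority threshold from the start,
# and a strict majority threshold that is relaxed at every step.
lam_majority, lam_minority, growth = 0.5, 2.5, 2.0

masks = []
for step in range(3):
    masks.append(select_examples(losses, labels, lam_majority, lam_minority))
    lam_majority *= growth  # admit harder majority examples next time

for step, mask in enumerate(masks):
    print(step, mask.tolist())
```

In this toy run the losses are held fixed so that the effect of the schedule is visible in isolation. In an actual training loop the model would be updated on the selected examples and the losses recomputed before each new selection.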

Jakub M. Tomczak | Maciej Zieba | Jerzy Swiatek
