Density-based weighting for imbalanced regression

In many real-world settings, imbalanced data impairs the performance of learning algorithms such as neural networks, particularly for rare cases. This is especially problematic for tasks that focus on these rare occurrences. When estimating precipitation, for example, extreme rainfall events are scarce but important given their potential consequences. While numerous well-studied solutions exist for classification, most of them cannot easily be transferred to regression. Among the few solutions for regression tasks, hardly any explore cost-sensitive learning, even though it is known to have advantages over sampling-based methods in classification. In this work, we propose DenseWeight, a sample weighting approach for imbalanced regression datasets, and DenseLoss, a cost-sensitive learning approach for neural network regression with imbalanced data that builds on our weighting scheme. DenseWeight weights data points according to the rarity of their target values, estimated through kernel density estimation (KDE). DenseLoss adjusts each data point's influence on the loss according to DenseWeight, giving rare data points more influence on model training than common ones. We show on multiple differently distributed datasets that DenseLoss significantly improves model performance for rare data points through its density-based weighting scheme. We also compare DenseLoss to the state-of-the-art method SMOGN and find that our method mostly yields better performance. Our approach offers more control over model training, as a single hyperparameter determines the trade-off between focusing on common or rare cases, allowing the training of better models for rare data points.
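
To illustrate the idea, the Python sketch below weights training samples by the rarity of their target values and plugs those weights into a weighted MSE. The function names (dense_weight, dense_loss), the use of SciPy's gaussian_kde, the min-max normalization of the density, and the weighting rule max(1 - alpha * p'(y), eps) are illustrative assumptions rather than the paper's exact formulation:

    import numpy as np
    from scipy.stats import gaussian_kde

    def dense_weight(y_train, alpha=1.0, eps=1e-6):
        # Estimate the target density p(y) on the training targets via KDE.
        y = np.asarray(y_train, dtype=float)
        dens = gaussian_kde(y)(y)
        # Min-max normalize the density to [0, 1] so alpha has a consistent
        # scale across datasets; the tiny constant guards a constant density.
        dens = (dens - dens.min()) / (dens.max() - dens.min() + 1e-12)
        # Rare targets (low density) get large weights, common targets small
        # ones; eps keeps every weight strictly positive (assumed rule).
        w = np.maximum(1.0 - alpha * dens, eps)
        # Rescale to mean 1 so the overall loss magnitude stays comparable.
        return w / w.mean()

    def dense_loss(y_true, y_pred, weights):
        # Weighted MSE: each sample's squared error is scaled by its weight.
        diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
        return np.mean(weights * diff ** 2)

With alpha = 0 all weights equal 1 and training reduces to ordinary unweighted regression; increasing alpha shifts the loss toward rare target values, mirroring the single trade-off hyperparameter described above.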
