Cost-Sensitive Weighting and Imbalance-Reversed Bagging for Streaming Imbalanced and Concept Drifting in Electricity Pricing Classification

In data streaming environments such as a smart grid, it is impossible to restrict each data chunk to have the same number of samples in each class. Hence, in addition to concept drift, classification problems in streaming data environments are inherently imbalanced. However, streaming imbalanced and concept drifting problems in power systems and smart grids have rarely been studied. Incremental learning aims to learn the correct classification of future unseen samples from the given streaming data. In this paper, we propose a new incremental ensemble learning method to handle both concept drift and class imbalance. The class imbalance issue is tackled by an imbalance-reversed bagging method that improves the true positive rate while maintaining a low false positive rate. Adaptation to concept drift is achieved by a dynamic cost-sensitive weighting scheme that weights component classifiers according to their classification performance and stochastic sensitivity. The proposed method is applied to a case study of electricity pricing in Australia to predict whether the price in New South Wales will be higher or lower than that in Victoria over a 24-h period. Experimental results show the effectiveness of the proposed algorithm, with statistical significance, in comparison with state-of-the-art incremental learning methods.
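The two ideas named in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names, the `ratio` parameter, and the misclassification costs are all hypothetical, and the weight below uses only a cost-weighted error, leaving out the stochastic sensitivity term that the abstract mentions. The first function builds bags for one data chunk in which the minority class outnumbers the majority class (the class imbalance is reversed); the second turns a component classifier's errors on a chunk into a voting weight, penalizing missed minority samples more heavily.

```python
import random


def imbalance_reversed_bags(labels, n_bags=5, ratio=2.0, minority=1, seed=0):
    """Return lists of sample indices, one list per bag.

    Each bag holds a bootstrap of the minority indices plus a smaller random
    subset of majority indices, so the minority class outnumbers the majority
    class by roughly `ratio` (an assumed, illustrative parameter).
    """
    rng = random.Random(seed)
    min_idx = [i for i, y in enumerate(labels) if y == minority]
    maj_idx = [i for i, y in enumerate(labels) if y != minority]
    # Number of majority samples per bag: about len(minority) / ratio.
    n_maj = min(len(maj_idx), max(1, int(len(min_idx) / ratio)))
    bags = []
    for _ in range(n_bags):
        bag = [rng.choice(min_idx) for _ in min_idx]  # bootstrap the minority
        bag += rng.sample(maj_idx, n_maj)             # undersample the majority
        bags.append(bag)
    return bags


def cost_sensitive_weight(y_true, y_pred, fn_cost=5.0, fp_cost=1.0, minority=1):
    """Map a classifier's errors on the latest chunk to a weight in [0, 1].

    A missed minority sample costs `fn_cost`; a false alarm costs `fp_cost`.
    Both costs are assumptions chosen for illustration only.
    """
    cost = 0.0
    for t, p in zip(y_true, y_pred):
        if t == minority and p != minority:
            cost += fn_cost
        elif t != minority and p == minority:
            cost += fp_cost
    # Worst case: every sample is misclassified.
    max_cost = sum(fn_cost if t == minority else fp_cost for t in y_true)
    return 1.0 - cost / max_cost if max_cost else 0.0
```

In a chunk-by-chunk loop, one classifier would be trained per bag, and the weights of all stored classifiers would be recomputed on each new chunk, so that classifiers fitted to an outdated concept lose influence in the ensemble vote.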
