Multi-Stage Prediction for Zero-Inflated Hurricane Induced Power Outages

Predicting hurricane power outages facilitates disaster response decision-making by electric power utilities and by other organizations of critical importance to society. Predictive models can be built with statistical learning methods that use data from past hurricanes to capture the effects of climatological, geographical, and environmental variables on power systems. When the dataset is heavily zero-inflated, as power outage datasets often are, classical data mining methods that assume a relatively balanced number of zero and non-zero responses may fail. General accuracy metrics also become misleading because they are dominated by the prevalent zero-valued responses in the dataset. We develop a new framework that operates in three stages by separating the prediction of whether power outages will occur from the prediction of the number of customers without power. In the first stage, the zero-inflation problem is handled via a series of binary classifications. In the second stage, the severity of outages is predicted using clustering techniques. In the final stage, regression models estimate the number of customers without power. We introduce a weighted accuracy metric and investigate its benefits over mean absolute error. We validate the models with data from hurricanes Ivan (2004), Dennis (2005), and Katrina (2005), and then predict power outages associated with hurricanes Matthew (2016) and Irma (2017) in the central Gulf region. The results demonstrate improvement over traditional approaches in the context of power outage prediction.
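
The point about misleading accuracy metrics can be illustrated with a minimal sketch. The abstract does not define the weighted accuracy metric, so the class-weighted accuracy below (the mean of per-class recall) is only an assumed stand-in, and the 95/5 split of outage-free and outage-affected locations is hypothetical data chosen for illustration.

```python
from collections import Counter


def plain_accuracy(y_true, y_pred):
    """Fraction of all observations predicted correctly."""
    hits = sum(t == p for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


def class_weighted_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally.

    Assumed stand-in for illustration; not the metric defined in the paper.
    """
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Hypothetical zero-inflated labels: 95 locations without an outage (0)
# and 5 locations with an outage (1).
y_true = [0] * 95 + [1] * 5

# A trivial model that never predicts an outage.
always_zero = [0] * 100

print(plain_accuracy(y_true, always_zero))           # 0.95
print(class_weighted_accuracy(y_true, always_zero))  # 0.5
```

Under these assumptions, the trivial predictor looks strong on plain accuracy but scores no better than chance once both classes are weighted equally. That gap is the kind of distortion a weighted metric is meant to expose.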
