Power distribution system equipment failure identification using machine learning algorithms

This paper explores an approach for identifying faults caused by equipment failure in power distribution systems. The task is treated as a binary classification problem in which outages are categorized into two classes: equipment failure and non-equipment failure. The study uses actual outage data collected by Duke Energy. First, the variables that contribute to equipment failures are described and their relationships are examined. Next, class imbalance, a common issue in outage data sets, is addressed. Then, to ensure that all features are relevant, their importance is examined using a novel feature selection algorithm. Finally, three classification algorithms, namely a decision tree, logistic regression, and a naive Bayesian classifier, are trained and tested, and their performance is evaluated.
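To illustrate why class imbalance matters for this kind of binary outage classification, the following is a minimal sketch using only the Python standard library. It is not the authors' method: the abstract does not say which rebalancing technique is used, so plain random oversampling is assumed here for illustration. The toy records, the 90/10 class split, the label encoding (1 = equipment failure, 0 = non-equipment failure), and the function names are all hypothetical.

```python
import random
from collections import Counter


def oversample_minority(rows, labels, seed=0):
    """Randomly duplicate minority-class rows until both classes are equal in size.

    Assumption: random oversampling stands in for whatever rebalancing
    technique the paper actually applies.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two classes")
    (_, n_major), (minority, n_minor) = counts.most_common(2)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    # Draw (with replacement) as many extra minority rows as needed to balance.
    extra = [rng.choice(minority_idx) for _ in range(n_major - n_minor)]
    return (list(rows) + [rows[i] for i in extra],
            list(labels) + [labels[i] for i in extra])


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive):
    """Fraction of true `positive` cases that were predicted as `positive`."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


# Hypothetical toy data: 90 non-equipment outages (0), 10 equipment failures (1).
# The two feature columns are placeholders, not variables from the paper.
rows = [(i, i % 7) for i in range(100)]
labels = [0] * 90 + [1] * 10

# A classifier that always predicts the majority class looks accurate
# but never detects an equipment failure.
always_zero = [0] * len(labels)
print("accuracy:", accuracy(labels, always_zero))
print("recall on equipment failures:", recall(labels, always_zero, positive=1))

# After oversampling, both classes have the same number of training rows.
balanced_rows, balanced_labels = oversample_minority(rows, labels)
print("class counts after oversampling:", Counter(balanced_labels))
```

In this toy setting the majority-class baseline reaches high accuracy while its recall on equipment failures is zero, which is why accuracy alone can be a misleading measure on imbalanced outage data. Oversampling would normally be applied to the training split only, so that duplicated rows do not leak into the test set.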
