A Novel Method of Statistical Line Loss Estimation for Distribution Feeders Based on Feeder Cluster and Modified XGBoost

Estimating the losses of distribution feeders provides essential guidance for the planning, design, and operation of a distribution system. This paper proposes a novel method for estimating the statistical line loss of distribution feeders that combines a feeder clustering technique with a modified eXtreme Gradient Boosting (XGBoost) algorithm, using feeder characteristic data collected in the smart power distribution and utilization system. To improve the applicability and accuracy of the estimation model, a k-medoids algorithm with a weighted distance is proposed for clustering distribution feeders. A variable selection method for the clustering step is also discussed, taking into account the correlation and validity of the variables. The XGBoost algorithm is then modified by adding to the loss function a penalty term that accounts for the effect of the theoretical value, so that it can be used to estimate the statistical line loss of distribution feeders. The proposed methodology is validated on 762 distribution feeders in the Shanghai distribution system. The results show that the XGBoost method is more accurate than decision tree, neural network, and random forest models when compared on the Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE), and Absolute Percentage Error (APE) indexes. In particular, incorporating the theoretical value significantly improves the reasonableness of the estimated results.
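
The idea of adding a penalty on the theoretical value to the boosting loss can be illustrated with a minimal sketch. The abstract does not give the form of the penalty, so the version below assumes a simple quadratic one: for prediction p, measured statistical loss y, and theoretical loss t, the per-sample loss is 0.5*(p - y)^2 + 0.5*lam*(p - t)^2. The function name `penalized_grad_hess`, the weight `lam`, and the sample values are hypothetical and chosen only for illustration. The function returns the first and second derivatives with respect to the prediction, which is the pair of quantities a gradient boosting custom objective typically supplies.

```python
from typing import List, Sequence, Tuple


def penalized_grad_hess(
    preds: Sequence[float],
    labels: Sequence[float],
    theoretical: Sequence[float],
    lam: float = 0.5,
) -> Tuple[List[float], List[float]]:
    """Derivatives of an ASSUMED penalized squared loss (not the paper's exact form).

    loss_i = 0.5 * (p_i - y_i)**2 + 0.5 * lam * (p_i - t_i)**2
    where p_i is the prediction, y_i the measured statistical line loss,
    and t_i the theoretical line loss of feeder i.
    """
    # First derivative: data-fit term plus the pull toward the theoretical value.
    grad = [(p - y) + lam * (p - t) for p, y, t in zip(preds, labels, theoretical)]
    # Second derivative is constant for this quadratic loss.
    hess = [1.0 + lam for _ in preds]
    return grad, hess


# Hypothetical feeder: current prediction 2.0, measured loss 1.0, theoretical loss 3.0.
grad, hess = penalized_grad_hess([2.0], [1.0], [3.0], lam=0.5)

# One Newton step moves the prediction to the minimizer (y + lam * t) / (1 + lam).
newton_pred = 2.0 - grad[0] / hess[0]
```

With `lam = 0` the sketch reduces to the ordinary squared-error objective; larger values of `lam` pull the estimate more strongly toward the theoretical value, which is one way a penalty of this kind could keep estimates physically plausible.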
