Paper Information - Load Forecasting of Power SCADA Based on Spark MLlib

Load Forecasting of Power SCADA Based on Spark MLlib

To improve the accuracy and speed of power forecasting in power SCADA systems, a distributed real-time streaming forecasting model is designed based on the K-means and Random Forest algorithms in the Spark machine learning library (MLlib). The model uses a sliding-window mechanism to segment the incoming data stream. K-means clustering is used to correct abnormal data, after which Random Forest regression forecasting is performed. The model's algorithms are implemented on Spark RDDs, and their performance is verified by sending data through a daemon process that simulates a message queue. The results show that the forecasting accuracy of the algorithm is superior to that of traditional serial Random Forest forecasting and that the model satisfies the real-time requirement.

Keywords: Spark; decision tree; random forest; K-means
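The two preprocessing steps named in the abstract, sliding-window segmentation and K-means correction of abnormal data, can be illustrated with a minimal plain-Python sketch. This is not the authors' Spark implementation, and it omits the Random Forest regression step. The function names, the window parameters, and in particular the correction rule (readings that fall into an under-populated cluster are replaced by the nearest well-populated centroid) are assumptions made for illustration; the abstract does not say how abnormal points are identified or corrected.

```python
from statistics import mean


def sliding_windows(stream, size, step):
    """Yield windows of `size` readings, advancing `step` readings each time.

    Assumes 1 <= step <= size (hypothetical parameters, not from the paper).
    """
    buf = []
    for reading in stream:
        buf.append(reading)
        if len(buf) == size:
            yield list(buf)
            del buf[:step]


def kmeans_1d(values, k, iters=50):
    """Lloyd's algorithm on scalar readings; returns (centroids, labels)."""
    ordered = sorted(values)
    # Seed centroids at evenly spaced quantiles so the run is deterministic.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(iters):
        # Assignment step: each reading goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(mean(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


def correct_window(window, k=2, min_frac=0.2):
    """Assumed correction rule: clusters holding fewer than `min_frac` of the
    window are treated as abnormal, and their readings are replaced by the
    nearest well-populated centroid."""
    centroids, labels = kmeans_1d(window, k)
    normal = [c for c in range(k) if labels.count(c) >= min_frac * len(window)]
    if not normal:
        return list(window)
    corrected = []
    for v, lab in zip(window, labels):
        if lab in normal:
            corrected.append(v)
        else:
            nearest = min(normal, key=lambda c: abs(v - centroids[c]))
            corrected.append(centroids[nearest])
    return corrected


if __name__ == "__main__":
    readings = [10.0, 10.2, 9.8, 10.1, 55.0, 9.9, 10.3, 10.0]
    for window in sliding_windows(readings, size=6, step=2):
        print(correct_window(window))
```

In a Spark setting, the windowing would typically be handled by the streaming layer and the clustering by MLlib's own K-means; the sketch only shows the data flow that feeds the regression model.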

Tao Lin | Chong Jiang
