An improved random forest model of short-term wind-power forecasting to enhance accuracy, efficiency, and robustness

Short-term wind-power forecasting methods such as neural networks are trained by empirical risk minimization. Local optima and overfitting are likely to occur during model training, which leads to poor inference and generalization ability in the prediction stage. To address this problem, this paper proposes a short-term wind-power forecasting model based on 2-stage feature selection and a supervised random forest (RF). First, in data preprocessing, redundant features are removed by a variable importance measure and closely related samples are selected by correlation analysis, which improves both the efficiency of model training and the degree of correlation between input and output samples. Second, an improved supervised RF methodology is proposed that composes a new RF by evaluating the performance of each decision tree and restructuring the decision trees accordingly. A new external validation index, correlated with the wind speed in numerical weather prediction, is proposed to overcome the shortcoming of the internal validation index, which depends heavily on the training samples. Simulation examples verify the rationality and feasibility of the improvement. Case studies using measured data from a wind farm show that the proposed model outperforms the original RF, back propagation neural network, Bayesian network, and support vector machine in accuracy, efficiency, and robustness, especially when the historical data contain a high rate of noisy data and wind power curtailment periods.
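The idea of evaluating each decision tree and restructuring the forest can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not define the external validation index, so the sketch assumes a plain RMSE on held-out validation samples as a stand-in, and it assumes that restructuring means keeping the best-scoring fraction of trees. The names `score_trees`, `restructure_forest`, `forest_predict`, and `keep_fraction`, as well as the toy trees and data, are hypothetical.

```python
from math import sqrt
from statistics import mean


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences."""
    return sqrt(mean((p - a) ** 2 for p, a in zip(predicted, actual)))


def score_trees(trees, X_val, y_val):
    """Score every tree on external validation samples.

    Assumption: RMSE stands in for the paper's external validation index.
    Returns (error, tree_index) pairs sorted from best to worst.
    """
    scores = []
    for index, tree in enumerate(trees):
        predictions = [tree(x) for x in X_val]
        scores.append((rmse(predictions, y_val), index))
    return sorted(scores)


def restructure_forest(trees, X_val, y_val, keep_fraction=0.5):
    """Compose a new forest from the best-scoring fraction of trees.

    Assumption: 'restructuring' is modelled as simple pruning.
    """
    ranked = score_trees(trees, X_val, y_val)
    n_keep = max(1, int(len(trees) * keep_fraction))
    return [trees[index] for _, index in ranked[:n_keep]]


def forest_predict(trees, x):
    """Average the predictions of all trees, as in a regression forest."""
    return mean(tree(x) for tree in trees)


# Toy stand-ins for trained regression trees: each maps a feature tuple
# (here only a forecast wind speed) to a power value.
toy_trees = [
    lambda x: 2.0 * x[0],        # tracks the toy target exactly
    lambda x: 2.0 * x[0] + 0.1,  # small bias
    lambda x: 0.0,               # behaves as if fitted to curtailed output
    lambda x: 5.0 * x[0],        # behaves as if fitted to noisy samples
]
X_val = [(1.0,), (2.0,), (3.0,)]
y_val = [2.0, 4.0, 6.0]

pruned_forest = restructure_forest(toy_trees, X_val, y_val, keep_fraction=0.5)
full_prediction = forest_predict(toy_trees, (4.0,))
pruned_prediction = forest_predict(pruned_forest, (4.0,))
```

In this toy setup the two trees that mimic curtailed and noisy training data score worst on the validation samples and are dropped, so the pruned forest's average moves closer to the toy target than the full forest's average. Scoring on samples outside the training set is the point of the sketch: an internal index computed from the training data would inherit the same noise and curtailment.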
