Binding data mining and expert knowledge for one-day-ahead prediction of hourly global solar radiation

Abstract: This paper proposes a new methodology for predicting hourly global solar radiation one day ahead. Such forecasts are useful in many practical settings; in energy-market decision making, for instance, they are essential for the correct integration of grid-connected photovoltaic systems. The methodology incorporates the knowledge of different experts into the data mining process in order to obtain improved data-driven models. The modelling phase, in which models are induced and new patterns can be identified, is the phase that benefits most from this expert knowledge. Here, this is achieved by combining clustering, regression and classification methods that exploit meteorological data, either directly measured or predicted by weather services. The resulting models have been embedded in a prediction system that provides reliable forecasts of next-day hourly global solar radiation. The automatic learning process, informed by the knowledge of different experts, identified 14 types of day based on the shape of the hourly solar radiation profile throughout the day. This proposal updates the conventional definitions of day types, which usually consider 4 options. The next-day prediction of hourly global radiation is obtained in two phases: first, the next-day type is predicted from among the 14 possible types of day; second, the hourly global radiation values are obtained from the centroid of the predicted type of day and the extraterrestrial solar radiation. The relative root mean square error of the prediction model is below 20%, a significant reduction compared to previous models. Moreover, the proposed models can be regarded as falling within the scope of eXplainable Artificial Intelligence.
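
The second prediction phase and the error metric can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: each day-type centroid is taken to be an hourly profile of the ratio of global to extraterrestrial radiation, so the forecast is their hour-by-hour product, and the relative RMSE is taken to be the RMSE divided by the mean observed value. The day-type names, the centroid values and the function names are hypothetical, and the predicted type is hard-coded where a trained classifier would normally supply it.

```python
import math

# Hypothetical centroids: hourly ratio of global to extraterrestrial radiation
# (assumed representation; the abstract does not define the centroid's units).
CENTROIDS = {
    "clear": [0.50, 0.70, 0.75, 0.70, 0.50],
    "overcast": [0.10, 0.20, 0.25, 0.20, 0.10],
}


def predict_hourly_radiation(centroid, extraterrestrial):
    """Scale a day-type centroid by extraterrestrial radiation, hour by hour."""
    if len(centroid) != len(extraterrestrial):
        raise ValueError("centroid and extraterrestrial series differ in length")
    return [ratio * g0 for ratio, g0 in zip(centroid, extraterrestrial)]


def relative_rmse(predicted, observed):
    """Return RMSE as a percentage of the mean observed value (assumed definition)."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mse = sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    mean_observed = sum(observed) / len(observed)
    return 100.0 * math.sqrt(mse) / mean_observed


# Phase 1 (stand-in): a classifier would output the next-day type.
predicted_type = "clear"

# Phase 2: combine the centroid with extraterrestrial radiation (illustrative values).
extraterrestrial = [200.0, 600.0, 1000.0, 600.0, 200.0]
forecast = predict_hourly_radiation(CENTROIDS[predicted_type], extraterrestrial)

# Compare against illustrative (made-up) observations.
observed = [110.0, 400.0, 760.0, 430.0, 90.0]
error_percent = relative_rmse(forecast, observed)
```

Keeping the centroid as a normalized profile would let the same 14 day types be reused across seasons, since the seasonal variation enters only through the extraterrestrial term; whether the paper normalizes in exactly this way is not stated in the abstract.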
