A missing data treatment method for photovoltaic installations

The rapid deployment of photovoltaic (PV) systems makes data processing a challenging task, and one that becomes even more important as the number of intermittent PV systems grows, especially in distribution networks. In many cases, the data contain inconsistencies, i.e., missing, incomplete, or grossly erroneous values. This paper proposes a method for missing data completion built on machine learning algorithms. The original incomplete PV generation time series are filled in and restored with low error. The proposed method can be readily applied in real installations with high data collection rates and storage needs.
