Clustering of wind resource data for the South African renewable energy development zones

This study investigates clustering as a means of reducing spatio-temporal wind speed data to statistically representative classes of temporal profiles for further processing and interpretation. The clustering methodologies are applied to the high-resolution, spatio-temporal, meso-scale renewable energy resource dataset produced for Southern Africa by the Council for Scientific and Industrial Research. This dataset covers thousands of coordinates and is therefore computationally challenging to process. It can be reduced by applying clustering techniques that classify the temporal wind speed profiles into categories with similar statistical properties. Six clustering algorithms are compared in terms of their performance on large wind resource datasets: k-means, partitioning around medoids, the clustering large applications algorithm, agglomerative clustering, the divisive analysis algorithm and fuzzy c-means clustering. Two distance measures are considered, namely the Euclidean distance and the Pearson correlation distance. The validation metrics evaluated in the investigation include the silhouette coefficient, the Calinski-Harabasz index and the Dunn index. Case study results are presented for the Komsberg Renewable Energy Development Zone, located in the Western Cape, South Africa. This zone is selected because its temporal wind speed profiles exhibit a high mean wind speed and a large standard deviation. The effects of seasonal variation in the temporal wind speed profiles are considered by partitioning the input dataset in accordance with the low and high demand seasons defined by the Megaflex Time of Use tariff. The clustered wind resource maps produced by the proposed methodology represent a valuable input dataset for further studies, such as siting and the optimal geographical allocation of wind generation capacity, aimed at reducing the variability and ramping effects that are inherent to wind energy.
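As an illustration of the two distance measures and one of the validation metrics named above, the following is a minimal sketch in plain Python. It is not the implementation used in the study; the function names, the toy wind speed profiles and the cluster labels are hypothetical and chosen only to show how the Euclidean distance and the Pearson correlation distance can rank the same profiles differently.

```python
import math

def euclidean_distance(x, y):
    """Euclidean distance between two equal-length temporal profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def pearson_distance(x, y):
    """Pearson correlation distance, 1 - r, which lies in [0, 2].
    Assumes neither profile is constant (non-zero standard deviation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)

def silhouette_coefficient(profiles, labels, dist):
    """Mean silhouette width over all profiles for a given labelling.
    Assumes at least two distinct cluster labels."""
    scores = []
    for i, p in enumerate(profiles):
        # Collect distances from profile i to every other profile, per cluster.
        by_cluster = {}
        for j, q in enumerate(profiles):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(dist(p, q))
        own = by_cluster.get(labels[i])
        if not own:
            # Singleton cluster: the silhouette width is taken as 0.
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean intra-cluster distance
        b = min(sum(d) / len(d)  # mean distance to the nearest other cluster
                for lab, d in by_cluster.items() if lab != labels[i])
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical toy data: four short wind speed profiles (m/s) at four sites.
profiles = [
    [4.0, 6.0, 8.0, 6.0],
    [5.0, 7.0, 9.0, 7.0],  # same shape as site 0, offset by 1 m/s
    [8.0, 6.0, 4.0, 6.0],
    [9.0, 7.0, 5.0, 7.0],  # same shape as site 2, offset by 1 m/s
]
labels = [0, 0, 1, 1]  # assumed cluster assignment, for illustration only

sil_euclid = silhouette_coefficient(profiles, labels, euclidean_distance)
sil_pearson = silhouette_coefficient(profiles, labels, pearson_distance)
```

In this toy case the Pearson correlation distance ignores the constant 1 m/s offset between sites with the same temporal shape, whereas the Euclidean distance does not, so the same labelling scores a higher silhouette coefficient under the correlation-based measure.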
