Analyzing big time series data in solar engineering using features and PCA

In solar engineering, we encounter big time series data, such as satellite-derived irradiance data and string-level measurements from utility-scale photovoltaic (PV) systems. While storing and hosting big data is certainly possible with today’s data storage technology, visualizing and analyzing the data effectively and efficiently remains challenging. In this work, we consider a data analytics algorithm that mitigates some of these challenges. The algorithm computes a set of generic and/or application-specific features to characterize each time series, and subsequently uses principal component analysis (PCA) to project these features onto a two-dimensional space. Because each time series is represented by its features, it can be treated as a single data point in the feature space, which makes many operations more tractable. Three applications are discussed within the overall framework, namely (1) PV system type identification, (2) monitoring network design, and (3) anomalous string detection. The proposed framework can be readily translated to many other solar engineering applications.
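The feature-then-PCA idea can be illustrated with a minimal sketch. The abstract does not list the features, so the four used here (mean, standard deviation, standard deviation of first differences, lag-1 autocorrelation) are assumptions chosen only for illustration, as are the function names `extract_features` and `project_2d` and the synthetic data. PCA is computed with a plain singular value decomposition of the standardized feature matrix.

```python
import numpy as np

def extract_features(series):
    """Summarize one time series as a small feature vector.

    The feature set is an illustrative assumption, not the paper's list.
    """
    x = np.asarray(series, dtype=float)
    diffs = np.diff(x)
    centered = x - x.mean()
    denom = np.dot(centered, centered)
    # Lag-1 autocorrelation; defined as 0 for a constant series.
    acf1 = np.dot(centered[:-1], centered[1:]) / denom if denom > 0 else 0.0
    return np.array([x.mean(), x.std(), diffs.std(), acf1])

def project_2d(feature_matrix):
    """Project rows (one per time series) onto the first two principal components."""
    F = np.asarray(feature_matrix, dtype=float)
    sd = F.std(axis=0)
    sd[sd == 0] = 1.0  # avoid division by zero for constant features
    Z = (F - F.mean(axis=0)) / sd  # standardize each feature
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:2].T

# Synthetic stand-in data: five smooth and five noisy series.
rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 200)
smooth = [np.sin(t) + 0.05 * rng.standard_normal(t.size) for _ in range(5)]
noisy = [np.sin(t) + 0.8 * rng.standard_normal(t.size) for _ in range(5)]

features = np.vstack([extract_features(s) for s in smooth + noisy])
scores = project_2d(features)  # shape (10, 2): one 2-D point per time series
```

Each row of `scores` is one time series reduced to a point in the plane, so tasks such as grouping similar series or flagging unusual ones can operate on these points rather than on the raw data.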
