Fuzzy clustering of time series using extremes

Abstract: In this study we explore grouping time series with similar seasonal patterns by combining extreme value analysis with fuzzy clustering. The input features for the fuzzy clustering methods are parameter estimates of time-varying location, scale and shape, obtained by fitting the generalised extreme value (GEV) distribution to the annual maxima or to the r-largest order statistics per year of each time series. An innovative contribution of the study is the development of new generalised fuzzy clustering procedures that take weights into account, and the derivation of iterative solutions based on the GEV parameter estimators. Simulation studies conducted to evaluate the methods reveal good performance. The methods are applied to a set of daily sea-level time series from around the coast of Australia, where the identified clusters are well validated and can be meaningfully interpreted.
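As a rough illustration of the clustering step, the sketch below runs the standard (Bezdek) fuzzy c-means iteration on feature vectors of GEV parameter estimates. It is not the weighted generalised procedure developed in the study. The function name `fuzzy_c_means`, the fuzziness parameter `m = 2`, the choice of initial centroids, and the six (location, scale, shape) triples are all hypothetical, made up for illustration. In practice the estimates would come from fitting the GEV distribution to each series first.

```python
def fuzzy_c_means(points, init_centroids, m=2.0, n_iter=100, tol=1e-9):
    """Standard fuzzy c-means; returns (memberships, centroids)."""
    centroids = [list(c) for c in init_centroids]
    n_features = len(points[0])
    memberships = []
    for _ in range(n_iter):
        # Membership update: u_ik is proportional to d_ik^(-2/(m-1)),
        # normalised so that each row sums to 1.
        memberships = []
        for x in points:
            d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centroids]
            if min(d2) == 0.0:
                # Point coincides with a centroid: full membership there.
                hits = [1.0 if d == 0.0 else 0.0 for d in d2]
                total = sum(hits)
                memberships.append([h / total for h in hits])
                continue
            inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
            total = sum(inv)
            memberships.append([v / total for v in inv])
        # Centroid update: mean of the points weighted by u_ik^m.
        new_centroids = []
        for k in range(len(centroids)):
            w = [u[k] ** m for u in memberships]
            sw = sum(w)
            new_centroids.append(
                [sum(wi * x[j] for wi, x in zip(w, points)) / sw
                 for j in range(n_features)]
            )
        shift = max(
            abs(a - b)
            for old, new in zip(centroids, new_centroids)
            for a, b in zip(old, new)
        )
        centroids = new_centroids
        if shift < tol:
            break
    return memberships, centroids


# Hypothetical (location, scale, shape) estimates for six series.
features = [
    (1.0, 0.20, -0.10), (1.1, 0.22, -0.12), (0.9, 0.18, -0.08),
    (2.0, 0.50, 0.05), (2.1, 0.52, 0.07), (1.9, 0.48, 0.03),
]
memberships, centroids = fuzzy_c_means(features, [features[0], features[3]])
for row in memberships:
    print([round(u, 3) for u in row])
```

Each printed row gives one series' degree of membership in the two clusters. Because the features are on different scales, standardising them before clustering may be worth considering.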
