Clustering of Cyclostationary Signals with Applications to Climate Station Sitings, Eliminations, and Merges

This paper considers methods to classify and discriminate multidimensional cyclostationary climatological time series. The methods account for both the seasonal mean cycles and the random behavior of the series by simultaneously considering the means and all component-by-component autocovariances. This improves on classical Hotelling T² statistics, which classify through mean changes only, and on constant-mean speech methods, which classify exclusively through sample autocovariance differences. Here, two series are compared by assuming that both follow the same time series model; from this assumption, a test statistic representing a distance between the two series is developed from linear prediction theory. For Gaussian data, this construction yields a level-α test statistic that can be used to assess how different the two series are. The derived distances can then be used in a clustering algorithm, e.g., to group series with similar behavior. Such information is useful for eliminating or merging climate stations whose data are redundant with those of another station, or for optimally locating a collection of stations. The techniques are first tested on simulated series with known structures and then applied to 11 two-dimensional series in the National Oceanic and Atmospheric Administration's data buoy catalog. Natural clusters emerge that are geographically realistic. Specifically, the methods were able to perfectly group stations in the Gulf of Mexico, the Carolina Coast, the Pacific Ocean, and offshore New England.
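
The distance-then-cluster workflow described above can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not give the linear-prediction test statistic, so the sketch substitutes a simple stand-in distance (the Euclidean gap between per-season sample means and standard deviations) and works with univariate series only. The period of 12, the function names, the simulated cosine seasonal cycle, and the choice of average-linkage clustering are all assumptions made for illustration.

```python
import math
import random

PERIOD = 12  # assumed season length (e.g., monthly data); not from the paper


def seasonal_features(series, period=PERIOD):
    """Per-season sample means followed by per-season standard deviations."""
    means, sds = [], []
    for s in range(period):
        vals = series[s::period]  # all observations falling in season s
        m = sum(vals) / len(vals)
        var = sum((v - m) ** 2 for v in vals) / (len(vals) - 1)
        means.append(m)
        sds.append(math.sqrt(var))
    return means + sds


def distance(x, y):
    """Stand-in distance (NOT the paper's statistic): Euclidean gap
    between the seasonal feature vectors of two series."""
    fx, fy = seasonal_features(x), seasonal_features(y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(fx, fy)))


def average_linkage(dist, k):
    """Agglomerative clustering: merge the closest pair until k clusters remain."""
    clusters = [[i] for i in range(len(dist))]

    def gap(a, b):
        # Average pairwise distance between members of clusters a and b.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        pairs = [(p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters))]
        p, q = min(pairs, key=lambda pq: gap(clusters[pq[0]], clusters[pq[1]]))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return sorted(sorted(c) for c in clusters)


def simulate(amplitude, noise_sd, years, rng):
    """Hypothetical test series: cosine seasonal mean plus Gaussian noise."""
    return [amplitude * math.cos(2 * math.pi * (t % PERIOD) / PERIOD)
            + rng.gauss(0, noise_sd)
            for t in range(years * PERIOD)]


rng = random.Random(0)
# Three "stations" with a weak seasonal cycle, three with a strong one.
series = ([simulate(1.0, 0.5, 20, rng) for _ in range(3)]
          + [simulate(5.0, 0.5, 20, rng) for _ in range(3)])
dist = [[distance(a, b) for b in series] for a in series]
clusters = average_linkage(dist, k=2)
print(clusters)
```

With seasonal amplitudes this far apart relative to the noise level, the two simulated groups should separate cleanly. In practice, the stand-in distance would be replaced by the test statistic developed in the paper, which also accounts for autocovariances across components and lags.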
