Anomaly Construction in Climate Data: Issues and Challenges

Earth science data contains a strong seasonal component, visible as repeating cycles in climate variables such as air pressure, temperature and precipitation. Seasonality is the strongest signal in this data, so in order to find other patterns it is removed by subtracting, for each calendar month, the mean of the raw values for that month. However, because raw data such as air temperature and pressure is continuously generated from satellite observations, climate scientists usually compute these means over a moving reference base interval spanning some years of raw data, and then study changes in the resulting anomaly time series relative to that base. In this paper, we evaluate different measures for base computation and show how an arbitrary choice of base can skew the results and produce a favorable outcome that is not necessarily valid. We perform a detailed study of different base selection criteria and base periods to show that the outcome of data mining can be sensitive to the choice of base. We present a case study of the dipole in the Sahel region to illustrate the bias that the choice of base introduces into the results. Finally, we propose a generalized model for base selection that uses Monte Carlo based methods to minimize the expected variance in the anomaly time series of the underlying datasets. Our research can help climate scientists and researchers in the temporal domain choose a base that does not bias their results.
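As a minimal sketch of the anomaly construction described above, the following Python snippet subtracts a per-month mean computed over a chosen base interval and compares two base choices. The function name `monthly_anomalies`, the synthetic series, and the two base periods are assumptions made for illustration only; they are not the paper's datasets or its proposed base selection model.

```python
import numpy as np

def monthly_anomalies(series, base_start, base_end):
    """Remove seasonality by subtracting per-month means over a base period.

    series: 1-D monthly values starting in January, length a multiple of 12.
    base_start, base_end: 0-based year indices (end exclusive) of the base.
    Both the name and the signature are hypothetical, for illustration.
    """
    values = np.asarray(series, dtype=float).reshape(-1, 12)  # rows: years, cols: months
    climatology = values[base_start:base_end].mean(axis=0)    # one mean per calendar month
    return (values - climatology).ravel()

# Synthetic data (an assumption): a seasonal cycle plus a slow linear trend.
n_years = 30
months = np.arange(n_years * 12)
raw = 10.0 * np.sin(2 * np.pi * months / 12) + 0.01 * months

early = monthly_anomalies(raw, 0, 10)   # base: first 10 years
late = monthly_anomalies(raw, 20, 30)   # base: last 10 years

# The two anomaly series differ only by the difference in base climatologies.
offset = early - late
print(offset.min(), offset.max())
```

In this synthetic case the two anomaly series differ by a constant offset, because the trend raises the mean of the later base period. Any analysis that thresholds anomalies or compares their sign would therefore depend on which base was chosen, which is the sensitivity the abstract refers to.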
