The distribution of Kendall's tau for testing the significance of cross-correlation in persistent data

Abstract

Kendall's tau (τ) has been widely used as a distribution-free measure of cross-correlation between two variables. It has been previously shown that persistence in the two variables inflates the variance of τ. In this paper, the full null distribution of Kendall's τ for persistent data with multivariate Gaussian dependence is derived, and an approximation to the full distribution is proposed. The effect of deviations from the multivariate Gaussian dependence model on the distribution of τ is also investigated. As a demonstration, the temporal consistency and field significance of the cross-correlation between the Northern Hemisphere (NH) temperature time series for the period 1850–1995 and a set of 784 NH tree-ring width (TRW) proxies, together with 105 NH tree-ring maximum latewood density (MXD) proxies, are studied. When persistence is ignored, the original Mann-Kendall test gives temporally inconsistent results between the early half (1850–1922) and the late half (1923–1995) of the record. These temporal inconsistencies are largely eliminated when persistence is accounted for, indicating that a large portion of the identified cross-correlations are spurious. Furthermore, the modified test, used in combination with a field significance test that is robust to spatial correlation, indicates the absence of field-significant cross-correlation in both halves of the record. These results have serious implications for the use of tree-ring data as temperature proxies, and emphasize the importance of using the correct distribution of Kendall's τ to avoid overestimating the significance of cross-correlation between data that exhibit significant persistence.

Citation: Hamed, K. H. (2011) The distribution of Kendall's tau for testing the significance of cross-correlation in persistent data. Hydrol. Sci. J. 56(5), 841–853.
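The variance inflation described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the derivation from the paper: it assumes a simple AR(1) model as a stand-in for persistence, and the function names `ar1` and `tau_variance` are hypothetical helpers introduced only for this illustration. The sketch compares the simulated variance of τ between two mutually independent series with the classical variance for serially independent data, Var(τ) = 2(2n + 5) / (9n(n − 1)).

```python
import numpy as np
from scipy.stats import kendalltau


def ar1(n, phi, rng):
    """Stationary, unit-variance AR(1) series (an assumed persistence model)."""
    x = np.empty(n)
    x[0] = rng.standard_normal()
    # Scale the innovations so the marginal variance stays at 1.
    eps = rng.standard_normal(n) * np.sqrt(1.0 - phi**2)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + eps[t]
    return x


def tau_variance(n, phi, n_sim=500, seed=0):
    """Monte Carlo variance of Kendall's tau between two independent AR(1) series."""
    rng = np.random.default_rng(seed)
    taus = np.empty(n_sim)
    for i in range(n_sim):
        x = ar1(n, phi, rng)
        y = ar1(n, phi, rng)  # generated independently of x: the null hypothesis holds
        taus[i], _ = kendalltau(x, y)
    return taus.var()


def tau_variance_iid(n):
    """Classical null variance of tau for serially independent data."""
    return 2.0 * (2 * n + 5) / (9.0 * n * (n - 1))


if __name__ == "__main__":
    n = 50
    print("classical variance      :", tau_variance_iid(n))
    print("simulated, phi = 0.0    :", tau_variance(n, 0.0))
    print("simulated, phi = 0.7    :", tau_variance(n, 0.7))
```

With no persistence (`phi = 0`) the simulated variance should sit close to the classical value, while a persistent pair of series should give a visibly larger variance, so a test that relies on the classical variance would reject the null hypothesis too often.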
