Statistical analysis of coverage error in simple global temperature estimators

Background
Global mean surface temperature is widely used in the climate literature as a measure of the impact of human activity on the climate system. While the concept of a spatial average is simple, estimating that average from spatially incomplete data is not. Correlation between nearby map grid cells means that missing data cannot simply be ignored. Estimators that (often implicitly) assume uncorrelated observations can be biased when naively applied to the observed data; in particular, the commonly used area weighted average is a biased estimator under these circumstances. Some surface temperature products use different forms of infilling or imputation to estimate temperatures for regions distant from the nearest observation, but the effects of such methods on the estimated global mean are not necessarily obvious, nor are the methods themselves necessarily unbiased. This issue was addressed in the 1970s by Ruvim Kagan. His work has not been widely adopted, possibly because of its complexity and its dependence on subjective choices in estimating the dependence between geographically proximate observations.

Objectives
The aim of this work is to present a simple estimator for global mean surface temperature from spatially incomplete data which retains many of the benefits of the work of Kagan, while being fully specified by two equations and a single parameter. The main purpose of the simplified estimator is to explain more clearly to users of temperature data the problems associated with estimating an unbiased global mean from spatially incomplete data. The estimator may also be useful for problems with specific requirements for reproducibility and performance.

Methods
The new estimator is based on generalized least squares, and uses the correlation matrix of the observations to weight each observation in accordance with the independent information it contributes. It can be implemented in fewer than 20 lines of computer code. The performance of the estimator is evaluated for different levels of observational coverage using reanalysis data with artificial noise.

Results
For recent decades the generalized least squares estimator mitigates most of the error associated with the use of a naive area weighted average. The improvement arises because coverage bias in the historical temperature record does not stem from an absolute shortage of observations (at least for recent decades), but rather from spatial heterogeneity in the distribution of observations, with some regions being relatively undersampled and others oversampled. The estimator addresses this problem by reducing the weight of the oversampled regions, in contrast to some existing global temperature datasets which extrapolate temperatures into the unobserved regions. The results are almost identical to those obtained by using kriging (Gaussian process interpolation) to impute missing data to global coverage and then taking an area weighted average of the resulting field. However, the new formulation allows direct diagnosis of the contribution of individual observations and sources of error.

Conclusions
More sophisticated solutions to the problem of missing data in global temperature estimation already exist. However, the simple estimator presented here, and the error analysis that it enables, demonstrate why such solutions are necessary. The 2013 Fifth Assessment Report of the Intergovernmental Panel on Climate Change discussed a slowdown in warming for the period 1998-2012, quoting the trend in the HadCRUT4 historical temperature dataset, produced by the United Kingdom Meteorological Office in collaboration with the Climatic Research Unit of the University of East Anglia, along with other records. Use of the new estimator for global mean surface temperature would have reduced the apparent slowdown in warming of the early 21st century by one third in the spatially incomplete HadCRUT4 product.
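
The weighting idea described under Methods can be illustrated with a minimal sketch. This is not the authors' code: it assumes a constant-mean field, a correlation that decays exponentially with great-circle distance, and distinct observation locations. The function names and the default length scale (`length_scale_km=1000.0`) are hypothetical choices made for illustration only, and no area weighting of the target field is attempted. Under these assumptions the generalized least squares weights are w = C⁻¹1 / (1ᵀC⁻¹1), and the estimate of the mean is wᵀT.

```python
import numpy as np


def great_circle_distance_km(lat, lon, radius_km=6371.0):
    """Pairwise great-circle distances (km) for points given in degrees."""
    phi = np.radians(np.asarray(lat, dtype=float))
    lam = np.radians(np.asarray(lon, dtype=float))
    # Unit vectors on the sphere; their dot product is the cosine of the angle.
    xyz = np.column_stack(
        [np.cos(phi) * np.cos(lam), np.cos(phi) * np.sin(lam), np.sin(phi)]
    )
    cos_angle = np.clip(xyz @ xyz.T, -1.0, 1.0)
    dist = radius_km * np.arccos(cos_angle)
    np.fill_diagonal(dist, 0.0)  # remove rounding error on the diagonal
    return dist


def gls_weights(lat, lon, length_scale_km=1000.0):
    """GLS weights w = C^-1 1 / (1^T C^-1 1) for a constant-mean field.

    The exponential correlation model and the length scale are assumptions
    of this sketch, not values taken from the paper.
    """
    corr = np.exp(-great_circle_distance_km(lat, lon) / length_scale_km)
    raw = np.linalg.solve(corr, np.ones(corr.shape[0]))
    return raw / raw.sum()


def gls_mean(temps, lat, lon, length_scale_km=1000.0):
    """Weighted mean of the observed temperatures using the GLS weights."""
    return float(gls_weights(lat, lon, length_scale_km) @ np.asarray(temps))


# Three closely spaced observations and one remote observation.
lat = np.array([50.0, 50.5, 51.0, -30.0])
lon = np.array([0.0, 0.5, 1.0, 120.0])
temps = np.array([0.8, 0.9, 0.7, 0.2])

print(gls_weights(lat, lon))      # the clustered points share their weight
print(gls_mean(temps, lat, lon))
```

In this toy example the three clustered observations are strongly correlated with each other, so each receives less weight than the isolated one. This mirrors the down-weighting of oversampled regions described under Results.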
