A sandwich smoother for spatio-temporal functional data

Abstract

Statistical analysis of spatio-temporal data has been evolving to handle increasingly large data sets. For example, the North American CORDEX program is producing daily values of climate-related variables on spatial grids with approximately 100,000 locations over 150 years. Smoothing of such massive and noisy data is essential to understanding their spatio-temporal features. It also reduces the size of the data by representing them in terms of suitable basis functions, which facilitates further computations and statistical analysis. Traditional tensor-based methods break down under the size of such massive data. We develop a penalized spline method for representing such data using a generalization of the sandwich smoother proposed by Xiao et al. (2013). Unlike the original method, our generalization treats the spatial and temporal dimensions distinctly and allows the methodology to be directly applied to non-gridded data. We demonstrate the practicality of the methodology using both simulated and real data. The new smoother, as well as the original sandwich smoother, is implemented in the hero R package.
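
For orientation, the original sandwich smoother of Xiao et al. (2013) for data on a regular two-dimensional grid can be sketched as follows. This is only a minimal illustration in NumPy, assuming equally spaced knots, a cubic B-spline basis, and a second-order difference penalty. It is neither the generalization developed here nor the interface of the hero package, and the function names (`bspline_basis`, `pspline_smoother`, `sandwich_smooth`) and the simulated data are hypothetical.

```python
import numpy as np


def bspline_basis(x, n_basis, degree=3):
    """B-spline basis on equally spaced knots, evaluated at x (Cox-de Boor)."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    n_seg = n_basis - degree                    # intervals inside [lo, hi]
    h = (hi - lo) / n_seg
    # Extend the knots `degree` steps beyond each end of the data range.
    knots = lo + h * np.arange(-degree, n_seg + degree + 1)
    # Degree-0 basis: indicator of each knot interval.
    B = ((x[:, None] >= knots[None, :-1])
         & (x[:, None] < knots[None, 1:])).astype(float)
    for d in range(1, degree + 1):
        left = (x[:, None] - knots[None, :-d - 1]) / (knots[d:-1] - knots[:-d - 1])
        right = (knots[None, d + 1:] - x[:, None]) / (knots[d + 1:] - knots[1:-d])
        B = left * B[:, :-1] + right * B[:, 1:]
    return B                                    # shape (len(x), n_basis)


def pspline_smoother(x, n_basis, lam, degree=3, diff_order=2):
    """Univariate P-spline smoother matrix S = B (B'B + lam D'D)^{-1} B'."""
    B = bspline_basis(x, n_basis, degree)
    D = np.diff(np.eye(n_basis), n=diff_order, axis=0)   # difference penalty
    return B @ np.linalg.solve(B.T @ B + lam * D.T @ D, B.T)


def sandwich_smooth(Y, x1, x2, n_basis=(10, 10), lam=(1.0, 1.0)):
    """Smooth a gridded data matrix Y as S1 Y S2 (assumed fixed penalties)."""
    S1 = pspline_smoother(x1, n_basis[0], lam[0])
    S2 = pspline_smoother(x2, n_basis[1], lam[1])
    return S1 @ Y @ S2.T


# Simulated example (hypothetical data): a smooth surface plus Gaussian noise.
rng = np.random.default_rng(0)
x1 = np.linspace(0.0, 1.0, 60)
x2 = np.linspace(0.0, 1.0, 80)
truth = np.sin(2 * np.pi * x1)[:, None] * np.cos(2 * np.pi * x2)[None, :]
Y = truth + rng.normal(scale=0.3, size=truth.shape)
Y_hat = sandwich_smooth(Y, x1, x2)
```

The point of the construction is that the two small smoother matrices are applied to the rows and columns of the data matrix separately, so the full tensor-product design matrix is never formed. In this sketch the penalties are fixed by hand; a data-driven choice of the smoothing parameters is not shown.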

[1]  M. W. Qian,et al.  Coordinated Global and Regional Climate Modeling , 2016 .

[2]  E. Pebesma,et al.  Classes and Methods for Spatial Data , 2015 .

[3]  J. Minx,et al.  Climate Change 2014 : Synthesis Report , 2014 .

[4]  Roni Yagel,et al.  Hardware assisted volume rendering of unstructured grids by incremental slicing , 1996, Proceedings of 1996 Symposium on Volume Visualization.

[5]  G. Wahba Smoothing noisy data with spline functions , 1975 .

[6]  Ming-Jun Lai,et al.  Bivariate Penalized Splines for Regression , 2013 .

[7]  Ian H. Sloan,et al.  Wendland functions with increasing smoothness converge to a Gaussian , 2012, Adv. Comput. Math..

[8]  Karl E. Taylor,et al.  An overview of CMIP5 and the experiment design , 2012 .

[9]  Piotr Kokoszka,et al.  Evaluation of the cooling trend in the ionosphere using functional regression with incomplete curves , 2017 .

[10]  C. Reinsch Smoothing by spline functions , 1967 .

[11]  M. Durbán,et al.  Generalized linear array models with applications to multidimensional smoothing , 2006 .

[12]  D. Nychka,et al.  A Multiresolution Gaussian Process Model for the Analysis of Large Spatial Datasets , 2015 .

[13]  Richard A. L. Jones,et al.  The North American Regional Climate Change Assessment Program: Overview of Phase I Results , 2012 .

[14]  Edzer J. Pebesma,et al.  Applied Spatial Data Analysis with R - Second Edition , 2008, Use R!.

[15]  T. Masui,et al.  Variation factors of global cropland requirements from the IPCC special report on emissions scenarios (SRES) , 2009 .

[16]  Peter Craven,et al.  Smoothing noisy data with spline functions , 1978 .

[17]  James O. Ramsay,et al.  Functional Data Analysis , 2005 .

[18]  P. Kokoszka,et al.  Testing separability of space-time functional processes , 2015, 1509.07017.

[19]  John A. D. Aston,et al.  Smooth Principal Component Analysis over two-dimensional manifolds with an application to Neuroimaging , 2016, 1601.03670.

[20]  Spencer Graves,et al.  Functional Data Analysis with R and MATLAB , 2009 .

[21]  Chris Chatfield,et al.  Statistical Methods for Spatial Data Analysis , 2004 .

[22]  A. Weaver,et al.  The Canadian Centre for Climate Modelling and Analysis global coupled model and its climate , 2000 .

[23]  Holger Wendland,et al.  Piecewise polynomial, positive definite and compactly supported radial functions of minimal degree , 1995, Adv. Comput. Math..

[24]  Joshua P French,et al.  autoimage: Multiple Heat Maps for Projected Coordinates , 2017, R J..

[25]  Robert Tibshirani,et al.  The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition , 2001, Springer Series in Statistics.

[26]  S. Klein,et al.  The new GFDL global atmosphere and land model AM2-LM2: Evaluation with prescribed SST simulations , 2004 .

[27]  Leonhard Held,et al.  Gaussian Markov Random Fields: Theory and Applications , 2005 .

[28]  H. Künsch Gaussian Markov random fields , 1979 .

[29]  Paul H. C. Eilers,et al.  Twenty years of P-splines , 2015 .

[30]  Emily L. Kang,et al.  Spatio‐Temporal data fusion for massive sea surface temperature data from MODIS and AMSR‐E instruments , 2018, Environmetrics.

[31]  D. Ruppert The Elements of Statistical Learning: Data Mining, Inference, and Prediction , 2004 .

[32]  R Core Team,et al.  R: A language and environment for statistical computing. , 2014 .

[33]  John A. D. Aston,et al.  Tests for separability in nonparametric covariance operators of random surfaces , 2015, 1505.02023.

[34]  Paul H. C. Eilers,et al.  Flexible smoothing with B-splines and penalties , 1996 .

[35]  Paul H. C. Eilers,et al.  Fast and compact smoothing on large multidimensional grids , 2006, Comput. Stat. Data Anal..

[36]  Piotr Kokoszka,et al.  Detection of change in the spatiotemporal mean function , 2017 .

[37]  Jeffrey S. Morris,et al.  Robust and Gaussian spatial functional regression models for analysis of event-related potentials , 2018, NeuroImage.

[38]  Hadley Wickham,et al.  ggplot2 - Elegant Graphics for Data Analysis (2nd Edition) , 2017 .

[39]  Luo Xiao,et al.  Fast bivariate P‐splines: the sandwich smoother , 2013 .

[40]  L. Waller,et al.  Applied Spatial Statistics for Public Health Data , 2004 .

[41]  Piotr Kokoszka,et al.  Quantifying the risk of heat waves using extreme value theory and spatio-temporal functional data , 2019, Comput. Stat. Data Anal..

[42]  Surajit Ray,et al.  Functional principal component analysis of spatially correlated data , 2014, Stat. Comput..

[43]  S. Wood Generalized Additive Models: An Introduction with R, Second Edition , 2017 .

[44]  F. Giorgi,et al.  Addressing climate information needs at the regional level: the CORDEX framework , 2009 .

[45]  T. Gneiting Compactly Supported Correlation Functions , 2002 .

[46]  Song-You Hong,et al.  The NCEP Regional Spectral Model: An Update , 1997 .

[47]  David Ruppert,et al.  Semiparametric Regression , 2003 .

[48]  F. O’Sullivan A Statistical Perspective on Ill-posed Inverse Problems , 1986 .

[49]  N. Cressie,et al.  Fixed rank kriging for very large spatial data sets , 2008 .

[50]  S. Wood Generalized Additive Models: An Introduction with R , 2006 .

[51]  M. Dubey,et al.  Observed and model simulated 20th century Arctic temperature variability: Canadian Earth System Model CanESM2 , 2011 .

[52]  Matthias Katzfuss,et al.  Multi-Resolution Filters for Massive Spatio-Temporal Data , 2018, Journal of Computational and Graphical Statistics.

[53]  Jordan G. Powers,et al.  A Description of the Advanced Research WRF Version 2 , 2005 .

[54]  Richard G. Jones,et al.  A Regional Climate Change Assessment Program for North America , 2009 .