Space-Time Data Fusion Under Error in Computer Model Output: An Application to Modeling Air Quality

We provide methods for more accurate environmental exposure assessment. In particular, we propose two modeling approaches that combine monitoring data at point level with numerical model output at grid cell level, yielding improved prediction of ambient exposure at point level. Extending our earlier downscaler model (Berrocal, V. J., Gelfand, A. E., and Holland, D. M. (2010b). A spatio-temporal downscaler for outputs from numerical models. Journal of Agricultural, Biological and Environmental Statistics 15, 176-197), these new models address two potential concerns with the model output. The first is that there may be useful information in the outputs for grid cells neighboring the one in which a monitoring location lies. The second is potential spatial misalignment between a station and its putatively associated grid cell. The first model is a Gaussian Markov random field smoothed downscaler that relates monitoring station data and computer model output through a latent Gaussian Markov random field linked to both sources of data. The second model is a smoothed downscaler with spatially varying random weights, defined through a latent Gaussian process and an exponential kernel function; at each site, it yields a new variable on which the monitoring station data are regressed with a spatial linear model. We applied both methods to daily ozone concentration data for the Eastern US during the summer months of June, July and August 2001, obtaining, respectively, a 5% and a 15% gain in overall predictive mean square error over our earlier downscaler model (Berrocal et al., 2010b). Perhaps more importantly, the predictive gain is greater at hold-out sites that are far from monitoring sites.
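To make the idea behind the second model concrete, the following is a minimal sketch of kernel-weighted smoothing of grid-cell output at a single site. It is an illustration only, not the authors' implementation: the function names, the bandwidth parameter, the use of Euclidean distance to cell centroids, and the treatment of the latent process as a fixed list of values are all assumptions made here. In the model described above the weights are random and the latent Gaussian process is estimated, not supplied.

```python
import math

def kernel_weights(site, centroids, bandwidth, latent=None):
    """Normalized weights w_k proportional to exp(-d_k / bandwidth + q_k).

    d_k is the distance from the site to grid-cell centroid k, and q_k is a
    value standing in for a latent process at cell k (hypothetical: zeros by
    default, which reduces to a plain exponential kernel).
    """
    if latent is None:
        latent = [0.0] * len(centroids)
    raw = []
    for (cx, cy), q in zip(centroids, latent):
        d = math.hypot(site[0] - cx, site[1] - cy)
        raw.append(math.exp(-d / bandwidth + q))
    total = sum(raw)
    return [r / total for r in raw]

def smoothed_regressor(site, centroids, model_output, bandwidth, latent=None):
    """Weighted average of grid-cell model output, used as the new covariate."""
    weights = kernel_weights(site, centroids, bandwidth, latent)
    return sum(w * x for w, x in zip(weights, model_output))
```

The resulting value plays the role of the site-level variable on which station observations would then be regressed; a nonzero latent value shifts weight toward or away from a cell regardless of its distance to the site.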

[1] C. F. Sirmans, et al. Nonstationary multivariate process modeling through spatially varying coregionalization, 2004.

[2] Sw. Banerjee, et al. Hierarchical Modeling and Analysis for Spatial Data, 2003.

[3] J. Besag. Spatial Interaction and the Statistical Analysis of Lattice Systems, 1974.

[4] Mike Rees, et al. 5. Statistics for Spatial Data, 1993.

[5] D. Byun, et al. Review of the Governing Equations, Computational Algorithms, and Other Components of the Models-3 Community Multiscale Air Quality (CMAQ) Modeling System, 2006.

[6] Hans Wackernagel, et al. Multivariate Geostatistics: An Introduction with Applications, 1996.

[7] Bradley P. Carlin, et al. Bayesian areal wombling for geographical boundary analysis, 2005.

[8] A. Gelfand, et al. A bivariate space-time downscaler under space and time misalignment, 2010, The Annals of Applied Statistics.

[9] J. Besag, et al. Bayesian Computation and Stochastic Systems, 1995.

[10] Haotian Hang, et al. Inconsistent Estimation and Asymptotically Equal Interpolations in Model-Based Geostatistics, 2004.

[11] Yuhang Wang, et al. Statistical correction and downscaling of chemical transport model ozone forecasts over Atlanta, 2008.

[12] A. Gelfand, et al. High-Resolution Space–Time Ozone Modeling for Assessing Trends, 2007, Journal of the American Statistical Association.

[13] A. Gelfand, et al. Gaussian predictive process models for large spatial data sets, 2008, Journal of the Royal Statistical Society, Series B, Statistical Methodology.

[14] Robert Haining, et al. Statistics for spatial data: by Noel Cressie, 1991, John Wiley & Sons, New York, 900 p., ISBN 0-471-84336-9, US $89.95, 1993.

[15] Minjung Kyung, et al. Bayesian Inference for Directional Conditionally Autoregressive Models, 2009.

[16] Michael A. West, et al. Bayesian Forecasting and Dynamic Models (2nd edn), 1997, J. Oper. Res. Soc.

[17] Alan E. Gelfand, et al. A Spatio-Temporal Downscaler for Output From Numerical Models, 2010, Journal of Agricultural, Biological, and Environmental Statistics.

[18] C. Gotway, et al. Combining Incompatible Spatial Data, 2002.

[19] David Higdon, et al. A process-convolution approach to modelling temperatures in the North Atlantic Ocean, 1998, Environmental and Ecological Statistics.

[20] Montserrat Fuentes, et al. Model Evaluation and Spatial Interpolation by Bayesian Combination of Observations with Outputs from Numerical Models, 2005, Biometrics.

[21] R. Kohn, et al. On Gibbs sampling for state space models, 1994.

[22] Jingyu Feng, et al. Combining numerical model output and particulate data using Bayesian space–time modeling, 2009.

[23] Alan E. Gelfand, et al. Spatial process modelling for univariate and multivariate dynamic spatial data, 2005.

[24] Noel A. C. Cressie, et al. Statistics for Spatial Data, 1993.

[25] James V. Zidek, et al. Combining Measurements and Physical Model Outputs for the Spatial Prediction of Hourly Ozone Space-Time Fields, 2008.