A case study in preferential sampling: Long term monitoring of air pollution in the UK

Abstract The effects of air pollution are a major concern both in terms of the environment and human health. The majority of information relating to concentrations of air pollution comes from monitoring networks, data from which are used to inform regulatory criteria and in assessing health effects. In the latter case, measurements from the network are interpreted as being representative of levels to which populations are exposed. However, there is the possibility of selection bias if monitoring sites are located in only the most polluted areas, a concept referred to as preferential sampling. Here we examine long-term changes in levels of air pollution from a monitoring network in the UK which was operational from the 1960s until 2006. During this unique period in history, concentrations fell dramatically from levels which would be unrecognisable in the UK today, reflecting changes in the large-scale use of fossil fuels. As levels fell, the network itself was subject to considerable change. We use spatio-temporal models, set within a Bayesian framework using INLA for inference, to model declining concentrations in relation to changes in the network. The results support the hypothesis of preferential sampling, an issue that has largely been ignored in environmental risk analysis.
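
The selection bias described above can be illustrated with a small simulation. The sketch below is not the spatio-temporal model used in the paper; it is a toy example under stated assumptions: concentrations at hypothetical candidate locations are drawn from a log-normal distribution, and monitors are placed either uniformly at random or with probability proportional to `exp(beta * log concentration)`. The function name `simulate`, the parameter `beta`, and all numerical values are illustrative choices, not quantities from the study.

```python
import math
import random


def simulate(n_sites=5000, n_monitors=200, beta=1.5, seed=0):
    """Compare network means under uniform and preferential site selection.

    All parameters are hypothetical; beta > 0 means sites with higher
    concentrations are more likely to receive a monitor.
    """
    rng = random.Random(seed)

    # Assumed log-normal concentrations at candidate locations.
    log_conc = [rng.gauss(3.0, 0.8) for _ in range(n_sites)]
    conc = [math.exp(v) for v in log_conc]
    true_mean = sum(conc) / n_sites

    # Non-preferential network: monitors placed uniformly at random.
    uniform_idx = rng.sample(range(n_sites), n_monitors)
    uniform_mean = sum(conc[i] for i in uniform_idx) / n_monitors

    # Preferential network: selection weight grows with concentration.
    # Sampling is with replacement here purely to keep the sketch short.
    weights = [math.exp(beta * v) for v in log_conc]
    pref_idx = rng.choices(range(n_sites), weights=weights, k=n_monitors)
    pref_mean = sum(conc[i] for i in pref_idx) / n_monitors

    return true_mean, uniform_mean, pref_mean


if __name__ == "__main__":
    true_mean, uniform_mean, pref_mean = simulate()
    print(f"mean over all locations:     {true_mean:.1f}")
    print(f"mean, uniform network:       {uniform_mean:.1f}")
    print(f"mean, preferential network:  {pref_mean:.1f}")
```

With `beta > 0` the preferential network's average overstates the mean over all locations, which is the sense in which network measurements may not be representative of population exposure. Setting `beta = 0` makes both schemes equivalent in expectation.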
