Valid predictions with confidence estimation in an air pollution problem

This study aims to evaluate levels of air pollution in the Barcelona Metropolitan Region. For this purpose, a recently developed approach called conformal prediction is considered; in particular, use is made of the ridge regression confidence machine (RRCM). The hallmark of this method is that it gives valid estimates, i.e. for a given significance level of prediction, the probability of error does not exceed this level. Moreover, the chosen specification of the RRCM predictor places no requirements on the data distribution, apart from the observations being independent and identically distributed. A linear ridge regression conformal predictor has been applied to the data, which made it possible to obtain valid interval estimates of annual nitrogen dioxide concentrations with 95 % confidence. The linear model provided good results, but to further increase the efficiency of prediction, an RBF kernel has also been used. The data for this study were provided by the XVPCA (Network for Monitoring and Forecasting of Air Pollution) of the Generalitat of Catalonia. The pollutant considered in this paper is nitrogen dioxide, represented by annual average concentrations for the period from 1998 to 2009. This paper also describes an application of ordinary kriging, whose results are compared with those of the ridge regression conformal predictor.
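The validity property described above can be illustrated with a minimal sketch. Note the assumptions: this is not the RRCM used in the study but a simpler split (inductive) conformal predictor built on linear ridge regression, and the data are synthetic rather than XVPCA measurements. The function names, the ridge parameter `a`, and the data sizes are all hypothetical choices made for illustration.

```python
import numpy as np

def ridge_fit(X, y, a=1.0):
    """Closed-form ridge regression: w = (X'X + a*I)^(-1) X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + a * np.eye(p), X.T @ y)

def split_conformal_interval(X_train, y_train, X_cal, y_cal, X_new,
                             epsilon=0.05, a=1.0):
    """Prediction intervals at significance level epsilon.

    Assumption: split conformal prediction, a simpler stand-in for the
    RRCM. The model is fitted on the training part only; absolute
    residuals on a held-out calibration part act as nonconformity scores.
    """
    w = ridge_fit(X_train, y_train, a)
    scores = np.sort(np.abs(y_cal - X_cal @ w))
    n = len(scores)
    # Rank of the calibration score used as the interval half-width.
    k = int(np.ceil((n + 1) * (1 - epsilon)))
    q = scores[k - 1] if k <= n else np.inf
    pred = X_new @ w
    return pred - q, pred + q

# Synthetic i.i.d. data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=1.0, size=400)

X_train, y_train = X[:150], y[:150]
X_cal, y_cal = X[150:300], y[150:300]
X_test, y_test = X[300:], y[300:]

lower, upper = split_conformal_interval(
    X_train, y_train, X_cal, y_cal, X_test, epsilon=0.05
)
# Fraction of test targets that fall inside their intervals.
coverage = np.mean((y_test >= lower) & (y_test <= upper))
print(f"empirical coverage: {coverage:.2f}")
```

With `epsilon=0.05` the intervals are constructed to cover the true value with probability at least 95 % under the i.i.d. assumption; the empirical coverage on a finite test set fluctuates around that level. Unlike the RRCM, this split variant gives intervals of constant width and sacrifices part of the data for calibration.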
