Effect of spatial outliers on the regression modelling of air pollutant concentrations: A case study in Japan

Abstract Land use regression (LUR) and regression kriging are widely used to estimate the spatial distribution of air pollutants, especially in health studies. Because both methods depend entirely on observations, the quality of the monitoring data is crucial: when the data contain biases or uncertainties, the estimated maps are unreliable. In this study, we apply a spatial outlier detection method that is widely used in soil science to observations of PM2.5 and NO2 from the regulatory monitoring network in Japan. The spatial distributions of annual means are modelled by both LUR and regression kriging, using data sets with and without the detected outliers, and the results are compared to examine the effect of spatial outliers. Spatial outliers markedly degrade prediction accuracy, except for the LUR model for NO2. This difference in effect may be due to the differing characteristics of PM2.5 and NO2; the difference in the number of observations makes only a limited contribution. Although further investigation at different spatial scales is required, our study demonstrates that spatial outlier detection is an effective procedure for air pollutant data and should be applied whenever observation-based prediction methods are used to generate concentration maps.
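
To make the idea of a spatial outlier concrete, the following minimal Python sketch flags monitoring sites whose value departs strongly from that of their nearest neighbours. It is a simplified stand-in and not the detection method used in the study: the function name `flag_spatial_outliers`, the neighbour count `k`, the threshold of 3, and the synthetic 5 x 5 grid of sites are all assumptions made for illustration.

```python
import math
import statistics


def flag_spatial_outliers(sites, k=4, threshold=3.0):
    """Flag sites that differ strongly from their k nearest neighbours.

    sites: list of (x, y, value) tuples (hypothetical layout).
    Returns a list of booleans, one per site, True where flagged.
    Illustrative neighbour-median rule, not the study's method.
    """
    residuals = []
    for i, (xi, yi, vi) in enumerate(sites):
        # Sort all other sites by distance to site i.
        others = sorted(
            (math.hypot(xi - xj, yi - yj), vj)
            for j, (xj, yj, vj) in enumerate(sites)
            if j != i
        )
        # Residual: observed value minus the median of the k nearest values.
        neighbour_median = statistics.median(v for _, v in others[:k])
        residuals.append(vi - neighbour_median)

    # Robust scale of the residuals: median absolute deviation (MAD),
    # multiplied by 1.4826 so it matches the standard deviation
    # for normally distributed data.
    centre = statistics.median(residuals)
    mad = statistics.median(abs(r - centre) for r in residuals)
    scale = 1.4826 * mad
    if scale == 0:
        return [False] * len(sites)
    return [abs(r - centre) / scale > threshold for r in residuals]


# Synthetic example: a smooth concentration gradient on a 5 x 5 grid,
# with one site (index 12, at x=2, y=2) artificially inflated.
sites = [(x, y, 10 + 0.5 * x + 0.3 * y) for x in range(5) for y in range(5)]
x12, y12, v12 = sites[12]
sites[12] = (x12, y12, v12 + 8.0)

flags = flag_spatial_outliers(sites)
cleaned = [s for s, flagged in zip(sites, flags) if not flagged]
```

A prediction model such as LUR or regression kriging could then be fitted once to `sites` and once to `cleaned`, and the two sets of cross-validation errors compared, which mirrors the with-and-without comparison described in the abstract.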
