Point source influence on observed extreme pollution levels in a monitoring network

Abstract This paper presents a strategy to quantify the influence that major point sources in a region have on extreme pollution values observed at each monitor in a network. We focus on the number of hours in a day that levels at a monitor exceed a specified health threshold. The daily exceedance counts are modeled using observation-driven negative binomial time series regression models, with a zero-inflation component to characterize the probability of no exceedances on a particular day. The spatial nature of the problem is addressed through a Gaussian plume model for atmospheric dispersion, computed at the locations of known emissions, which creates covariates that impact exceedances. To isolate the influence of emitters at individual monitors, we fit separate regression models to the series of counts from each monitor. Finally, we apply a model clustering step that groups monitor series exhibiting similar behavior with respect to mean, variability, and common contributors, in support of policy decision making. The methodology is applied to eight benzene pollution series measured at air quality monitors around the Houston ship channel, a major industrial port.
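
As a rough illustration of the two building blocks named in the abstract, the sketch below computes a textbook Gaussian plume concentration for a single point source and evaluates a zero-inflated negative binomial probability for a daily exceedance count. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the power-law dispersion coefficients (`a`, `b`, `c`, `d`), and the default stack height are hypothetical, and the observation-driven time series component is omitted.

```python
import math


def plume_concentration(q, u, x, y, z=0.0, h=10.0,
                        a=0.08, b=0.9, c=0.06, d=0.85):
    """Textbook Gaussian plume concentration with ground reflection.

    q: emission rate, u: wind speed, x: downwind distance,
    y: crosswind distance, z: receptor height, h: effective stack height.
    The sigma power laws (a, b, c, d) are illustrative placeholders,
    not values taken from the paper.
    """
    if x <= 0:
        return 0.0  # receptor is upwind of the source: no contribution
    sigma_y = a * x ** b  # assumed lateral dispersion
    sigma_z = c * x ** d  # assumed vertical dispersion
    lateral = math.exp(-y ** 2 / (2 * sigma_y ** 2))
    vertical = (math.exp(-(z - h) ** 2 / (2 * sigma_z ** 2))
                + math.exp(-(z + h) ** 2 / (2 * sigma_z ** 2)))
    return q / (2 * math.pi * u * sigma_y * sigma_z) * lateral * vertical


def zinb_pmf(k, mu, size, pi_zero):
    """Zero-inflated negative binomial probability of k exceedances.

    mu: negative binomial mean, size: dispersion parameter,
    pi_zero: probability of a structural zero (a day with no exceedances).
    """
    log_nb = (math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
              + size * math.log(size / (size + mu))
              + k * math.log(mu / (size + mu)))
    nb = math.exp(log_nb)
    return (pi_zero if k == 0 else 0.0) + (1 - pi_zero) * nb
```

In a regression setting, a plume value such as `plume_concentration(q, u, x, y)` for each known emitter could serve as a covariate entering the mean `mu` through a log link; that linkage is an assumption here rather than a detail stated in the abstract.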
