Exploiting spatiotemporal patterns for accurate air quality forecasting using deep learning

Forecasting spatially correlated time series data is challenging because of the linear and non-linear dependencies in both the temporal and spatial dimensions. Air quality forecasting is a canonical example of such a task. Existing approaches, e.g., the auto-regressive integrated moving average (ARIMA) model and artificial neural networks (ANNs), either fail to model the non-linear temporal dependency or cannot effectively account for the spatial relationships among multiple spatial time series. In this paper, we present an approach for forecasting short-term PM2.5 concentrations using a deep learning model, the geo-context based diffusion convolutional recurrent neural network (GC-DCRNN). The model describes the spatial relationship by constructing a graph based on the similarity of the built environment between the locations of air quality sensors. For each location, the similarity is computed from the surrounding geographic features that are "important" in terms of their impact on air quality (e.g., the total area of parks within a 1000-meter buffer, the number of factories within a 500-meter buffer). The model captures the temporal dependency using a sequence-to-sequence encoder-decoder architecture. We evaluate our model on two real-world air quality datasets and observe a consistent improvement of 5%-10% over baseline approaches.
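The graph construction described above can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not state the similarity measure, so the z-score normalization, the Euclidean distance, the Gaussian kernel, and the sparsification threshold below are all assumptions, as are the function name `build_geo_context_graph` and the toy feature values.

```python
import numpy as np


def build_geo_context_graph(features, threshold=0.1):
    """Build a weighted adjacency matrix from per-sensor geographic features.

    features: array of shape (n_sensors, n_features); each row describes the
    built environment around one sensor (e.g., park area within a buffer).
    The kernel and threshold are illustrative assumptions, not taken from
    the paper.
    """
    features = np.asarray(features, dtype=float)

    # Standardize each feature so that differing units are comparable.
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # constant features would otherwise divide by zero
    z = (features - mean) / std

    # Pairwise Euclidean distance between sensors in feature space.
    diff = z[:, None, :] - z[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Gaussian kernel: similar built environments get weights close to 1.
    sigma = dist.std()
    if sigma == 0:
        sigma = 1.0
    weights = np.exp(-(dist ** 2) / (sigma ** 2))

    # Drop weak edges to keep the graph sparse.
    weights[weights < threshold] = 0.0
    return weights


# Hypothetical data: columns are park area (m^2) within a 1000-meter buffer
# and number of factories within a 500-meter buffer.
sensor_features = [
    [12000.0, 0.0],  # sensor 0: park-heavy surroundings
    [11000.0, 1.0],  # sensor 1: similar to sensor 0
    [500.0, 8.0],    # sensor 2: industrial surroundings
    [0.0, 9.0],      # sensor 3: similar to sensor 2
]
adjacency = build_geo_context_graph(sensor_features)
print(np.round(adjacency, 3))
```

In this toy example, sensors with similar surroundings (0 and 1, or 2 and 3) end up strongly connected, while edges between park-heavy and industrial locations fall below the threshold and are removed. A matrix of this kind could then serve as the graph input to a diffusion convolution layer.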
