A deep learning approach to real-time CO concentration prediction at signalized intersection

Abstract Vehicle exhaust emissions at signalized intersections are a major source of traffic-related pollution exposure for pedestrians. It is therefore critical to predict traffic emissions, especially hazardous CO, with practical and accurate methods. However, CO emission and concentration at crosswalks depend on traffic conditions in complex ways, which makes predicting CO concentration a challenging task for traditional statistical models. To this end, this study proposes a hybrid machine learning framework to investigate the concentration of CO emissions at pedestrian crosswalks. The proposed method first ranks the key influencing factors with a random forest approach. A prediction model based on multivariate Long Short-Term Memory (LSTM) neural networks is then developed using the selected factors. Field data are collected at an intersection for model training and validation. The autoregressive integrated moving average (ARIMA), support vector machine (SVM), radial basis function network (RBFN), nonlinear vector autoregressive (VAR), and gated recurrent unit (GRU) neural network models are selected as benchmarks to verify the performance of the proposed model. The root mean square error (RMSE), mean absolute error (MAE), and R-squared are calculated to evaluate model performance comprehensively. The results indicate that the proposed model outperforms the benchmark models in terms of prediction accuracy.
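The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions of RMSE, MAE, and R-squared; the function names and the sample CO values are hypothetical and are not taken from the study's data or code.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical CO readings and model predictions, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

print(rmse(observed, predicted))
print(mae(observed, predicted))
print(r_squared(observed, predicted))
```

Lower RMSE and MAE and a higher R-squared indicate a closer fit, which is the sense in which the proposed model is compared against the benchmarks.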
