Using wavelet transform and dynamic time warping to identify the limitations of the CNN model as an air quality forecasting system

Abstract. As deep learning has become a popular data-analytic technique, atmospheric scientists need a balanced understanding of its strengths and limitations so that they can analyze complex data with well-established procedures. Despite the enormous success of deep learning in numerous applications, certain issues in its use for air quality forecasting (AQF) require further analysis and discussion. This study addresses significant limitations of an advanced deep learning algorithm, the convolutional neural network (CNN), in two common applications: (i) as a real-time AQF model, and (ii) as a post-processing tool for a dynamical AQF model, the Community Multi-scale Air Quality Model (CMAQ). In both cases, the CNN model shows promising accuracy for ozone prediction 24 hours in advance in both the United States and South Korea, with an overall index of agreement (IOA) exceeding 0.8. For the first case, we use the wavelet transform to determine the reasons behind the poor performance of the CNN during the nighttime, cold months, and high-ozone episodes. We find that the CNN model produces less accurate forecasts when fine wavelet modes (hourly and daily) are relatively weak or when coarse wavelet modes (weekly) are strong. For the second case, we use dynamic time warping (DTW) distance analysis to compare post-processed results with their CMAQ counterparts (the base model). For CMAQ results that show a consistent DTW distance from the observations, the post-processing approach properly addresses the modeling bias, with predicted IOAs exceeding 0.85. When the CMAQ-versus-observation DTW distance is irregular, the post-processing approach is unlikely to perform satisfactorily. Awareness of these limitations of CNN models will enable scientists to develop more accurate regional or local air quality forecasting systems by identifying the factors that affect high-concentration episodes.
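
The two evaluation quantities named in the abstract, the index of agreement and the DTW distance, can be illustrated with a minimal Python sketch. The abstract does not state which variant of either the study uses, so the choices here are assumptions: the index of agreement follows Willmott's original form, and the DTW distance uses the textbook dynamic-programming recurrence with an absolute-difference local cost and no warping window. The function names and the sample ozone values are hypothetical and are not taken from the study.

```python
from math import inf


def index_of_agreement(obs, pred):
    """Index of agreement in Willmott's original form (assumed variant).

    d = 1 - sum((P - O)^2) / sum((|P - mean(O)| + |O - mean(O)|)^2)
    A value of 1 means the prediction matches the observations exactly.
    """
    if len(obs) != len(pred) or len(obs) == 0:
        raise ValueError("obs and pred must be non-empty and of equal length")
    mean_obs = sum(obs) / len(obs)
    numerator = sum((p - o) ** 2 for o, p in zip(obs, pred))
    denominator = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2 for o, p in zip(obs, pred)
    )
    if denominator == 0:
        raise ValueError("index of agreement is undefined for this input")
    return 1.0 - numerator / denominator


def dtw_distance(a, b):
    """Classic DTW distance with |x - y| local cost and no window (assumed)."""
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both series must be non-empty")
    # prev[j] holds the accumulated cost of aligning a[:i-1] with b[:j].
    prev = [inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


# Hypothetical hourly ozone values (ppb), for illustration only.
observed = [31.0, 35.0, 42.0, 50.0, 47.0, 39.0]
modeled = [28.0, 30.0, 36.0, 44.0, 49.0, 45.0]

ioa = index_of_agreement(observed, modeled)
dtw = dtw_distance(modeled, observed)
print(f"IOA = {ioa:.3f}, DTW distance = {dtw:.1f}")
```

Because DTW allows one series to be stretched or compressed in time before the two are compared, a model that reproduces the observed shape with a timing offset gets a smaller distance than a point-by-point error would suggest. Computing the distance over successive windows, rather than once over the whole record, is one way to judge whether the model-versus-observation distance is consistent or irregular.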
