A Feature Extraction and Classification Method to Forecast the PM2.5 Variation Trend Using Candlestick and Visual Geometry Group Model

Predicting continuous changes in PM2.5 concentration is currently a hotspot in air pollution research. Combining physical methods with deep learning models to divide the PM2.5 pollution process into multiple effective types is necessary for reliable prediction of PM2.5 values. Therefore, a candlestick chart sample generator was designed to generate candlestick charts from the online continuous PM2.5 monitoring data of the Guilin monitoring station. Analysis of the generated candlestick charts with the Gaussian diffusion model showed that they reflect the characteristics of the physical transmission process of PM2.5 pollutants. Using a fixed three-day period and the time linear convolution method, 2188 sets of candlestick chart data were obtained from the 2013–2018 PM2.5 concentration data. Unsupervised classification generated 16 categories that met the established classification criteria. Statistical analysis showed that the accuracy rate of the change trend of these categories during the next period reached 99.68%. The Visual Geometry Group (VGG) model, an improved convolutional neural network model, was then trained on the candlestick chart data to perform the classification. The experimental results showed that the overall accuracy (OA) of the candlestick chart combination classification was 96.19%, and the Kappa coefficient was 0.960. With the VGG model, the overall accuracy improved by 1.93%, on average, compared with the support vector machine (SVM), LeNet, and AlexNet models. According to the experimental results, using the VGG classification method to classify continuous pollution data in the form of candlestick charts more comprehensively retains the characteristics of the physical pollution process and provides a classification basis for accurately predicting PM2.5 values. The statistical feasibility of the method was also demonstrated.
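The candlestick generation step can be illustrated with a minimal sketch. It is not the authors' generator: it assumes that each candlestick summarizes one calendar day of readings as open, high, low, and close values, and that the "time linear convolution method" behaves like a window of three consecutive days advanced one day at a time. The names `Candle`, `daily_candles`, and `sliding_windows` are hypothetical.

```python
from typing import Dict, List, NamedTuple, Sequence, Tuple


class Candle(NamedTuple):
    """One day of PM2.5 readings summarized as a candlestick (assumed layout)."""
    day: str
    open: float   # first reading of the day
    high: float   # maximum reading of the day
    low: float    # minimum reading of the day
    close: float  # last reading of the day


def daily_candles(readings: Sequence[Tuple[str, float]]) -> List[Candle]:
    """Group chronologically sorted (ISO timestamp, PM2.5) pairs by calendar day."""
    by_day: Dict[str, List[float]] = {}
    for timestamp, value in readings:
        # The first ten characters of an ISO timestamp are the date, YYYY-MM-DD.
        by_day.setdefault(timestamp[:10], []).append(value)
    # Dicts keep insertion order, so the days stay in chronological order.
    return [
        Candle(day, values[0], max(values), min(values), values[-1])
        for day, values in by_day.items()
    ]


def sliding_windows(candles: Sequence[Candle], size: int = 3) -> List[List[Candle]]:
    """Return every run of `size` consecutive candles, advancing one day per step."""
    return [list(candles[i:i + size]) for i in range(len(candles) - size + 1)]


if __name__ == "__main__":
    # Made-up readings for illustration only; not data from the study.
    sample = [
        ("2013-01-01T00:00", 40.0), ("2013-01-01T12:00", 75.0), ("2013-01-01T23:00", 60.0),
        ("2013-01-02T00:00", 58.0), ("2013-01-02T12:00", 90.0), ("2013-01-02T23:00", 82.0),
        ("2013-01-03T00:00", 80.0), ("2013-01-03T12:00", 65.0), ("2013-01-03T23:00", 50.0),
        ("2013-01-04T00:00", 48.0), ("2013-01-04T12:00", 30.0), ("2013-01-04T23:00", 35.0),
    ]
    for window in sliding_windows(daily_candles(sample)):
        print([candle.day for candle in window])
```

Under these assumptions, N days of data yield N - 2 three-day samples, which is the same order of magnitude as the 2188 samples reported for 2013–2018. Each window would then be rendered as a candlestick chart image before being passed to the classifier.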
