Comparative Assessment of Various Machine Learning‐Based Bias Correction Methods for Numerical Weather Prediction Model Forecasts of Extreme Air Temperatures in Urban Areas

Forecasts of maximum and minimum air temperatures are essential for mitigating the damage caused by extreme weather events such as heat waves and tropical nights. Numerical Weather Prediction (NWP) models are widely used to forecast air temperature, but they generally have a systematic bias due to coarse grid resolution and a lack of parameterizations. This study used random forest (RF), support vector regression (SVR), artificial neural network (ANN), and a multi‐model ensemble (MME) to correct the next‐day maximum and minimum air temperature forecasts (Tmax(t+1) and Tmin(t+1)) of the Local Data Assimilation and Prediction System (LDAPS; a local NWP model over Korea) in Seoul, South Korea. Fourteen LDAPS model forecast variables, the daily maximum and minimum air temperatures from in‐situ observations, and five auxiliary variables were used as inputs. In hindcast validation of the Tmax(t+1) forecast, the LDAPS model had an R² of 0.69, a bias of −0.85 °C, and an RMSE of 2.08 °C, whereas the proposed models improved these to R² values of 0.75 to 0.78, biases of −0.16 to −0.07 °C, and RMSEs of 1.55 to 1.66 °C. For the Tmin(t+1) forecast, the LDAPS model had an R² of 0.77, a bias of 0.51 °C, and an RMSE of 1.43 °C in hindcast validation, while the bias correction models showed R² values of 0.86 to 0.87, biases of −0.03 to 0.03 °C, and RMSEs of 0.98 to 1.02 °C. The MME model generalized better than the three single machine learning models in both hindcast validation and leave‐one‐station‐out cross‐validation.
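The three scores reported above (R², bias, RMSE) and the idea of a multi‐model ensemble can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: bias is taken as forecast minus observation, R² as the squared Pearson correlation, and the ensemble as an equal‐weight mean of its members. The function names and all numbers are hypothetical and are not the study's data or method.

```python
import math

def bias(forecast, observed):
    """Mean error, assuming the convention forecast minus observation."""
    return sum(f - o for f, o in zip(forecast, observed)) / len(observed)

def rmse(forecast, observed):
    """Root mean square error between forecast and observation."""
    n = len(observed)
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / n)

def r_squared(forecast, observed):
    """Squared Pearson correlation (an assumed definition of R^2)."""
    n = len(observed)
    mean_f = sum(forecast) / n
    mean_o = sum(observed) / n
    cov = sum((f - mean_f) * (o - mean_o) for f, o in zip(forecast, observed))
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov ** 2 / (var_f * var_o)

def ensemble_mean(member_forecasts):
    """Equal-weight mean across members (an assumed ensemble scheme)."""
    return [sum(vals) / len(vals) for vals in zip(*member_forecasts)]

# Made-up temperatures in degrees Celsius, for illustration only.
observed = [30.0, 32.0, 28.0, 31.0]
raw_forecast = [29.0, 31.0, 27.5, 30.0]   # stands in for the uncorrected NWP output
members = [                                # stand-ins for three corrected forecasts
    [30.2, 31.6, 28.1, 31.2],
    [29.8, 32.4, 27.9, 30.6],
    [30.0, 32.0, 28.6, 31.2],
]
mme_forecast = ensemble_mean(members)

for name, fc in [("raw", raw_forecast), ("ensemble", mme_forecast)]:
    print(name, round(r_squared(fc, observed), 3),
          round(bias(fc, observed), 3), round(rmse(fc, observed), 3))
```

Computing the same three scores for the raw and the corrected forecasts is what makes statements such as "bias from −0.85 °C to about −0.1 °C" directly comparable.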
