Light Gradient Boosting Machine: An efficient soft computing model for estimating daily reference evapotranspiration with local and external meteorological data

Abstract: Accurate estimation of reference evapotranspiration (ETo) is required in many fields, e.g. irrigation scheduling design, agricultural water management, crop growth modeling and drought assessment. Nevertheless, reliable estimation of ETo is difficult when complete or long-term meteorological data are lacking at the target station. This study evaluated the efficiency of a new tree-based soft computing model, Light Gradient Boosting Machine (LightGBM), for estimating daily ETo using limited local (target-station) and external (cross-station) meteorological data from 49 weather stations in the humid subtropical region of China, including 16 in Jiangxi Province and 33 others elsewhere in the region. The performance of LightGBM was compared with the tree-based M5 Model Tree (M5Tree) and Random Forests (RF) as well as four empirical models (Hargreaves-Samani, Tabari, Makkink and Trabert). Eight input combinations of daily meteorological data for the period 2001–2015 were used to test the models; the variables included maximum temperature (Tmax), minimum temperature (Tmin), relative humidity (Hr), wind speed at 2 m height (U2), extraterrestrial solar radiation (Ra) and global solar radiation (Rs) calculated from sunshine duration (n). The results showed that LightGBM was superior to M5Tree and RF in local applications under all input combinations during testing, with average root mean square errors (RMSE) of 0.08–0.58 mm d−1 for LightGBM, 0.11–0.62 mm d−1 for M5Tree and 0.13–0.60 mm d−1 for RF. M5Tree performed slightly better than RF under input combinations 6–8, whereas RF outperformed M5Tree under the other input combinations. However, all three soft computing models produced much better daily ETo estimates than the corresponding empirical models with the same input variables. Rs was the most influential meteorological variable for daily ETo estimation in this region, followed by Tmax and Tmin, Hr and finally U2.
In external applications, LightGBM also generally performed better than the RF, M5Tree and empirical models. Soft computing models developed with meteorological data from Station 57894, whose climatic characteristics were the most similar to those of the other stations, gave satisfactory ETo estimates for the 15 cross stations in Jiangxi Province, and even for the other 33 stations across the humid subtropical region of China. LightGBM proved efficient and exhibited good generalization capability in both local and external applications, and is thus recommended as an alternative model for daily ETo estimation.
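
As an illustration of how an empirical baseline and the RMSE metric mentioned above can be computed, the following is a minimal sketch rather than the authors' implementation. It assumes the commonly published form of the Hargreaves-Samani equation, ETo = 0.0023 · 0.408 · Ra · (Tmean + 17.8) · sqrt(Tmax − Tmin), with Ra in MJ m−2 d−1; the function names and all input values are hypothetical and are not taken from the study.

```python
import math


def hargreaves_samani(tmax, tmin, ra_mj):
    """Daily ETo (mm d-1) from the commonly published Hargreaves-Samani form.

    tmax, tmin: daily maximum and minimum air temperature (deg C).
    ra_mj: extraterrestrial radiation (MJ m-2 d-1); the factor 0.408
    converts it to equivalent evaporation in mm d-1.
    """
    tmean = (tmax + tmin) / 2.0
    # Guard against tmax < tmin in noisy records (assumption of this sketch).
    temp_range = max(tmax - tmin, 0.0)
    return 0.0023 * 0.408 * ra_mj * (tmean + 17.8) * math.sqrt(temp_range)


def rmse(observed, estimated):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(estimated) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - e) ** 2 for o, e in zip(observed, estimated)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical daily records: (Tmax, Tmin, Ra). Not data from the study.
    days = [(30.0, 20.0, 40.0), (25.0, 18.0, 35.0), (18.0, 10.0, 25.0)]
    # Hypothetical benchmark ETo values (mm d-1), for illustration only.
    benchmark = [4.6, 3.4, 2.1]
    estimates = [hargreaves_samani(*d) for d in days]
    print("HS estimates (mm d-1):", [round(e, 2) for e in estimates])
    print("RMSE (mm d-1):", round(rmse(benchmark, estimates), 2))
```

The same `rmse` helper could score the predictions of any of the soft computing models against a benchmark ETo series, which is how the models are ranked in the abstract.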
