Monthly Streamflow Prediction by Metaheuristic Regression Approaches Considering Satellite Precipitation Data

This study investigates the viability of three metaheuristic regression techniques, CatBoost (CB), random forest (RF) and extreme gradient tree boosting (XGBoost, XGB), for predicting monthly streamflow using satellite precipitation data. Monthly streamflow data from three measuring stations in Turkey and satellite rainfall data derived from the Tropical Rainfall Measuring Mission (TRMM) were used as model inputs to predict streamflow 1 month ahead. Such predictions are crucial for decision-making in water resource planning and management, including water allocation, water market planning, restricting water supply and managing drought. The outcomes of the metaheuristic regression methods were compared with those of artificial neural networks (ANN) and nonlinear regression (NLR). The effect of the periodicity component was also investigated by including the month number of the streamflow data as an input. In the first part of the study, the streamflow at each station was predicted using the CB, RF, XGB, ANN and NLR methods with TRMM data. In the second part, streamflow at the downstream station was predicted using data from the upstream stations. In both parts, the CB and XGB methods generally provided similar accuracy and outperformed the RF, ANN and NLR methods. The use of TRMM rainfall data and the periodicity component considerably improved the accuracy of the metaheuristic regression methods in predicting streamflow. On average, using TRMM data as inputs reduced the root mean square error (RMSE) of CB, RF and XGB by 36%, 31% and 24%, respectively, while introducing periodicity information into the models' inputs reduced it by 37%, 18% and 43%.
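The input configuration and the error comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`build_samples`, `rmse`, `rmse_improvement_pct`), the use of a single lag, and the choice to attach the month number of the input time step are all assumptions made for illustration. Any regressor (CB, RF, XGB, ANN or NLR) could then be fitted to the resulting samples.

```python
import math


def build_samples(flow, rain, start_month=1, use_rain=True, use_month=True):
    """Pair inputs observed at month t with the streamflow target at month t+1.

    flow, rain: equal-length monthly series (hypothetical data layout).
    start_month: calendar month (1-12) of the first record (assumed known).
    """
    samples = []
    for t in range(len(flow) - 1):
        features = [flow[t]]                      # streamflow at month t
        if use_rain:
            features.append(rain[t])              # satellite rainfall at month t
        if use_month:
            # periodicity component: month number (1-12) of the input step
            features.append((start_month - 1 + t) % 12 + 1)
        samples.append((features, flow[t + 1]))   # target: 1 month ahead
    return samples


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_improvement_pct(rmse_base, rmse_new):
    """Percentage reduction in RMSE relative to a baseline input set."""
    return 100.0 * (rmse_base - rmse_new) / rmse_base
```

For example, if a model had an RMSE of 10.0 without rainfall inputs and 6.4 with them (illustrative numbers, not taken from the study), `rmse_improvement_pct(10.0, 6.4)` would report a reduction of about 36%.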
