Climate-driven Model Based on Long Short-Term Memory and Bayesian Optimization for Multi-day-ahead Daily Streamflow Forecasting

Many previous studies have developed decomposition and ensemble models to improve runoff forecasting performance. However, these decomposition-based models usually introduce large decomposition errors into the modeling process. Although the variation in runoff time series is largely driven by climate change, most previous studies that consider climate change focus only on rainfall-runoff modeling, with few meteorological factors as inputs. Therefore, a climate-driven streamflow forecasting (CDSF) framework is proposed to improve runoff forecasting accuracy. This framework is realized using principal component analysis (PCA), long short-term memory (LSTM), and Bayesian optimization (BO), and is referred to as PCA-LSTM-BO. To validate the effectiveness and superiority of the PCA-LSTM-BO method, it was compared with one autoregressive LSTM model and two other CDSF models based on PCA, BO, and either support vector regression (SVR) or gradient boosting regression trees (GBRT), namely PCA-SVR-BO and PCA-GBRT-BO, respectively. A generalization performance index based on the Nash-Sutcliffe efficiency (NSE), called the GI(NSE) value, is proposed to evaluate the generalizability of the model. The results show that (1) the proposed model is significantly better than the benchmark models in terms of the mean square error (MSE <= 185.782), NSE (>= 0.819), and GI(NSE) (<= 0.223) for all the forecasting scenarios; (2) the PCA in the CDSF framework can improve the forecasting capacity and generalizability; (3) the CDSF framework is superior to the autoregressive LSTM models for all the forecasting scenarios; and (4) the GI(NSE) value is effective in selecting the optimal model with better generalizability.
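
As a point of reference for the evaluation criteria named above, the following is a minimal sketch of MSE and NSE computed from their standard textbook definitions. The function names `mse` and `nse` and the sample flow values are illustrative assumptions, not taken from the study. GI(NSE) is not implemented because its formula is not given in this abstract.

```python
from statistics import fmean


def _check(observed, simulated):
    # Both series must be non-empty and aligned day by day.
    if not observed or len(observed) != len(simulated):
        raise ValueError("series must be non-empty and of equal length")


def mse(observed, simulated):
    """Mean square error between observed and simulated streamflow."""
    _check(observed, simulated)
    return fmean((o - s) ** 2 for o, s in zip(observed, simulated))


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((o - s)^2) / sum((o - mean(o))^2)."""
    _check(observed, simulated)
    mean_obs = fmean(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    if sst == 0:
        raise ValueError("NSE is undefined for a constant observed series")
    return 1.0 - sse / sst


# Hypothetical daily flows, used only to exercise the functions.
observed = [10.0, 12.0, 15.0, 11.0, 9.0]
simulated = [11.0, 12.5, 14.0, 10.0, 9.5]

print("MSE:", mse(observed, simulated))
print("NSE:", nse(observed, simulated))
```

An NSE of 1 indicates a perfect match, while an NSE of 0 means the forecast is no better than always predicting the mean of the observations; lower MSE and higher NSE both indicate a better forecast.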