Investigation of hydrometeorological influences on reservoir releases using explainable machine learning methods

Long short-term memory (LSTM) networks have been applied successfully to predict reservoir releases accurately and efficiently from hydrometeorological drivers including reservoir storage, inflow, precipitation, and temperature. However, because LSTMs are black-box models with no process-based implementation, it is unclear whether they make good predictions for the right reasons. In this work, we use an explainable machine learning (ML) method, SHapley Additive exPlanations (SHAP), to evaluate the variable importance and the variable-wise temporal importance in the LSTM model predictions. In an application to 30 reservoirs in the Upper Colorado River Basin, United States, we show that LSTM can accurately predict reservoir releases, with a Nash-Sutcliffe efficiency (NSE) ≥ 0.69 for all of the considered reservoirs despite their diverse storage sizes, functions, elevations, etc. Additionally, SHAP indicates that storage and inflow are more influential than precipitation and temperature. Moreover, storage and inflow show a relatively long-term influence on the release, up to 7 days, and for most reservoirs this influence decreases as the lag time increases. These findings from SHAP are consistent with our physical understanding. However, in a few reservoirs, SHAP gives some temporal importances that are difficult to interpret from a hydrological point of view, probably because it ignores interactions between variables. SHAP is a useful tool for explaining black-box ML models, but the hydrological processes inferred from its results should be interpreted cautiously. More investigation of SHAP and its applications in hydrological modeling is needed and will be pursued in our future work.
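
The NSE threshold quoted above is easier to read with the metric written out. The following is a minimal sketch of the standard Nash-Sutcliffe efficiency, NSE = 1 - Σ(obs - sim)² / Σ(obs - mean(obs))², and not the authors' evaluation code. The function name and the release values are hypothetical and serve only as illustration.

```python
def nash_sutcliffe_efficiency(observed, simulated):
    """Return the Nash-Sutcliffe efficiency of simulated vs. observed values.

    NSE = 1 means a perfect match; NSE = 0 means the simulation is no better
    than always predicting the mean of the observations.
    """
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("observed and simulated must be non-empty and equal in length")

    mean_observed = sum(observed) / len(observed)
    # Sum of squared errors between observations and simulation.
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Sum of squared deviations of the observations from their mean.
    observed_variance = sum((o - mean_observed) ** 2 for o in observed)
    if observed_variance == 0:
        raise ValueError("NSE is undefined when all observations are identical")

    return 1.0 - squared_error / observed_variance


# Hypothetical daily releases (arbitrary units), for illustration only.
observed_release = [10.0, 12.0, 15.0, 11.0, 9.0]
simulated_release = [11.0, 12.5, 14.0, 10.0, 9.5]

nse = nash_sutcliffe_efficiency(observed_release, simulated_release)
print(f"NSE = {nse:.3f}")
```

Under this definition, NSE ≥ 0.69 means that the squared prediction error is at most 31% of the variance of the observed releases.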
