Enhancing Streamflow Forecast and Extracting Insights Using Long‐Short Term Memory Networks With Data Integration at Continental Scales

Recent observations with varied schedules and types (moving average, snapshot, or regularly spaced) can help improve streamflow forecasts, but they are difficult to integrate effectively. Based on a long short-term memory (LSTM) streamflow model, we tested different formulations of a flexible method we call data integration (DI), which integrates recent discharge measurements to improve forecasts. DI accepts lagged inputs either directly or through a convolutional neural network (CNN) unit. DI can ubiquitously elevate streamflow forecast performance to previously unseen levels, reaching a continental-scale median Nash-Sutcliffe coefficient of 0.86. Integrating moving-average discharge, discharge from a few days ago, or even the average discharge of the last calendar month could all improve daily forecasts. Directly using lagged observations as inputs turned out to be comparable in performance to using the CNN unit. Importantly, we obtained valuable insights regarding the hydrologic processes that affect LSTM and DI performance. Before applying DI, the original LSTM worked well in mountainous and snow-dominated regions, but less well in regions with low discharge volumes (due to either low precipitation or high precipitation-energy synchronicity) and large inter-annual storage variability. DI was most beneficial in regions with high flow autocorrelation: it greatly reduced baseflow bias in groundwater-dominated western basins, and it also improved the peaks for basins with dynamic surface water storage, e.g., the Prairie Potholes or Great Lakes regions. However, even DI cannot help high-aridity basins with one-day flash peaks. A deep-learning-based forecast paradigm holds much promise because of its performance, automation, efficiency, and flexibility.
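As a rough illustration of two ideas in the abstract, the sketch below computes the Nash-Sutcliffe coefficient and appends lagged discharge to a forcing matrix, which is the "direct" form of DI input described above. It is a minimal sketch, not the authors' implementation: the function names `nse` and `add_lagged_discharge`, the array layout (time steps by features), and the use of NaN for the unavailable first steps are all assumptions made for this example.

```python
import numpy as np


def nse(obs, sim):
    """Nash-Sutcliffe coefficient: 1 - sum((sim-obs)^2) / sum((obs-mean(obs))^2)."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    # Ignore time steps where either series is missing.
    mask = ~np.isnan(obs) & ~np.isnan(sim)
    obs, sim = obs[mask], sim[mask]
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)


def add_lagged_discharge(forcing, discharge, lag=1):
    """Append discharge observed `lag` steps earlier as an extra input column.

    forcing:   array of shape (T, F), hypothetical meteorological inputs
    discharge: array of shape (T,), observed discharge
    lag:       positive integer number of time steps (assumed >= 1)
    """
    forcing = np.asarray(forcing, dtype=float)
    discharge = np.asarray(discharge, dtype=float)
    lagged = np.full(discharge.shape, np.nan)
    # The first `lag` steps have no earlier observation and stay NaN.
    lagged[lag:] = discharge[:-lag]
    return np.column_stack([forcing, lagged])
```

The resulting matrix could then be fed to any sequence model. A score of 1 means a perfect match, and 0 means the simulation is no better than the mean of the observations, which puts the reported median of 0.86 in context.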
