Can artificial intelligence and data-driven machine learning models match or even replace process-driven hydrologic models for streamflow simulation?: A case study of four watersheds with different hydro-climatic regions across the CONUS

Abstract With recent developments in computational techniques, Data-driven Machine Learning Models (DMLs) have shown great potential in simulating streamflow and capturing the rainfall-runoff relationship in given watersheds, tasks traditionally fulfilled by Process-based Hydrologic Models (PHMs). Whether DMLs can outperform and possibly replace the classical PHMs for streamflow simulation and river forecasting is still debated, and no clear conclusion has been reached. This study investigates whether the newer DMLs have any potential to further improve the simulation accuracy of classical PHMs, and vice versa. To do this, we compared several popular PHMs and DMLs over four watersheds across the Continental US (CONUS) that are associated with different input, climate, and regional conditions. A total of five models were chosen: (1) two classical lumped models, i.e., the Sacramento Soil Moisture Accounting (SAC-SMA) and Xinanjiang (XAJ) models; (2) one modern distributed model, termed Coupled Routing and Excess Storage (CREST); and (3) two DMLs, namely an Artificial Neural Network (ANN) and a deep learning model, termed Long Short-Term Memory (LSTM). Our results demonstrated that the DMLs were still significantly biased when given the same baseline input scenario as the PHMs. However, the DMLs fed with delayed input scenarios showed great potential and could reach high simulation accuracy. The DMLs, especially the ANN, outperformed the other employed models where the rainfall-runoff relationship is dominantly driven by rainfall. The DMLs also performed better in the high-flow regime, while the PHMs performed better in the low-flow regime, implying that both PHMs and DMLs have their own merits and are worthy of joint development.
In general, our study indicated great potential for using DMLs to simulate streamflow, but further studies are still needed to verify the transferability and scalability of DMLs in large-scale experiments, such as ones modeled on the Distributed Model Intercomparison Projects 1&2 conducted by the National Weather Service but designed to compare modern DMLs and PHMs.
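
The abstract does not spell out how the delayed input scenarios were constructed. As a minimal sketch, assuming "delayed input" means that each time step's forcing is augmented with the forcing from a fixed number of preceding time steps, the inputs for an ANN could be assembled as below. The function name `make_lagged_inputs`, the parameter `n_lags`, and the sample precipitation and evapotranspiration values are all hypothetical and are not taken from the study.

```python
def make_lagged_inputs(forcing, n_lags):
    """Build delayed (lagged) input rows from a forcing time series.

    forcing: sequence of per-time-step tuples, e.g. (precipitation, PET).
    n_lags:  number of preceding time steps appended to each row (assumed
             meaning of a "delayed input scenario"; not from the paper).

    Returns one row per time step t >= n_lags, ordered as
    [forcing[t], forcing[t-1], ..., forcing[t-n_lags]], flattened.
    """
    n_steps = len(forcing)
    if n_lags < 0 or n_lags >= n_steps:
        raise ValueError("n_lags must satisfy 0 <= n_lags < len(forcing)")

    rows = []
    for t in range(n_lags, n_steps):
        row = []
        for k in range(n_lags + 1):
            # Append the forcing observed k steps before time t.
            row.extend(forcing[t - k])
        rows.append(row)
    return rows


# Hypothetical daily (precipitation, potential evapotranspiration) values.
precip_pet = [(1.0, 0.1), (2.0, 0.2), (3.0, 0.3), (4.0, 0.4)]
lagged = make_lagged_inputs(precip_pet, n_lags=1)
for row in lagged:
    print(row)
```

With `n_lags=0` the function reduces to the baseline scenario in which the model sees only the current time step's forcing. The first `n_lags` time steps are dropped because they lack a complete history.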
