What Role Does Hydrological Science Play in the Age of Machine Learning?

This paper is derived from a keynote talk given at Google’s 2020 Flood Forecasting Meets Machine Learning Workshop. Recent experiments applying deep learning to rainfall-runoff simulation indicate that there is significantly more information in large-scale hydrological data sets than hydrologists have been able to translate into theory or models. While there is growing interest in machine learning in the hydrological sciences community, in many ways our community still holds deeply subjective and non-evidence-based preferences for models based on a certain type of ‘process understanding’ that has historically not translated into accurate theory, models, or predictions. This commentary is a call to action for the hydrology community to focus on developing a quantitative understanding of where and when hydrological process understanding is valuable in a modeling discipline increasingly dominated by machine learning. We offer some potential perspectives and preliminary examples about how this might be accomplished.
