Upper and lower benchmarks in hydrological modelling

When assessing the performance of a hydrological model, a question that can be raised is: how good is really good? Despite several calls to use benchmarks (Pappenberger, Ramos, Cloke, & Wetterhall, 2015; Schaefli & Gupta, 2007; Seibert, 2001), model performance in the scientific literature, in conference presentations, and in discussions among hydrological modellers is still often judged solely on the value of some performance measure. For instance, a model is rated as well-performing because its model efficiency (Nash & Sutcliffe, 1970) values are above 0.7. Some authors (e.g., Moriasi et al., 2007; Ritter & Muñoz-Carpena, 2013) even suggest performance classes based on model efficiency values. Based on our experience with the application of hydrological models to catchments with widely varying characteristics, we argue that such judgments on model performance can only be made if model performances are related to benchmarks that represent what could and should be expected.

The idea of using benchmarks is by no means new. In fact, the most commonly used performance measure in hydrological modelling, the model efficiency or Nash-Sutcliffe efficiency (Nash & Sutcliffe, 1970), can be interpreted as a comparison of the model simulations with a constant streamflow equal to the observed mean streamflow (lower benchmark) and a perfect fit (upper benchmark). Obviously, this lower benchmark is not too hard to beat, whereas this upper benchmark is hardly achievable in practice. To better evaluate how good model simulations are, more informative lower benchmarks have been suggested (Garrick, Cunnane, & Nash, 1978; Schaefli & Gupta, 2007; Seibert, 2001). However, the use of benchmarks that take into account what is possible with the data, that is, what could and should be expected, is still not common practice in hydrological modelling.

In hydrological modelling, it is never possible to obtain a perfect model fit. This is partly due to the complexity of the processes in nature, but also due to errors in the observations of the driving data and streamflow. Therefore, the upper benchmark should not be an unrealistic perfect simulation but should take potential errors in the data into account. On the other hand, there is usually also a lower limit on how bad a model can be, simply because the driving data ensure that even a simple simulation captures part of the observed streamflow dynamics.
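This benchmark interpretation can be made explicit. The Nash-Sutcliffe efficiency relates the squared errors of the simulated streamflow $Q_{sim,t}$ to those of the mean-flow benchmark,

$$NSE = 1 - \frac{\sum_t (Q_{obs,t} - Q_{sim,t})^2}{\sum_t (Q_{obs,t} - \bar{Q}_{obs})^2},$$

and Schaefli and Gupta (2007) generalize this to an efficiency relative to an arbitrary benchmark series $Q_{bench,t}$,

$$E_{bench} = 1 - \frac{\sum_t (Q_{obs,t} - Q_{sim,t})^2}{\sum_t (Q_{obs,t} - Q_{bench,t})^2},$$

so that values above zero indicate that the model outperforms the chosen benchmark. As a minimal illustration (a sketch, not part of the original text; the function and variable names are our own), the computation can be written in a few lines of Python; with the observed mean streamflow as benchmark it reduces to the classical NSE:

    import numpy as np

    def benchmark_efficiency(q_obs, q_sim, q_bench):
        """Efficiency of a simulation relative to a benchmark series.

        With q_bench equal to the observed mean streamflow, this equals the
        classical Nash-Sutcliffe efficiency (Nash & Sutcliffe, 1970).
        """
        q_obs, q_sim, q_bench = map(np.asarray, (q_obs, q_sim, q_bench))
        return 1.0 - np.sum((q_obs - q_sim) ** 2) / np.sum((q_obs - q_bench) ** 2)

    # Short synthetic example: the observed mean as the (weak) lower benchmark
    q_obs = np.array([1.2, 3.4, 2.1, 0.9, 1.5])
    q_sim = np.array([1.0, 3.0, 2.5, 1.1, 1.4])
    print(benchmark_efficiency(q_obs, q_sim, np.full_like(q_obs, q_obs.mean())))

A more demanding lower benchmark than the observed mean (e.g., a calibrated simple model or seasonal mean flows) therefore provides a stricter and more informative test.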

References

[1] M. Perry, et al. The generation of monthly gridded datasets for a range of climatic variables over the UK, 2005.

[2] P. Hamel, et al. Potential effects of landscape change on water supplies in the presence of reservoir storage, 2017.

[3] J. Nash, et al. A criterion of efficiency for rainfall-runoff models, 1978.

[4] J. Nash, et al. River flow forecasting through conceptual models part I — A discussion of principles, 1970.

[5] M. P. Clark, et al. Development of a large-sample watershed-scale hydrometeorological data set for the contiguous USA: data set characteristics and assessment of regional variability in hydrologic model performance, 2014.

[6] M.-H. Ramos, et al. How do I know if my forecasts are better? Using benchmarks in hydrological ensemble prediction, 2015.

[7] P. J. Smith, et al. A novel framework for discharge uncertainty quantification applied to 500 UK gauging stations, 2015, Water Resources Research.

[8] J. Refsgaard, et al. Operational Validation and Intercomparison of Different Types of Hydrological Models, 1996.

[9] V. Bell, et al. From Catchment to National Scale Rainfall-Runoff Modelling: Demonstration of a Hydrological Modelling Framework, 2014.

[10] J. Seibert. On the need for benchmarks in hydrological modelling, 2001.

[11] M. J. Booij, et al. Catchment Variability and Parameter Estimation in Multi-Objective Regionalisation of a Rainfall–Runoff Model, 2010.

[12] J. Ewen, et al. SHETRAN: Distributed River Basin Flow and Transport Modeling System, 2000.

[13] J. Seibert. Regionalisation of parameters for a conceptual rainfall-runoff model, 1999.

[14] J. Seibert, et al. Teaching hydrological modeling with a user-friendly catchment-runoff-model software package, 2012.

[15] J. G. Arnold, et al. Model Evaluation Guidelines for Systematic Quantification of Accuracy in Watershed Simulations, 2007.

[16] G. Lindström, et al. Development and test of the distributed HBV-96 hydrological model, 1997.

[17] H. Fowler, et al. Development of a system for automated setup of a physically-based, spatially-distributed hydrological model for catchments in Great Britain, 2018, Environmental Modelling & Software.

[18] R. Muñoz-Carpena, et al. Performance evaluation of hydrological models: Statistical significance for reducing subjectivity in goodness-of-fit assessments, 2013.

[19] L. Hay, et al. Hydrometeorological dataset for the contiguous USA, 2014.

[20] J. Seibert. Multi-criteria calibration of a conceptual runoff model using a genetic algorithm, 2000.

[21] J. Seibert, et al. Information content of stream level class data for hydrological model calibration, 2017.

[22] H. V. Gupta, et al. Do Nash values have value?, 2007.