An evaluation of automated GPD threshold selection methods for hydrological extremes across different scales

This study investigated core components of an extreme value methodology for estimating high-flow frequencies from agricultural surface water run-off. The Generalized Pareto distribution (GPD) was used to model the excesses obtained by applying the ‘Peaks Over Threshold’ (POT) method to time-series data. First, the performance of eight GPD parameter estimators was evaluated through a Monte Carlo experiment. Second, building on the estimator comparison, two existing automated GPD threshold selection methods were evaluated against a proposed approach that automates the use of threshold stability plots. For this second experiment, the methods were applied to discharge measured at a highly instrumented agricultural research facility in the UK. Averaging the fine-resolution 15-minute data to hourly, 6-hourly and daily scales also allowed us to determine the effect of measurement scale on threshold selection and on the performance of each method. The results demonstrate the advantages of the proposed threshold selection method over the two commonly applied methods, and provide useful insights into how the choice of measurement scale affects threshold selection. The results can be generalised to similar water monitoring schemes and are important for improved characterisation of flood events and for the design of associated disaster management protocols.
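As a rough illustration of the workflow summarised above, the following Python sketch averages a series to a coarser scale, extracts excesses over a fixed threshold, and fits a GPD by the method of moments. It is a minimal sketch under stated assumptions, not the study's implementation: the function names are hypothetical, the threshold is supplied by hand rather than selected automatically, no declustering of exceedances is performed, and the method of moments is used only because it has a closed form (it is not claimed to be the estimator the study favours).

```python
import statistics


def block_mean(series, width):
    """Average consecutive, non-overlapping blocks of `width` samples.

    For example, width=4 turns 15-minute values into hourly means.
    Trailing samples that do not fill a whole block are dropped.
    """
    n_blocks = len(series) // width
    return [
        statistics.fmean(series[i * width:(i + 1) * width])
        for i in range(n_blocks)
    ]


def excesses_over(series, threshold):
    """Return the amounts by which values exceed `threshold` (the POT excesses).

    Assumption: every exceedance is kept; a real analysis would usually
    decluster so that each flood event contributes one peak.
    """
    return [x - threshold for x in series if x > threshold]


def gpd_fit_moments(excesses):
    """Method-of-moments GPD fit, returning (shape, scale).

    Uses the GPD mean scale/(1 - shape) and variance
    scale^2 / ((1 - shape)^2 * (1 - 2*shape)), which hold for shape < 1/2.
    """
    m = statistics.fmean(excesses)
    v = statistics.variance(excesses)
    ratio = m * m / v            # equals 1 - 2*shape for a GPD
    shape = 0.5 * (1.0 - ratio)
    scale = 0.5 * m * (1.0 + ratio)
    return shape, scale
```

Repeating the fit over a grid of candidate thresholds, and at each averaging scale, gives the shape and scale estimates whose stability a threshold selection method would then assess.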
