Model selection techniques for the frequency analysis of hydrological extremes

[1] The frequency analysis of hydrological extremes requires fitting a probability distribution to the observed data to suitably represent the frequency of occurrence of rare events. The choice of the model to be used for statistical inference is often based on subjective criteria, or it is treated as a matter of probabilistic hypothesis testing. In contrast, specific tools for model selection, like the well-known Akaike information criterion (AIC) and the Bayesian information criterion (BIC), are seldom used in hydrological applications. The objective of this study is to verify whether the AIC and BIC work correctly when they are applied to identifying the probability distribution of hydrological extremes, i.e., when the available samples are small and the parent distribution is highly asymmetric. An additional model selection criterion, based on the Anderson-Darling goodness-of-fit test statistic, is proposed here, and the performances of the three methods are compared through an extensive numerical analysis. The capability of the three criteria to recognize the correct parent distribution from the available data samples varies from case to case: it is rather good in some cases (in particular when the parent is a two-parameter distribution) and unsatisfactory in others. An application to flood peak time series from 1000 catchments located in the United Kingdom provides some further information on the qualities and drawbacks of the considered criteria. From the numerical simulations and data-based analyses it can be concluded that the three model selection techniques considered here produce results of comparable quality.
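
To make the two information criteria concrete, the following is a minimal sketch of how AIC and BIC could be used to rank candidate distributions fitted by maximum likelihood. It is not the authors' implementation: the candidate set (exponential, normal, lognormal), the function names, and the synthetic sample are all assumptions chosen because these three distributions have closed-form maximum likelihood estimates. The Anderson-Darling-based criterion proposed in the study is not shown. The sketch uses the standard definitions AIC = -2 ln L + 2p and BIC = -2 ln L + p ln n, where L is the maximized likelihood, p the number of parameters, and n the sample size.

```python
import math
import random


def exponential_loglik(x):
    """Maximized log-likelihood of an exponential model (rate = 1 / mean)."""
    n = len(x)
    mean = sum(x) / n
    return -n * math.log(mean) - n


def normal_loglik(x):
    """Maximized log-likelihood of a normal model (ML variance divides by n)."""
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


def lognormal_loglik(x):
    """Maximized log-likelihood of a lognormal model (requires x > 0)."""
    logs = [math.log(v) for v in x]
    # Normal log-likelihood of ln(x) plus the Jacobian term -sum(ln x).
    return normal_loglik(logs) - sum(logs)


def aic(loglik, p):
    return -2.0 * loglik + 2.0 * p


def bic(loglik, p, n):
    return -2.0 * loglik + p * math.log(n)


def score_candidates(x):
    """Return {model name: {"AIC": ..., "BIC": ...}} for a positive sample x."""
    n = len(x)
    # (maximized log-likelihood, number of fitted parameters) per candidate
    candidates = {
        "exponential": (exponential_loglik(x), 1),
        "normal": (normal_loglik(x), 2),
        "lognormal": (lognormal_loglik(x), 2),
    }
    return {
        name: {"AIC": aic(ll, p), "BIC": bic(ll, p, n)}
        for name, (ll, p) in candidates.items()
    }


if __name__ == "__main__":
    random.seed(1)
    # Hypothetical small, skewed sample standing in for annual flood peaks.
    sample = [random.lognormvariate(0.0, 0.5) for _ in range(30)]
    scores = score_candidates(sample)
    for name, s in sorted(scores.items(), key=lambda kv: kv[1]["AIC"]):
        print(f"{name:12s} AIC={s['AIC']:.2f} BIC={s['BIC']:.2f}")
```

The model with the lowest score is the one selected. When all candidates have the same number of parameters, AIC and BIC give the same ranking; they can differ only when the candidates differ in p, because the BIC penalty grows with ln n.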
