Kernel bandwidth selection for a first order nonparametric streamflow simulation model

Abstract: A new approach for streamflow simulation using nonparametric methods was described in a recent publication (Sharma et al. 1997). Nonparametric methods avoid the need to select a probability distribution, and they can represent nonlinear features in the probability structure of hydrologic variables such as streamflow and precipitation, for example asymmetry and bimodality, that were previously difficult to represent. The nonparametric method used was kernel density estimation, which requires the selection of bandwidth (smoothing) parameters. This study documents some of the tests that were conducted to evaluate the performance of bandwidth estimation methods for kernel density estimation. Issues related to the selection of optimal smoothing parameters for kernel density estimation with small samples (200 or fewer data points) are examined. Both reference to a Gaussian density and data-based specifications are applied to estimate bandwidths for samples from bivariate normal mixture densities. The three data-based methods studied are Maximum Likelihood Cross Validation (MLCV), Least Squares Cross Validation (LSCV) and Biased Cross Validation (BCV2). Modifications for estimating optimal local bandwidths using MLCV and LSCV are also examined. We found that the use of local bandwidths does not necessarily improve the density estimate with small samples. Of the global bandwidth estimators compared, we found that MLCV and LSCV perform better because they show lower variability and higher accuracy, while Biased Cross Validation suffers from multiple optimal bandwidths for samples from strongly bimodal densities. These results are of particular interest in stochastic hydrology, where small samples are common, and may also be relevant to other applications of nonparametric density estimation with similar sample sizes and distribution shapes.
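As an illustration of the two kinds of bandwidth specification mentioned in the abstract, the following minimal sketch compares a Gaussian reference bandwidth with a global MLCV bandwidth on a small bivariate normal mixture sample. It is not the authors' procedure: the function names, the two-component mixture, the single scalar bandwidth applied to standardized data, the Gaussian kernel, and the grid search are all assumptions made for this example.

```python
import numpy as np


def gaussian_reference_bandwidth(n, d=2):
    """Gaussian reference rule for a Gaussian kernel on unit-variance data.

    Assumed form: h = (4 / (d + 2))**(1 / (d + 4)) * n**(-1 / (d + 4)).
    """
    return (4.0 / (d + 2)) ** (1.0 / (d + 4)) * n ** (-1.0 / (d + 4))


def mlcv_score(x, h):
    """Leave-one-out log-likelihood of a Gaussian kernel density estimate."""
    n, d = x.shape
    diff = x[:, None, :] - x[None, :, :]      # pairwise differences
    sq_dist = np.sum(diff ** 2, axis=-1)      # squared distances, shape (n, n)
    kern = np.exp(-0.5 * sq_dist / h ** 2) / (2.0 * np.pi * h ** 2) ** (d / 2.0)
    np.fill_diagonal(kern, 0.0)               # leave each point out of its own estimate
    loo_density = kern.sum(axis=1) / (n - 1)
    # Floor the density so that log() stays finite for very small h.
    return float(np.sum(np.log(np.maximum(loo_density, 1e-300))))


def mlcv_bandwidth(x, grid):
    """Return the grid value that maximizes the MLCV score, plus all scores."""
    scores = np.array([mlcv_score(x, h) for h in grid])
    return float(grid[int(np.argmax(scores))]), scores


def sample_bimodal(n, rng):
    """Hypothetical test density: equal mixture of two bivariate normals."""
    centers = np.array([[-1.5, -1.5], [1.5, 1.5]])
    labels = rng.integers(0, 2, size=n)
    return centers[labels] + rng.standard_normal((n, 2))


rng = np.random.default_rng(0)
x = sample_bimodal(200, rng)                  # small sample, as in the study
x = (x - x.mean(axis=0)) / x.std(axis=0)      # standardize each coordinate
grid = np.linspace(0.05, 1.5, 60)             # assumed search range for h

h_ref = gaussian_reference_bandwidth(len(x))
h_mlcv, scores = mlcv_bandwidth(x, grid)
print(f"Gaussian reference h = {h_ref:.3f}, MLCV h = {h_mlcv:.3f}")
```

A grid search is used here only because it is simple to follow. LSCV or BCV2 could be compared in the same way by replacing `mlcv_score` with the corresponding criterion.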

[1] Upmanu Lall, et al. A Nearest Neighbor Bootstrap For Resampling Hydrologic Time Series, 1996.

[2] A. Bowman. An alternative method of cross-validation for the smoothing of density estimates, 1984.

[3] Upmanu Lall, et al. Streamflow simulation: A nonparametric approach, 1997.

[4] Gwilym M. Jenkins, et al. Time series analysis, forecasting and control, 1972.

[5] M. Rudemo. Empirical Choice of Histograms and Kernel Density Estimators, 1982.

[6] B. Silverman. Density estimation for statistics and data analysis, 1986.

[7] M. Wand, et al. Multivariate plug-in bandwidth selection, 1994.

[8] Robert P. W. Duin, et al. On the Choice of Smoothing Parameters for Parzen Estimators of Probability Density Functions, 1976, IEEE Transactions on Computers.

[9] Ian Abramson. On Bandwidth Variation in Kernel Estimates-A Square Root Law, 1982.

[10] Shean-Tsong Chiu. Why bandwidth selectors tend to choose smaller bandwidths, and a remedy, 1990.

[11] David G. Tarboton, et al. Disaggregation procedures for stochastic hydrology based on nonparametric density estimation, 1998.

[12] Keinosuke Fukunaga, et al. Introduction to Statistical Pattern Recognition, 1972.

[13] D. W. Scott, et al. Multivariate Density Estimation, Theory, Practice and Visualization, 1992.

[14] D. W. Scott, et al. Cross-Validation of Multivariate Densities, 1994.

[15] George E. P. Box, et al. Time Series Analysis: Forecasting and Control, 1977.

[16] William H. Press, et al. Numerical recipes: the art of scientific computing: FORTRAN version, 1989.

[17] D. W. Scott, et al. Biased and Unbiased Cross-Validation in Density Estimation, 1987.

[18] Shean-Tsong Chiu, et al. Bandwidth selection for kernel density estimation, 1991.

[19] William H. Press, et al. Numerical recipes in C. The art of scientific computing, 1987.

[20] M. C. Jones, et al. Comparison of Smoothing Parameterizations in Bivariate Kernel Density Estimation, 1993.

[21] Upmanu Lall, et al. Recent advances in nonparametric function estimation: Hydrologic applications, 1995.

[22] Upmanu Lall, et al. Kernel flood frequency estimators: Bandwidth selection and kernel choice, 1993.

[23] Eugene F. Schuster, et al. Incorporating support constraints into nonparametric estimators of densities, 1985.