Cross-Validation of Multivariate Densities

Abstract. In recent years, the focus of study in smoothing parameter selection for kernel density estimation has been on the univariate case, while multivariate kernel density estimation has been largely neglected. In part, this may be due to the perception that calibrating multivariate densities is substantially more difficult. In this article, we explicitly derive and compare multivariate versions of the bootstrap method of Taylor, the least-squares cross-validation method developed by Bowman and Rudemo, and a biased cross-validation method similar to that of Scott and Terrell for multivariate kernel estimation using the product kernel estimator. The theoretical behavior of these cross-validation algorithms is shown to improve (surprisingly) as the dimension increases, approaching the best rate of O(n^{-1/2}). Simulation studies suggest that the new biased cross-validation method performs quite well and with reasonable variability as compared to the other two methods. Bivariate examples with heart disease ...
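As an illustration of the least-squares cross-validation idea named in the abstract, the following is a minimal sketch for a product kernel estimator. It is not the article's own derivation: it assumes a Gaussian kernel in every coordinate, uses the standard criterion (integral of the squared estimate minus twice the average leave-one-out estimate), and the names `gauss_product` and `lscv` are hypothetical.

```python
import numpy as np


def gauss_product(diffs, scales):
    """Product over coordinates of N(0, scales[k]^2) densities.

    diffs has shape (n, n, d): all pairwise differences X_i - X_j.
    Returns an (n, n) array of product-kernel values.
    """
    d = diffs.shape[-1]
    z = diffs / scales
    log_k = (-0.5 * np.sum(z ** 2, axis=-1)
             - np.sum(np.log(scales))
             - 0.5 * d * np.log(2.0 * np.pi))
    return np.exp(log_k)


def lscv(data, h):
    """Least-squares cross-validation score for bandwidth vector h.

    Assumes a Gaussian product kernel (an assumption of this sketch).
    Smaller scores indicate a better bandwidth under this criterion.
    """
    data = np.asarray(data, dtype=float)
    h = np.asarray(h, dtype=float)
    n, _ = data.shape
    diffs = data[:, None, :] - data[None, :, :]

    # Integral of the squared estimate: convolving two Gaussian kernels
    # with scale h gives a Gaussian with scale sqrt(2) * h.
    integral_sq = gauss_product(diffs, np.sqrt(2.0) * h).sum() / n ** 2

    # Average leave-one-out density at the data points (i != j terms only).
    k = gauss_product(diffs, h)
    leave_one_out = (k.sum() - np.trace(k)) / (n * (n - 1))

    return integral_sq - 2.0 * leave_one_out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.standard_normal((200, 2))  # synthetic bivariate data
    # Coarse grid over a common multiplier of the per-coordinate std devs.
    sds = sample.std(axis=0, ddof=1)
    grid = np.linspace(0.1, 1.5, 15)
    scores = [lscv(sample, c * sds) for c in grid]
    print("selected multiplier:", grid[int(np.argmin(scores))])
```

The grid search over a single multiplier is only one convenient choice; each coordinate's bandwidth could instead be searched separately, at higher cost. The pairwise-difference array needs O(n^2 d) memory, so this sketch suits small samples only.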

[1]  D. W. Scott, et al. Plasma lipids as collateral risk factors in coronary artery disease: a study of 371 males with chest pain, 1978, Journal of Chronic Diseases.

[2]  M. Rudemo. Empirical Choice of Histograms and Kernel Density Estimators, 1982.

[3]  A. Bowman. An alternative method of cross-validation for the smoothing of density estimates, 1984.

[4]  P. Hall. Central limit theorem for integrated square error of multivariate nonparametric density estimators, 1984.

[5]  C. J. Stone, et al. An Asymptotically Optimal Window Selection Rule for Kernel Density Estimates, 1984.

[6]  D. W. Scott, et al. Oversmoothed Nonparametric Density Estimates, 1985.

[7]  J. Marron, et al. Extent to which least-squares cross-validation minimises integrated square error in nonparametric density estimation, 1987.

[8]  D. W. Scott, et al. Biased and Unbiased Cross-Validation in Density Estimation, 1987.

[9]  James Stephen Marron, et al. Estimation of integrated squared density derivatives, 1987.

[10]  Charles C. Taylor, et al. Bootstrap choice of the smoothing parameter in kernel density estimation, 1989.

[11]  Bruce J. Worton, et al. Optimal smoothing parameters for multivariate fixed and adaptive kernel methods, 1989.

[12]  G. Terrell. The Maximal Smoothing Principle in Density Estimation, 1990.

[13]  Peter Hall, et al. Using the bootstrap to estimate mean squared error and select smoothing parameter in nonparametric problems, 1990.

[14]  J. Faraway, et al. Bootstrap choice of bandwidth for density estimation, 1990.

[15]  M. C. Jones, et al. A reliable data-based bandwidth selection method for kernel density estimation, 1991.

[16]  M. C. Jones, et al. On optimal data-based bandwidth selection in kernel density estimation, 1991.

[17]  M. C. Jones, et al. On a class of kernel density estimate bandwidth selectors, 1991.

[18]  David W. Scott, et al. Multivariate Density Estimation: Theory, Practice, and Visualization, 1992, Wiley Series in Probability and Statistics.

[19]  J. Marron, et al. Smoothed cross-validation, 1992.

[20]  D. W. Scott, et al. Multivariate Density Estimation, Theory, Practice and Visualization, 1992.

[21]  M. Wand, et al. Exact Mean Integrated Squared Error, 1992.

[22]  M. Wand, et al. Multivariate plug-in bandwidth selection, 1994.