Paper information - Ideal bootstrap estimation of expected prediction error for k-nearest neighbor classifiers: Applications for classification and error assessment

Euclidean distance k-nearest neighbor (k-NN) classifiers are simple nonparametric classification rules. Bootstrap methods, widely used for estimating the expected prediction error of classification rules, are motivated by the objective of calculating the ideal bootstrap estimate of expected prediction error. In practice, bootstrap methods use Monte Carlo resampling to estimate the ideal bootstrap estimate because exact calculation is generally intractable. In this article, we present analytical formulae for exact calculation of the ideal bootstrap estimate of expected prediction error for k-NN classifiers and propose a new weighted k-NN classifier based on resampling ideas. The resampling-weighted k-NN classifier replaces the k-NN posterior probability estimates by their expectations under resampling and predicts an unclassified covariate as belonging to the group with the largest resampling expectation. A simulation study and an application involving remotely sensed data show that the resampling-weighted k-NN classifier compares favorably to unweighted and distance-weighted k-NN classifiers.
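The resampling-weighted rule described in the abstract can be illustrated with a small Monte Carlo sketch. It is not the paper's analytical formula, which the abstract does not reproduce. Instead it approximates the resampling expectation of the k-NN posterior estimates by averaging over bootstrap resamples. The function names, the `n_boot` parameter, and the toy data are assumptions made for illustration only.

```python
import numpy as np

def knn_posterior(X, y, x0, k, n_classes):
    """Fraction of the k nearest training points (Euclidean) in each class."""
    d = np.linalg.norm(X - x0, axis=1)
    nearest = np.argsort(d, kind="stable")[:k]
    return np.bincount(y[nearest], minlength=n_classes) / k

def resampling_weighted_posterior(X, y, x0, k, n_classes, n_boot=500, seed=0):
    """Monte Carlo average of k-NN posteriors over bootstrap resamples.

    Assumption: this approximates the expectation under resampling;
    the paper computes that expectation exactly rather than by simulation.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    total = np.zeros(n_classes)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # draw n indices with replacement
        total += knn_posterior(X[idx], y[idx], x0, k, n_classes)
    return total / n_boot

# Hypothetical toy data: two well-separated groups on a line.
X = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
y = np.array([0, 0, 0, 1, 1, 1])
x0 = np.array([0.5])

plain = knn_posterior(X, y, x0, k=3, n_classes=2)
weighted = resampling_weighted_posterior(X, y, x0, k=3, n_classes=2)
predicted_group = int(np.argmax(weighted))  # group with the largest expectation
```

On this toy data the unweighted 3-NN estimate puts all mass on group 0, while the resampled average spreads some mass to group 1 because some resamples contain fewer than three points from group 0. Both rules still predict group 0 here.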

David A. Patterson | Brian M. Steele
