Monte Carlo cross‐validation for selecting a model and estimating the prediction error in multivariate calibration

A new, simple and effective method named Monte Carlo cross-validation (MCCV) is introduced and evaluated for selecting a model and estimating the prediction ability of the selected model. Unlike the leave-one-out procedure widely used in chemometrics for cross-validation (CV), the Monte Carlo cross-validation developed in this paper is an asymptotically consistent method of model selection. It avoids unnecessarily large models and therefore reduces the risk of overfitting. The results of a simulation study showed that MCCV has a markedly higher probability than leave-one-out CV (LOO-CV) of selecting the model with the best prediction ability, and that a corrected MCCV (CMCCV) gives a more accurate estimate of prediction ability than either LOO-CV or MCCV. Results obtained with real data sets demonstrated that MCCV can select an appropriate model and that CMCCV can assess the prediction ability of the selected model with satisfactory accuracy. Copyright © 2004 John Wiley & Sons, Ltd.
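The splitting scheme behind MCCV can be sketched as follows: repeatedly draw a random train/validation partition (with a validation fraction that stays large as the sample grows, which is what yields asymptotic consistency, in contrast to LOO-CV's single held-out point) and pick the candidate model with the lowest average validation error. This is an illustrative sketch only, assuming a toy polynomial-degree selection problem rather than the paper's multivariate calibration setting; the function name `mccv_select`, the synthetic data, and all parameter values are hypothetical choices for demonstration.

```python
import numpy as np

def mccv_select(x, y, candidate_degrees, n_splits=50, val_frac=0.5, seed=0):
    """Monte Carlo cross-validation (sketch): for each of n_splits random
    train/validation partitions, fit every candidate polynomial degree on
    the training part and score it on the validation part; return the
    degree with the lowest mean validation MSE."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_val = int(round(val_frac * n))          # large validation fraction
    errors = {d: [] for d in candidate_degrees}
    for _ in range(n_splits):
        perm = rng.permutation(n)             # fresh random split each round
        val, train = perm[:n_val], perm[n_val:]
        for d in candidate_degrees:
            coeffs = np.polyfit(x[train], y[train], d)
            pred = np.polyval(coeffs, x[val])
            errors[d].append(float(np.mean((y[val] - pred) ** 2)))
    mean_err = {d: float(np.mean(e)) for d, e in errors.items()}
    best = min(mean_err, key=mean_err.get)
    return best, mean_err

# Hypothetical usage: data generated from a quadratic with small noise.
rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 60)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(0.0, 0.1, x.size)
best, mean_err = mccv_select(x, y, candidate_degrees=[1, 2, 3, 4, 5, 6])
```

Because half the data is held out in every split, overfitted high-degree models are penalized consistently, so the selected degree is typically the true one (here, 2) rather than an unnecessarily large model.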