Evaluation of model predictive ability by external validation techniques

This paper addresses the problem of evaluating the predictive ability of regression models. In some cases, internal cross-validation alone is not enough, and validation on an external test set has been suggested as an effective way of evaluating a model's predictive ability. Several functions have been proposed for calculating the predictive squared correlation coefficient Q2 from an external set; they occasionally yield different estimates of predictive ability and can therefore lead to contrasting decisions about model adequacy. This paper compares these functions on simulated datasets and discusses their advantages and drawbacks in estimating model predictive ability. Copyright © 2010 John Wiley & Sons, Ltd.
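The abstract does not state the competing Q2 functions, so the following is only a minimal sketch under stated assumptions. It uses three forms commonly cited in the QSAR validation literature: one that references the training-set mean, one that references the external-set mean, and one that normalizes both sums of squares by their sample sizes. The function names (`q2_f1`, `q2_f2`, `q2_f3`), the helper names, and the toy data are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def press(y_obs, y_pred):
    """Predictive residual sum of squares over the external set."""
    return sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))


def sum_sq_about(values, center):
    """Sum of squared deviations of `values` from a chosen center."""
    return sum((v - center) ** 2 for v in values)


def q2_f1(y_train, y_ext, y_ext_pred):
    # Assumed form: external responses are referenced to the TRAINING-set mean.
    return 1.0 - press(y_ext, y_ext_pred) / sum_sq_about(y_ext, fmean(y_train))


def q2_f2(y_ext, y_ext_pred):
    # Assumed form: external responses are referenced to the EXTERNAL-set mean.
    return 1.0 - press(y_ext, y_ext_pred) / sum_sq_about(y_ext, fmean(y_ext))


def q2_f3(y_train, y_ext, y_ext_pred):
    # Assumed form: mean squared prediction error on the external set,
    # divided by the mean squared deviation of the training responses.
    mse_ext = press(y_ext, y_ext_pred) / len(y_ext)
    var_train = sum_sq_about(y_train, fmean(y_train)) / len(y_train)
    return 1.0 - mse_ext / var_train


if __name__ == "__main__":
    # Toy data (hypothetical): the external set lies above the training mean.
    y_train = [1.0, 2.0, 3.0, 4.0, 5.0]
    y_ext = [4.0, 5.0, 6.0]
    y_ext_pred = [4.5, 5.5, 5.0]

    print("Q2_F1:", q2_f1(y_train, y_ext, y_ext_pred))
    print("Q2_F2:", q2_f2(y_ext, y_ext_pred))
    print("Q2_F3:", q2_f3(y_train, y_ext, y_ext_pred))
```

On this toy data the same predictions score roughly 0.89, 0.25, and 0.75 under the three assumed forms, which illustrates how the choice of function alone could change a decision about model adequacy when the external set is not centered on the training mean.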
