A Stepwise Approach for Defining the Applicability Domain of SAR and QSAR Models

A stepwise approach for determining the applicability domain of a model is proposed. Four stages are applied to account for the diversity and complexity of current SAR/QSAR models, reflecting their mechanistic rationality (including metabolic activation of chemicals) and transparency. The first stage imposes general parametric requirements, admitting to the domain only those chemicals whose physicochemical properties fall within the range of variation observed for the chemicals in the training set. The second stage defines the structural similarity between chemicals that are correctly predicted by the model; the structural neighborhood of atom-centered fragments is used to determine this similarity. The third stage is based on a mechanistic understanding of the modeled phenomenon. Here, the model domain combines the reliability of the specific reactive groups hypothesized to cause the effect with the domain of the explanatory variables that set the parametric requirements for those functional groups to express their reactivity. Finally, if metabolic activation of chemicals is part of the (Q)SAR model, the reliability of the simulated metabolism (metabolites, pathways, and maps) is taken into account in assessing the reliability of predictions. Some stages of the proposed approach can be omitted, depending on the availability and quality of the experimental data used to derive the model, the specificity of the (Q)SARs, and the goals of their ultimate application. The proposed definition of the model domain is tested on several externally validated (Q)SARs, including models for predicting acute toxicity, skin sensitization, and biodegradation. The results show that predictions of QSAR models are considerably more reliable for chemicals belonging to the model domain than for chemicals outside it.
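The first stage, the parametric range check, can be illustrated with a minimal sketch. It assumes each chemical is described by a dictionary of physicochemical properties and that the domain is the simple per-property minimum-to-maximum interval of the training set; the property names (`log_kow`, `mol_weight`), the values, and the function names are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Descriptors = Dict[str, float]


def descriptor_ranges(training_set: List[Descriptors]) -> Dict[str, Tuple[float, float]]:
    """Return the (min, max) interval of every property over the training set."""
    names = training_set[0].keys()
    return {
        name: (
            min(chem[name] for chem in training_set),
            max(chem[name] for chem in training_set),
        )
        for name in names
    }


def in_parametric_domain(chemical: Descriptors,
                         ranges: Dict[str, Tuple[float, float]]) -> bool:
    """True only if every property lies inside the training-set interval."""
    return all(low <= chemical[name] <= high
               for name, (low, high) in ranges.items())


# Hypothetical training set (illustrative values only).
training = [
    {"log_kow": 1.2, "mol_weight": 120.0},
    {"log_kow": 3.5, "mol_weight": 250.0},
    {"log_kow": 4.1, "mol_weight": 180.0},
]
ranges = descriptor_ranges(training)

inside = in_parametric_domain({"log_kow": 2.8, "mol_weight": 200.0}, ranges)
outside = in_parametric_domain({"log_kow": 6.3, "mol_weight": 200.0}, ranges)
```

In this sketch a chemical passes only if all of its properties are within range; the later stages (structural similarity, mechanistic domain, metabolism reliability) would be applied as further filters on the chemicals that pass.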
