The impact of variable selection on the modelling of oestrogenicity

Many oestrogenic chemicals exert their activity via specific interactions with the oestrogen receptor (ER). The objective of the present study was to identify significant descriptors associated with the ER binding affinities of a large and diverse set of compounds in order to derive quantitative structure–activity relationships (QSARs). To this end, a variety of statistical methods were employed for variable selection. These included stepwise regression and partial least squares (PLS) analyses, as well as a non-linear recursive partitioning method (Formal Inference-based Recursive Modelling). A total of 157 molecular descriptors, including quantum mechanical and graph theoretical descriptors, indicator variables and log P, were used in the study. Furthermore, cluster analysis of the variables was performed to identify groups of descriptors representing similar molecular features. Hierarchical PLS analyses were then performed, in which the scores of the significant components of either PLS or principal component analysis (PCA), performed separately on each cluster, were used as the variables for the top model. This reduced the number of variables representing the larger clusters, leading to a similar number of descriptors for each distinct molecular feature. The results showed that the most important molecular properties for stronger ER binding affinity are molecular size and shape, the presence of a phenol moiety as well as other aromatic groups, hydrophobicity and the presence of double bonds. The best PLS model obtained, in terms of predictive ability, was a hierarchical PLS model. However, a rigorous validation study showed that the multiple linear regression (MLR) model using descriptors selected by stepwise regression had greater predictive power than the PLS models.
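The hierarchical step described above can be illustrated with a minimal sketch. It is not the study's implementation: the data are synthetic, the cluster assignments are hypothetical, PCA (rather than PLS) is used at the block level, and the function names `block_scores`, `pls1_fit` and `hierarchical_pls` are invented for illustration. It only shows how a large cluster and a small cluster each end up contributing the same number of variables to the top model.

```python
import numpy as np

def block_scores(X, n_components):
    """PCA scores of one descriptor cluster (columns autoscaled first)."""
    Xc = X - X.mean(axis=0)
    sd = Xc.std(axis=0, ddof=1)
    sd[sd == 0] = 1.0                      # guard against constant descriptors
    U, S, _ = np.linalg.svd(Xc / sd, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def pls1_fit(X, y, n_components):
    """Single-response PLS by NIPALS; returns coefficients and the means."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)             # weight vector
        t = Xr @ w                         # score vector
        tt = t @ t
        p = Xr.T @ t / tt                  # X loading
        c = yr @ t / tt                    # y loading
        Xr = Xr - np.outer(t, p)           # deflate X
        yr = yr - c * t                    # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)
    return beta, x_mean, y_mean

def hierarchical_pls(X, y, clusters, scores_per_cluster, top_components):
    """Top-level PLS on the block scores of each descriptor cluster."""
    T = np.hstack([block_scores(X[:, cols], scores_per_cluster)
                   for cols in clusters])
    beta, t_mean, y_mean = pls1_fit(T, y, top_components)
    y_fit = (T - t_mean) @ beta + y_mean
    return T, beta, y_fit

# Synthetic illustration (assumed data, not the study's compounds):
# six descriptors track a "size" factor, two track a "hydrophobicity" factor.
rng = np.random.default_rng(0)
n = 60
size = rng.normal(size=n)
hydro = rng.normal(size=n)
X = np.hstack([
    size[:, None] + 0.1 * rng.normal(size=(n, 6)),
    hydro[:, None] + 0.1 * rng.normal(size=(n, 2)),
])
y = size + hydro + 0.1 * rng.normal(size=n)

clusters = [[0, 1, 2, 3, 4, 5], [6, 7]]    # hypothetical cluster membership
T, beta, y_fit = hierarchical_pls(X, y, clusters,
                                  scores_per_cluster=1, top_components=2)
r2 = 1.0 - np.sum((y - y_fit) ** 2) / np.sum((y - y.mean()) ** 2)
print(T.shape, round(r2, 3))
```

In this sketch the six-descriptor cluster and the two-descriptor cluster each supply one score to the top model, so neither molecular feature dominates simply because it is described by more variables. The fitted R² here is a goodness-of-fit figure only; assessing predictive ability would need cross-validation or an external test set.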
