Binary classification models for endocrine disrupter effects mediated through the estrogen receptor

Endocrine disrupters (EDs) have attracted great attention in recent years. They are exogenous substances that interfere with the function of the endocrine system, including developmental processes. In particular, REACH, the new European legislation on chemicals, lists EDs among the groups of chemicals of particular concern that require more detailed control and specific authorization. QSAR is one approach to filling the data gaps foreseen by REACH. The aim of this study was to provide insight into the use of QSAR models to address ED effects mediated through the estrogen receptor (ER). New predictive models were derived to assess estrogenicity for a very large and heterogeneous dataset of chemical compounds. QSAR binary classifiers were developed using different data mining techniques: classification trees, decision forest, fuzzy logic, neural networks and support vector machines. Multiple endpoints were considered, to better characterize the effects of EDs by evaluating both binding (RBA) and transcriptional activity (RA). A possible combination of the models was also explored. Accuracy was higher than 80% for both the RA and the RBA models.

†Presented at the 13th International Workshop on QSARs in the Environmental Sciences (QSAR 2008), 8–12 June 2008, Syracuse, USA.
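
As an illustration only, the following minimal sketch shows one way that several binary classifiers could be combined and scored. It assumes a simple majority vote, which the abstract does not state is the combination scheme used in the study. The function names, the model names and the toy labels are all hypothetical and are not data from the paper.

```python
from typing import Dict, List, Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-model binary predictions (1 = active, 0 = inactive).

    A compound is called active only if more than half of the models
    predict it active, so ties (possible with an even number of models)
    fall to inactive. This rule is an assumption made for illustration.
    """
    n_models = len(predictions)
    return [1 if 2 * sum(votes) > n_models else 0 for votes in zip(*predictions)]


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, sensitivity and specificity for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else float("nan"),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical toy labels: 1 = estrogenic (active), 0 = inactive.
y_true = [1, 1, 0, 0, 1, 0]
tree_pred = [1, 0, 0, 0, 1, 1]  # stand-in for a classification tree
svm_pred = [1, 1, 0, 1, 1, 0]   # stand-in for a support vector machine
nn_pred = [0, 1, 0, 0, 1, 0]    # stand-in for a neural network

consensus = majority_vote([tree_pred, svm_pred, nn_pred])
metrics = binary_metrics(y_true, consensus)
```

On this toy data each individual model misclassifies at least one compound, while the consensus does not. That outcome is a property of the invented labels and says nothing about the models reported here.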
