A ternary classification using machine learning methods of distinct estrogen receptor activities within a large collection of environmental chemicals.

Endocrine-disrupting chemicals (EDCs), which can threaten ecological safety and harm human health, have become a cause for wide concern. There is high demand for efficient methods of evaluating potential EDCs in the environment. Herein, an evaluation platform was developed using novel and statistically robust ternary models built with different machine learning methods (i.e., linear discriminant analysis, classification and regression tree, and support vector machines). The platform aims to classify chemicals effectively as having agonistic, antagonistic, or no estrogen receptor (ER) activity. A total of 440 chemicals from the literature were selected to derive and optimize the three-class model. One hundred and nine chemicals that newly appeared on the 2014 EPA list for EDC screening were used to assess predictive performance by comparing E-screen results with the predictions of the classification models. The best model was obtained using support vector machines (SVM); it recognized agonists and antagonists with accuracies of 76.6% and 75.0%, respectively, on the test set (with an overall predictive accuracy of 75.2%), and achieved a 10-fold cross-validation (CV) accuracy of 73.4%. The external prediction accuracy validated by the E-screen assay was 87.5%, which demonstrates the model's value as a virtual alert for EDCs with ER agonistic or antagonistic activity. The ternary computational model could thus serve as a faster and less expensive method to identify EDCs that act through nuclear receptors and to classify these chemicals into different mechanism groups.
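
The per-class and overall accuracies quoted above are different quantities, and a minimal sketch may help show how they relate in a three-class setting. The example below is an illustration only, not the authors' code: the class names, the helper functions `per_class_accuracy` and `overall_accuracy`, and the toy labels are all assumptions made here, and per-class accuracy is taken to mean the fraction of each true class that was predicted correctly.

```python
from collections import Counter

# Hypothetical class names for the ternary ER-activity problem.
CLASSES = ("agonist", "antagonist", "inactive")


def per_class_accuracy(y_true, y_pred, classes=CLASSES):
    """Fraction of each true class that was predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # Skip classes that never occur in y_true to avoid dividing by zero.
    return {c: hits[c] / totals[c] for c in classes if totals[c]}


def overall_accuracy(y_true, y_pred):
    """Fraction of all chemicals assigned to the correct class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels (invented for illustration, not data from the study).
y_true = ["agonist"] * 4 + ["antagonist"] * 4 + ["inactive"] * 2
y_pred = [
    "agonist", "agonist", "agonist", "inactive",
    "antagonist", "antagonist", "agonist", "antagonist",
    "inactive", "antagonist",
]

by_class = per_class_accuracy(y_true, y_pred)
overall = overall_accuracy(y_true, y_pred)
print(by_class)
print(overall)
```

Because the overall figure weights each class by its size, it need not equal the mean of the per-class figures when the classes are imbalanced.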
