Predicting toxic action mechanisms of phenols using AdaBoost Learner

Abstract: AdaBoost Learner is used to investigate structure–activity relationships (SARs) of phenols based on molecular descriptors. Its performance is compared with that of support vector machines (SVMs), artificial neural networks (ANNs), and k-nearest neighbors (KNN), which are the algorithms most commonly used for SAR analysis. AdaBoost Learner performed better than SVM, ANN, and KNN in predicting the mechanism of toxicity of phenols from molecular descriptors. These results suggest that AdaBoost has the potential to improve the performance of SAR analysis, and we believe it will play an important role, complementary to the existing algorithms, in predicting mechanisms of toxicity from SARs. We have developed an online web server for predicting the ecotoxicity mechanisms of phenols, accessible at http://chemdata.shu.edu.cn/ecotoxity/ .
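To make the boosting idea concrete, the following is a minimal sketch of discrete AdaBoost with single-feature threshold rules (decision stumps) as weak learners. It is an illustration under stated assumptions, not the AdaBoost Learner configuration used in this work: the base learner, the two-class simplification (labels +1 and -1), the function names, and the toy descriptor values are all hypothetical and chosen only for the example.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best stump."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for t in [(a + b) / 2 for a, b in zip(values, values[1:])]:
            for sign in (1, -1):
                preds = [sign if row[j] > t else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best

def stump_predict(stump, row):
    _, j, t, sign = stump
    return sign if row[j] > t else -sign

def adaboost_fit(X, y, rounds=10):
    """Fit a weighted ensemble of stumps; y must contain only +1 and -1."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = max(stump[0], 1e-10)         # clamp to avoid division by zero
        if err >= 0.5:                     # no better than chance: stop
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical toy data: two made-up descriptor values per compound.
# These are NOT measured descriptors and NOT the paper's data set.
X = [[1.2, 9.9], [1.8, 9.5], [2.4, 10.1], [3.3, 9.8],
     [2.9, 7.2], [3.6, 6.9], [4.1, 7.5], [4.6, 8.0], [4.4, 9.7]]
y = [1, 1, 1, 1, -1, -1, -1, -1, -1]       # +1 / -1: two illustrative classes

ensemble = adaboost_fit(X, y, rounds=10)
predictions = [adaboost_predict(ensemble, row) for row in X]
accuracy = sum(p == yi for p, yi in zip(predictions, y)) / len(y)
print(f"stumps: {len(ensemble)}, training accuracy: {accuracy:.2f}")
```

Each round refits a stump on reweighted samples, so compounds misclassified earlier receive more attention later; the final prediction is a vote weighted by each stump's `alpha`. A real comparison with SVM, ANN, and KNN would additionally need a multi-class scheme and held-out validation, which this sketch omits.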
