In silico prediction of chemical toxicity on avian species using chemical category approaches.

Avian species are sensitive to pesticides and industrial chemicals, and are hence used as model species in the evaluation of chemical toxicity. In the present study, we assessed the toxicity of more than 663 diverse chemicals on 17 avian species. All the chemicals were classified into three categories (highly toxic, slightly toxic and non-toxic) based on the toxicity classification criteria of the United States Environmental Protection Agency (EPA). To evaluate these chemicals, toxicity prediction models were built using chemical category approaches with molecular descriptors and five commonly used fingerprints. Five machine learning methods were applied to two standard test species: the aquatic mallard duck and the terrestrial northern bobwhite quail. The support vector machine (SVM) method with the PubChem fingerprint performed best, as shown by 5-fold cross-validation and by an external validation set on Japanese quail. No species difference was found in our database, although several chemicals showed different toxicity on some avian species. The best model had an overall accuracy of 0.851 for the prediction of toxicity on avian species, which outperformed the work of Mazzatorta et al. Furthermore, several representative substructures for characterizing avian toxicity were identified via the information gain (IG) method. This study provides a new tool for chemical safety assessment.
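
The abstract does not spell out how information gain was computed, so the following is only a minimal sketch of the standard definition: the entropy of the toxicity classes minus the entropy remaining after splitting the chemicals on the presence or absence of one substructure bit. The function names (`entropy`, `information_gain`) and the toy data are hypothetical illustrations, not the authors' implementation or dataset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(bit_present, labels):
    """Information gain of one binary substructure bit w.r.t. the class labels.

    bit_present: 0/1 flags, one per chemical (1 = substructure present).
    labels: toxicity class of each chemical, in the same order.
    """
    with_bit = [y for b, y in zip(bit_present, labels) if b]
    without_bit = [y for b, y in zip(bit_present, labels) if not b]
    total = len(labels)
    # Weighted entropy of the two subsets created by the split.
    conditional = (
        (len(with_bit) / total) * entropy(with_bit)
        + (len(without_bit) / total) * entropy(without_bit)
    )
    return entropy(labels) - conditional


# Made-up toy data: six chemicals in the three classes used in the abstract.
labels = ["high", "high", "slight", "slight", "non", "non"]
bit_a = [1, 1, 0, 0, 0, 0]  # present only in the highly toxic chemicals
bit_b = [1, 0, 1, 0, 1, 0]  # spread evenly over all classes

print(f"IG(bit_a) = {information_gain(bit_a, labels):.3f}")
print(f"IG(bit_b) = {information_gain(bit_b, labels):.3f}")
```

In this toy case the bit that occurs only in highly toxic chemicals receives a clearly positive score, while the bit spread evenly across classes scores zero, which is the sense in which high-IG substructures can be called representative of a toxicity class.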
