Predicting aquatic toxicities of benzene derivatives in multiple test species using local, global and interspecies QSTR modeling approaches

Benzene derivatives (BDs) are widely used industrial chemicals, and their toxic effects are well documented. Because experimental toxicology is costly and time-intensive, computational methods for predicting toxicity are needed. In this study, we established local and global quantitative structure–toxicity relationship (L-QSTR and G-QSTR) models and interspecies correlation (ISC) based quantitative activity–activity relationship (QAAR) models for predicting the aquatic toxicities of BDs in single and multiple test species, in accordance with the Organisation for Economic Co-operation and Development (OECD) guidelines. The models used toxicity data for Tetrahymena pyriformis, Pimephales promelas, Poecilia reticulata, and Rana japonica. Decision tree boost (DTB) and support vector machine (SVM) models were constructed using molecular descriptors. The models were validated using several statistical coefficients derived for the test data, and their prediction and generalization abilities were evaluated. In the L-QSTR and G-QSTR models, molecular weight was the most influential descriptor. The L-QSTR (R² > 0.896), G-QSTR (R² > 0.846), and ISC QAAR (R² > 0.754) models yielded high correlations between the measured and model-predicted endpoint toxicity values in the test data. The results indicate that the DTB-based QSTR models outperform the SVM models. Furthermore, the chemical applicability domains of these models were determined using the leverage approach. The results suggest that the developed QSTR/QAAR models can reliably predict the aquatic toxicity of structurally diverse BDs and can be used for screening and prioritization of chemicals.
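The leverage approach to the applicability domain can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the synthetic descriptor matrix, and the warning threshold h* = 3(p + 1)/n (a convention common in the QSAR literature, with p descriptors and n training compounds) are all assumptions made for illustration. A compound whose leverage exceeds h* is treated as lying outside the domain spanned by the training descriptors.

```python
import numpy as np

def leverages(X_train, X_query):
    """Leverage h_i = x_i^T (X^T X)^{-1} x_i for each row of X_query."""
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    # Pseudo-inverse guards against collinear descriptors.
    xtx_inv = np.linalg.pinv(X_train.T @ X_train)
    # Row-wise quadratic form x_i^T A x_i.
    return np.einsum("ij,jk,ik->i", X_query, xtx_inv, X_query)

def warning_leverage(X_train):
    """Assumed threshold h* = 3(p + 1)/n; other conventions exist."""
    n, p = np.asarray(X_train).shape
    return 3.0 * (p + 1) / n

# Synthetic stand-in for a descriptor matrix (50 compounds, 3 descriptors).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 3))

# Two training compounds plus one deliberately extreme query compound.
X_query = np.vstack([X_train[:2], [[10.0, 10.0, 10.0]]])

h = leverages(X_train, X_query)
h_star = warning_leverage(X_train)
inside_domain = h <= h_star  # False marks a prediction to treat with caution
```

In practice the leverages are usually plotted against standardized residuals (a Williams plot), so that structural outliers and response outliers can be inspected together.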
