Identification of Nontoxic Substructures: A New Strategy to Avoid Potential Toxicity Risk

Avoiding structural alerts (SAs) may reduce the risk of failure in drug discovery. However, some marketed drugs still contain SAs, which indicates that SAs should be analyzed carefully to avoid their excessive use. Several detection systems, including automatic mining methods and expert systems, have been developed to identify SAs. These methods focus only on the toxic compounds that support an SA and do not consider the nontoxic ones. Here, we propose a frequency-based substructure detection protocol that learns from the nontoxic compounds containing SAs to derive nontoxic substructures (NTSs), whose presence reduces the probability that a compound is toxic. Kazius and Hansen's Ames mutagenicity dataset was used as an example to demonstrate the protocol. SARpy and ToxAlerts were first employed to obtain the potential SAs. Two kinds of NTSs were then derived: reverse effect substructures (RESs) and conjugate effect substructures (CESs). The contribution and prediction performance of the substructures were evaluated with neural network and rule-based methods. We also compared substructure-based methods with conventional machine learning-based methods. The results demonstrated that most substructures contributed as expected and that substructure-based methods were more resistant to overfitting. This work indicates that the protocol can effectively reduce the false positive rate in the prediction of chemical mutagenicity, and that it could possibly be extended to other endpoints.
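
The frequency-based idea can be illustrated with a minimal sketch. It is not the published protocol: the fragment labels, the function name `find_nontoxic_substructures`, and the thresholds `min_support` and `min_nontoxic_fraction` are all hypothetical, and each compound is assumed to be already reduced to a set of fragment labels. Among compounds that all contain a given SA, the sketch keeps the fragments that occur often enough and mostly in nontoxic compounds.

```python
from collections import Counter


def find_nontoxic_substructures(compounds, min_support=3, min_nontoxic_fraction=0.7):
    """Return {fragment: nontoxic fraction} for candidate NTSs.

    compounds: list of (fragments, is_toxic) pairs, where every compound is
    assumed to contain the same structural alert and `fragments` is a set of
    hypothetical fragment labels.
    """
    total = Counter()      # number of compounds containing each fragment
    nontoxic = Counter()   # number of those compounds that are nontoxic
    for fragments, is_toxic in compounds:
        for frag in fragments:
            total[frag] += 1
            if not is_toxic:
                nontoxic[frag] += 1

    candidates = {}
    for frag, count in total.items():
        fraction = nontoxic[frag] / count
        # Keep fragments that are both frequent and mostly seen in nontoxic compounds.
        if count >= min_support and fraction >= min_nontoxic_fraction:
            candidates[frag] = fraction
    return candidates


# Toy data (invented for illustration): SA-containing compounds as fragment sets.
toy_compounds = [
    ({"frag_A", "frag_B"}, False),
    ({"frag_A"}, False),
    ({"frag_A", "frag_C"}, False),
    ({"frag_A", "frag_C"}, True),
    ({"frag_B"}, True),
    ({"frag_B", "frag_C"}, True),
    ({"frag_C"}, True),
]

nts = find_nontoxic_substructures(toy_compounds)
print(nts)  # {'frag_A': 0.75}
```

In this toy set, `frag_A` appears in four SA-containing compounds, three of them nontoxic, so it is the only fragment that passes both thresholds. A real application would replace the fragment sets with substructure matches on actual molecules and would choose the thresholds from the data.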
