New clues on carcinogenicity-related substructures derived from mining two large datasets of chemical compounds

ABSTRACT In this study, new molecular fragments associated with genotoxic and nongenotoxic carcinogens are introduced to estimate the carcinogenic potential of compounds. Two rule-based carcinogenesis models were developed with the aid of SARpy: model R (from rodent experimental data) and model E (from human carcinogenicity data). SARpy's structural alert extraction method is fully automated and unbiased, and selects alerts on the basis of statistical significance. The carcinogenicity models developed in this study are collections of potentially carcinogenic fragments extracted from two carcinogenicity databases: the ANTARES carcinogenicity dataset, with information from bioassays on rats, and the combination of the ISSCAN and CGX datasets, which takes human-based assessment into account. The performance of the two models was evaluated by cross-validation and by external validation on a 258-compound case study dataset. Combining the R and E predictions, and scoring a positive or negative result only when both models are concordant, increased accuracy to 72% and specificity to 79% on the external test set. The carcinogenic fragments present in the two models were compared and analyzed by chemical class. The results of this study show that the developed rule sets will be a useful tool for identifying new structural alerts of carcinogenicity and provide effective information on the molecular structures of carcinogenic chemicals.
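The concordance rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`consensus`, `accuracy_and_specificity`) and the toy labels are hypothetical, and the sketch assumes binary labels (1 = carcinogen, 0 = non-carcinogen), with compounds on which the two models disagree left unscored.

```python
from typing import List, Optional, Sequence, Tuple


def consensus(pred_r: int, pred_e: int) -> Optional[int]:
    """Return the shared label when both models agree, else None (no call)."""
    return pred_r if pred_r == pred_e else None


def accuracy_and_specificity(
    y_true: Sequence[int], y_pred: Sequence[Optional[int]]
) -> Tuple[float, float, int]:
    """Accuracy and specificity over the compounds that received a call.

    Assumption: compounds without a consensus call (None) are excluded
    from both statistics rather than counted as errors.
    """
    scored = [(t, p) for t, p in zip(y_true, y_pred) if p is not None]
    tp = sum(1 for t, p in scored if t == 1 and p == 1)
    tn = sum(1 for t, p in scored if t == 0 and p == 0)
    fp = sum(1 for t, p in scored if t == 0 and p == 1)
    accuracy = (tp + tn) / len(scored) if scored else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, specificity, len(scored)


# Hypothetical toy data, not taken from the study.
y_true = [1, 1, 0, 0, 1, 0]   # experimental labels
pred_r = [1, 0, 0, 1, 1, 0]   # calls from a rodent-based rule set
pred_e = [1, 1, 0, 0, 1, 1]   # calls from a human-based rule set

combined: List[Optional[int]] = [consensus(r, e) for r, e in zip(pred_r, pred_e)]
acc, spec, n_scored = accuracy_and_specificity(y_true, combined)
```

Requiring agreement trades coverage for reliability: fewer compounds receive a call, so any accuracy figure obtained this way is best reported together with the number of compounds actually scored.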
