Novel naïve Bayes classification models for predicting the carcinogenicity of chemicals.

Carcinogenicity prediction has become a significant issue for the pharmaceutical industry. The purpose of this investigation was to develop a novel model for predicting the carcinogenicity of chemicals using a naïve Bayes classifier. The established model was validated by internal 5-fold cross-validation and an external test set. The naïve Bayes classifier gave an average overall prediction accuracy of 90 ± 0.8% for the training set and 68 ± 1.9% for the external test set. Moreover, five simple molecular descriptors considered important for the carcinogenicity of chemicals (AlogP, molecular weight (MW), number of H donors, Apol, and Wiener) were identified, along with some substructures related to carcinogenicity. We therefore hope that the established naïve Bayes prediction model can be applied to filter early-stage molecules for potential carcinogenicity, and that the five identified molecular descriptors and the carcinogen substructures will give a better understanding of the carcinogenicity of chemicals and provide guidance for medicinal chemists in the design of new candidate drugs and in lead optimization, ultimately reducing the attrition rate in later stages of drug development.
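The abstract does not describe the implementation, so the following is only a minimal sketch of the general technique it names: a naïve Bayes classifier evaluated by 5-fold cross-validation. It assumes a Bernoulli variant with Laplace smoothing over binary substructure bits. The function names and the toy dataset are hypothetical and are not taken from the study.

```python
import math
import random

def train_bernoulli_nb(X, y):
    """Fit a Bernoulli naive Bayes model with Laplace smoothing.
    X: list of binary feature vectors; y: list of 0/1 labels."""
    n_features = len(X[0])
    model = {}
    for c in (0, 1):
        rows = [x for x, label in zip(X, y) if label == c]
        # Smoothed class prior and smoothed P(bit = 1 | class) per feature.
        prior = (len(rows) + 1) / (len(X) + 2)
        probs = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                 for j in range(n_features)]
        model[c] = (math.log(prior), probs)
    return model

def predict(model, x):
    """Return the class with the highest log posterior score."""
    scores = {}
    for c, (log_prior, probs) in model.items():
        s = log_prior
        for bit, p in zip(x, probs):
            s += math.log(p if bit else 1.0 - p)
        scores[c] = s
    return max(scores, key=scores.get)

def accuracy(model, X, y):
    hits = sum(predict(model, x) == label for x, label in zip(X, y))
    return hits / len(y)

def cross_validate(X, y, k=5, seed=0):
    """Shuffle the indices, split them into k folds, and score each
    held-out fold with a model trained on the remaining folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = train_bernoulli_nb([X[i] for i in train],
                                   [y[i] for i in train])
        scores.append(accuracy(model,
                               [X[i] for i in held_out],
                               [y[i] for i in held_out]))
    return scores

# Hypothetical toy data: bit 0 mimics a structural alert that tracks the
# label exactly; bits 1 and 2 are uninformative. Not real chemistry.
toy_y = [i % 2 for i in range(40)]
toy_X = [[i % 2, (i // 2) % 2, (i // 4) % 2] for i in range(40)]

fold_scores = cross_validate(toy_X, toy_y, k=5)
print("per-fold accuracy:", fold_scores)
print("mean accuracy:", sum(fold_scores) / len(fold_scores))
```

An external test set would be scored the same way, by calling `accuracy` on compounds that were never used for training. Continuous descriptors such as AlogP or MW would need binning or a Gaussian likelihood, which this sketch leaves out.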
