Counter propagation artificial neural network categorical models for prediction of carcinogenicity for non-congeneric chemicals

One of the main goals of the new chemical regulation REACH (Registration, Evaluation and Authorization of Chemicals) is to fill the gaps in knowledge of the toxicological properties of chemicals that affect human health. Carcinogenicity is one of the endpoints under consideration. Information obtained from (quantitative) structure–activity relationship ((Q)SAR) models is accepted as an alternative that avoids expensive and time-consuming animal tests. The results reported here were obtained within the framework of the European project ‘Computer Assisted Evaluation of industrial chemical Substances According to Regulations (CAESAR)’. In this article, we present intermediate results for counter propagation artificial neural network (CP ANN) models that predict the category of carcinogenic potency using two-dimensional (2D) descriptors from different software programs. A total of 805 non-congeneric chemicals were extracted from the Carcinogenic Potency Database (CPDBAS). The resulting models had prediction accuracies of 91–93% for the internal (training) set and 68–70% for the external (test) set. For the test set, the sensitivity was 69–73% and the specificity was 63–72%. High sensitivity is critical in models for regulatory use that are aimed at ensuring public safety, because the errors that matter most are false negatives, that is, carcinogens predicted to be non-carcinogens. We discuss how the number of correctly predicted carcinogens can be increased by exploiting the relationship between the classification threshold and the resulting sensitivity and specificity.
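The threshold trade-off mentioned above can be illustrated with a minimal Python sketch. It assumes that the model outputs a continuous score between 0 and 1 for each chemical and that a chemical is classified as a carcinogen when its score reaches the threshold. The function name `sensitivity_specificity` is hypothetical, and the scores and labels are made-up toy values, not data or results from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for a given decision threshold.

    labels: 1 = carcinogen (positive), 0 = non-carcinogen (negative).
    A chemical is predicted positive when its score >= threshold.
    """
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if label == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1  # false negative: a missed carcinogen
        else:
            if predicted_positive:
                fp += 1  # false positive: a non-carcinogen flagged
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Toy example (illustrative values only).
scores = [0.9, 0.7, 0.55, 0.4, 0.6, 0.45, 0.3, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

for threshold in (0.5, 0.35):
    sens, spec = sensitivity_specificity(scores, labels, threshold)
    print(f"threshold={threshold}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy data set, lowering the threshold from 0.5 to 0.35 recovers the one missed carcinogen (sensitivity rises) at the cost of an additional false positive (specificity falls), which is the kind of trade-off a regulatory user would weigh.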
