Correlation ranking procedure for factor selection in PC-ANN modeling and application to ADMETox evaluation

Abstract A correlation ranking procedure is proposed for the selection of factors in principal component-artificial neural network (PC-ANN) modeling. The model was applied in ADMETox evaluation to predict the carcinogenic activity of 60 organic solvents and the blood–brain barrier (BBB) partitioning of 115 diverse organic molecules. A total of 150 molecular descriptors, including quantum chemical, constitutional, topological and chemical descriptors, were calculated. The resulting descriptors were subjected to principal component analysis (PCA), and a three-layered feed-forward artificial neural network (ANN) was employed to model the nonlinear relationship between the extracted principal components (PCs) and the activities. The correlation ranking procedure selects the most relevant set of PCs in two steps. First, the nonlinear relationship between each PC and the activity data was modeled by a separate neural network, and the correlation ability of each PC with the activity data was determined. Then, the PCs were entered into the ANN model in order of decreasing correlation ability. The results showed that the proposed model could predict the carcinogenic activity and logBBB of the organic compounds with a percent relative error lower than 4%. Comparison with two existing factor selection methods, eigenvalue ranking (EV-PC-ANN) and genetic algorithm (PC-GA-ANN), revealed that the proposed model gave results close to those of the PC-GA-ANN method, while the EV-PC-ANN procedure gave less accurate results.
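The two-step ranking described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`pca_scores`, `fit_tiny_ann`, `correlation_ranking`), the single-hidden-layer tanh network trained by plain gradient descent, the use of mean-centering only before PCA, the choice of squared Pearson correlation between network output and activity as the "correlation ability", and the synthetic data are all hypothetical choices made for the example.

```python
import numpy as np

def pca_scores(X, n_components):
    """PC scores of the mean-centered descriptor matrix, computed via SVD."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def fit_tiny_ann(x, y, hidden=3, epochs=1000, lr=0.1, seed=0):
    """Fit a 1-input, 1-hidden-layer tanh network; return its predictions."""
    rng = np.random.default_rng(seed)
    x = (x - x.mean()) / x.std()          # standardize the single input
    w1 = rng.normal(scale=0.5, size=hidden)
    b1 = np.zeros(hidden)
    w2 = rng.normal(scale=0.5, size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(x[:, None] * w1 + b1)            # hidden activations
        err = H @ w2 + b2 - y                        # prediction error
        dH = np.outer(err, w2) * (1.0 - H ** 2)      # backprop through tanh
        w2 -= lr * (H.T @ err) / len(y)
        b2 -= lr * err.mean()
        w1 -= lr * (dH * x[:, None]).mean(axis=0)
        b1 -= lr * dH.mean(axis=0)
    return np.tanh(x[:, None] * w1 + b1) @ w2 + b2

def correlation_ranking(scores, y):
    """Rank PCs by the squared correlation of a per-PC network with y."""
    ability = np.empty(scores.shape[1])
    for j in range(scores.shape[1]):
        pred = fit_tiny_ann(scores[:, j], y)
        # Guard against a constant output, for which correlation is undefined.
        ability[j] = 0.0 if pred.std() < 1e-12 else np.corrcoef(pred, y)[0, 1] ** 2
    order = np.argsort(-ability)          # decreasing correlation ability
    return order, ability

# Synthetic example (assumed data): the activity depends on the third,
# low-variance descriptor, so eigenvalue ranking would place it late.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4)) * np.array([5.0, 3.0, 1.0, 0.5])
y = np.tanh(2.0 * X[:, 2]) + 0.05 * rng.normal(size=100)

scores = pca_scores(X, n_components=4)
order, ability = correlation_ranking(scores, y)
ranked_scores = scores[:, order]          # PCs in the order they would enter the ANN
```

In this sketch `ranked_scores` holds the PCs in the order in which they would be added to the final network; the stopping point (how many PCs to keep) would still have to be chosen, for example by monitoring validation error as each PC is added.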
