Quantitative structure-pharmacokinetic relationships for drug clearance by using statistical learning methods.

Quantitative structure-pharmacokinetic relationships (QSPkR) have increasingly been used to predict the pharmacokinetic properties of drug leads. Several QSPkR models have been developed to predict the total clearance (CL(tot)) of a compound. These models give good prediction accuracy, but they are based primarily on a limited number of related compounds, considerably fewer and less diverse than the 503 compounds with known CL(tot) described in the literature. It is desirable to examine whether these and other statistical learning methods can be used to predict the CL(tot) of a more diverse set of compounds. In this work, three statistical learning methods, general regression neural network (GRNN), support vector regression (SVR) and k-nearest neighbour (KNN), were explored for modeling the CL(tot) of all 503 known compounds. Six different sets of molecular descriptors, DS-MIXED, DS-3DMoRSE, DS-ATS, DS-GETAWAY, DS-RDF and DS-WHIM, were evaluated for their usefulness in predicting CL(tot). The GRNN-, SVR- and KNN-developed models have average-fold errors in the ranges 1.63-1.96, 1.66-1.95 and 1.90-2.23, respectively. For the best GRNN-, SVR- and KNN-developed models, the percentage of compounds with predicted CL(tot) within two-fold error of the actual values is in the range of 61.9-74.3%, which is comparable to or slightly better than that of earlier studies. QSPkR models developed by using DS-MIXED, which is a collection of constitutional, geometrical, topological and electrotopological descriptors, generally give better prediction accuracies than those developed by using the other descriptor sets. These results suggest that GRNN, SVR, and their consensus model are potentially useful for predicting the pharmacokinetic properties of drug leads.
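
The two accuracy measures quoted above, average-fold error and the percentage of compounds predicted within two-fold of the actual value, can be illustrated with a short sketch. The abstract does not define either measure, so the formulas here are assumptions: fold error is taken as the larger of predicted/actual and actual/predicted, and average-fold error as 10 raised to the mean absolute log10 ratio, a definition commonly used in the clearance-prediction literature. The function names and the sample values are hypothetical and are not data from the study.

```python
import math


def fold_error(predicted, observed):
    """Fold error of one prediction (assumed definition): always >= 1."""
    ratio = predicted / observed
    return max(ratio, 1.0 / ratio)


def average_fold_error(predicted, observed):
    """Assumed definition: 10 ** mean(|log10(predicted / observed)|).

    This equals the geometric mean of the per-compound fold errors.
    """
    logs = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(logs) / len(logs))


def percent_within_fold(predicted, observed, fold=2.0):
    """Percentage of compounds whose fold error does not exceed `fold`."""
    hits = sum(1 for p, o in zip(predicted, observed)
               if fold_error(p, o) <= fold)
    return 100.0 * hits / len(predicted)


# Hypothetical clearance values in arbitrary units (not from the paper).
predicted_cl = [2.0, 5.0, 30.0, 1.0]
observed_cl = [1.0, 10.0, 10.0, 1.0]

afe = average_fold_error(predicted_cl, observed_cl)
within_two_fold = percent_within_fold(predicted_cl, observed_cl)
print(f"average-fold error: {afe:.2f}")
print(f"within two-fold: {within_two_fold:.1f}%")
```

Under these assumed definitions, over- and under-prediction by the same factor are penalised equally, which is why both measures are computed on ratios and not on absolute differences.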
