Comparative Analysis of Machine Learning Techniques for the Prediction of the DMPK Parameters Intrinsic Clearance and Plasma Protein Binding

Several machine learning techniques were evaluated for the prediction of parameters relevant to pharmacology and drug discovery, including rat and human microsomal intrinsic clearance as well as plasma protein binding, represented as the fraction of unbound compound. The algorithms assessed in this study include artificial neural networks (ANN), support vector machines (SVM) with the extension for regression, k-nearest neighbor (KNN), and Kohonen networks. The data sets, obtained through literature data mining, were described through a series of scalar, two- and three-dimensional descriptors, including 2-D and 3-D autocorrelation and radial distribution functions. The feature sets were optimized individually for each data set and each machine learning technique using sequential forward feature selection. The data sets range from 400 to 600 compounds with experimentally determined values. Intrinsic clearance (CLint) is a measure of metabolism by cytochrome P-450 enzymes, which reside primarily in the vesicles of the smooth endoplasmic reticulum. These important enzymes contribute to the metabolism of an estimated 75% of the most frequently prescribed drugs in the U.S. The fraction of unbound compound (fu) greatly influences pharmacokinetics, efficacy, and toxicology. In this study, machine learning models were constructed by systematically optimizing feature sets and algorithmic parameters to calculate these parameters of interest, with cross-validated correlation/RMSD values reaching 9.53 over the normalized data set. These fully in silico models are useful in guiding early stages of drug discovery, such as analogue prioritization prior to synthesis and biological testing, while reducing costs associated with the in vitro determination of these parameters. These models are made freely available for academic use.
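
The sequential forward feature selection mentioned above can be illustrated with a small sketch. This is not the authors' implementation: the choice of a k-nearest neighbor regressor, 5-fold cross-validation, RMSD as the selection criterion, the stopping rule, and all function names and the synthetic data are assumptions made purely for illustration.

```python
import math
import random


def knn_predict(train_X, train_y, x, k=3):
    """Predict y for x as the mean over its k nearest training points (Euclidean)."""
    nearest = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def cv_rmsd(X, y, features, k=3, folds=5):
    """Cross-validated RMSD of a k-NN model that sees only the given feature columns."""
    sub = [[row[f] for f in features] for row in X]
    sq_err = 0.0
    for fold in range(folds):
        # Assumed scheme: sample i belongs to fold i % folds.
        train_idx = [i for i in range(len(sub)) if i % folds != fold]
        test_idx = [i for i in range(len(sub)) if i % folds == fold]
        train_X = [sub[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        for i in test_idx:
            sq_err += (knn_predict(train_X, train_y, sub[i], k) - y[i]) ** 2
    return math.sqrt(sq_err / len(sub))


def forward_select(X, y, k=3, folds=5):
    """Greedily add the feature that lowers the cross-validated RMSD the most.

    Stops as soon as no remaining feature improves the score (assumed rule).
    """
    selected, best = [], float("inf")
    remaining = list(range(len(X[0])))
    while remaining:
        score, feature = min((cv_rmsd(X, y, selected + [f], k, folds), f) for f in remaining)
        if score >= best:
            break
        selected.append(feature)
        remaining.remove(feature)
        best = score
    return selected, best


# Synthetic stand-in for a descriptor matrix: only column 0 carries signal.
rng = random.Random(0)
X = [[rng.random() for _ in range(4)] for _ in range(60)]
y = [row[0] + rng.gauss(0.0, 0.01) for row in X]

selected, best = forward_select(X, y)
print("selected feature columns:", selected)
print("cross-validated RMSD:", round(best, 4))
```

In this toy setting the informative column is picked first because it alone gives a far lower cross-validated RMSD than any noise column. With real descriptor sets, the same loop would be repeated separately for each data set and each learning technique, as the abstract describes.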
