How Experimental Errors Influence Drug Metabolism and Pharmacokinetic QSAR/QSPR Models

We consider how gross, systematic, and random experimental errors affect the predictive ability of QSAR/QSPR DMPK models used in early drug discovery. Models whose training sets contained fewer but repeatedly measured data points, with a defined threshold for the random error, gave prediction improvements ranging from 3.3% to 23.0% for an external test set, compared to models built from training sets in which each molecule was defined by a single measurement. Similarly, models built on data with low experimental uncertainty gave prediction improvements ranging from 3.3% to 27.5% compared to models built on data with higher experimental uncertainty.
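One way to picture the training-set selection described above is the minimal sketch below. It assumes the random error of a compound is estimated as the sample standard deviation of its replicate measurements; the function name `filter_by_replicate_error`, the compound identifiers, the values, and the threshold of 0.3 are all hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def filter_by_replicate_error(measurements, max_sd, min_replicates=2):
    """Return {compound: mean value} for compounds with enough replicates
    and a replicate standard deviation no larger than max_sd.

    min_replicates must be at least 2, because a sample standard
    deviation is undefined for a single measurement.
    """
    kept = {}
    for compound, values in measurements.items():
        if len(values) < min_replicates:
            continue  # single measurements carry no estimate of random error
        if stdev(values) <= max_sd:
            kept[compound] = mean(values)  # one averaged value per compound
    return kept


# Hypothetical replicate data (e.g. a log-scale property per compound).
measurements = {
    "cpd_A": [2.10, 2.20, 2.15],  # tight replicates, kept
    "cpd_B": [1.00, 2.00],        # replicates disagree, dropped
    "cpd_C": [3.40],              # measured once, dropped
}

training = filter_by_replicate_error(measurements, max_sd=0.3)
```

The result is a smaller training set in which each retained compound is represented by the mean of its replicates, which is the trade-off the abstract describes: fewer data points, but each with a bounded random error.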
