The lasso—a novel method for predictive covariate model building in nonlinear mixed effects models

Covariate models for population pharmacokinetics and pharmacodynamics are often built with a stepwise covariate modelling procedure (SCM). When analysing a small dataset, this method may produce a covariate model that suffers from selection bias and poor predictive performance. The lasso is a method suggested to remedy these problems. It may also be faster than SCM and provide a validation of the covariate model. The aim of this study was to implement the lasso for covariate selection within NONMEM and to compare this method to SCM.

In the lasso, all covariates must be standardised to have zero mean and standard deviation one. Subsequently, the model containing all potential covariate–parameter relations is fitted with a restriction: the sum of the absolute covariate coefficients must be smaller than a value, t. The restriction forces some coefficients towards zero while the others are estimated with shrinkage. In practice, this means that when the model is fitted, the covariate relations are tested for inclusion at the same time as the included relations are estimated. For a given SCM analysis, the model size depends on the P-value required for selection. In the lasso, the model size instead depends on the value of t, which can be estimated using cross-validation.

The lasso was implemented as an automated tool using PsN. The method was compared to SCM in 16 scenarios with different dataset sizes, numbers of investigated covariates and starting models for the covariate analysis. One hundred replicate datasets were created by resampling from a PK dataset consisting of 721 stroke patients. The two methods were compared primarily on the ability to predict external data, estimate their own predictive performance (external validation), and on the computer run-time.

In all 16 scenarios the lasso predicted external data better than SCM with any of the studied P-values (5%, 1% and 0.1%), but the benefit was negligible for large datasets. The lasso cross-validation provided a precise and nearly unbiased estimate of the actual prediction error. On a single processor, the lasso was faster than SCM. Further, the lasso could run completely in parallel, whereas SCM must run in steps.

In conclusion, the lasso is superior to SCM in obtaining a predictive covariate model on a small dataset or on small subgroups (e.g. rare genotype). Run in parallel, the lasso could be much faster than SCM. Using cross-validation, the lasso provides a validation of the covariate model and does not require the user to specify a P-value for selection.
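
The three steps described above (standardise the covariates, fit under a bound on the sum of absolute coefficients, choose the bound by cross-validation) can be illustrated with a minimal sketch for an ordinary linear model. This is not the NONMEM/PsN implementation from the study. It assumes simulated data and a hand-written coordinate-descent solver, and it uses the penalised form of the lasso (penalty weight `lam`) rather than the constraint form. Each `lam` corresponds to a bound t equal to the sum of absolute coefficients at the solution. For brevity the covariates are standardised once on the full dataset, not within each fold. All function and variable names are illustrative.

```python
import random
import statistics


def standardise(columns):
    """Scale each covariate column to mean 0 and standard deviation 1."""
    scaled = []
    for col in columns:
        mean, sd = statistics.fmean(col), statistics.pstdev(col)
        scaled.append([(v - mean) / sd for v in col])
    return scaled


def soft_threshold(z, lam):
    """Shrink z towards zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_fit(columns, y, lam, sweeps=100):
    """Coordinate descent for RSS/(2n) + lam * sum(|beta_j|)."""
    n = len(y)
    intercept = statistics.fmean(y)
    beta = [0.0] * len(columns)
    resid = [yi - intercept for yi in y]
    for _ in range(sweeps):
        shift = statistics.fmean(resid)  # unpenalised intercept update
        intercept += shift
        resid = [r - shift for r in resid]
        for j, col in enumerate(columns):
            scale = sum(v * v for v in col) / n
            # Covariance of covariate j with the residual that excludes j.
            rho = sum(v * (r + beta[j] * v) for v, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam) / scale
            step = new - beta[j]
            if step != 0.0:
                resid = [r - step * v for r, v in zip(resid, col)]
                beta[j] = new
    return intercept, beta


def cross_validate(columns, y, lams, k=5):
    """Return {lam: mean squared prediction error on held-out folds}."""
    n = len(y)
    errors = {}
    for lam in lams:
        sq_err = 0.0
        for fold in range(k):
            train = [i for i in range(n) if i % k != fold]
            test = [i for i in range(n) if i % k == fold]
            tr_cols = [[col[i] for i in train] for col in columns]
            a, b = lasso_fit(tr_cols, [y[i] for i in train], lam)
            for i in test:
                pred = a + sum(bj * col[i] for bj, col in zip(b, columns))
                sq_err += (y[i] - pred) ** 2
        errors[lam] = sq_err / n
    return errors


# Simulated example: 5 candidate covariates, only the first two matter.
random.seed(1)
n = 60
raw = [[random.gauss(0, 1) for _ in range(n)] for _ in range(5)]
y = [2.0 * raw[0][i] - 1.0 * raw[1][i] + random.gauss(0, 0.5) for i in range(n)]

X = standardise(raw)
lams = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0, 100.0]
cv_error = cross_validate(X, y, lams)
best_lam = min(cv_error, key=cv_error.get)
_, best_beta = lasso_fit(X, y, best_lam)
t_hat = sum(abs(b) for b in best_beta)  # the implied bound t
print("selected lam:", best_lam, "implied t:", t_hat)
print("coefficients:", best_beta)
```

In this sketch a larger `lam` (a smaller t) shrinks the coefficients further, and a sufficiently large value removes every covariate, which mirrors how t controls model size in the abstract. The cross-validated prediction error plays the role that the P-value plays in SCM.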
