In Silico Prediction of Human Intravenous Pharmacokinetic Parameters with Improved Accuracy

Human pharmacokinetics is of great significance in the selection of drug candidates, and in silico estimation of pharmacokinetic parameters in the early stage of drug development has become a trend in drug research owing to its time- and cost-saving advantages. Herein, quantitative structure-property relationship studies were carried out to predict four human pharmacokinetic parameters, namely volume of distribution at steady state (VDss), clearance (CL), terminal half-life (t1/2), and fraction unbound in plasma (fu), using a dataset consisting of 1352 drugs. A series of regression models was built using the most suitable features selected by the Boruta algorithm and four machine learning methods: support vector machine (SVM), random forest (RF), gradient boosting machine (GBM), and XGBoost (XGB). For VDss, SVM showed the best performance, with R2test = 0.870 and RMSEtest = 0.208. For the other three pharmacokinetic parameters, the RF models gave the highest prediction accuracy (for CL, R2test = 0.875 and RMSEtest = 0.103; for t1/2, R2test = 0.832 and RMSEtest = 0.154; for fu, R2test = 0.818 and RMSEtest = 0.291). Assessed by 10-fold cross validation, leave-one-out cross validation, the Y-randomization test, and applicability domain evaluation, these models demonstrated excellent stability and predictive ability. Comparison with other published models for estimating human pharmacokinetic parameters further confirmed that our models achieve better predictive ability and could be used in the selection of preclinical candidates.
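
The two reported metrics (R2 and RMSE) and the Y-randomization test can be illustrated with a minimal sketch. It is not the authors' pipeline: the single synthetic descriptor, the one-variable least-squares model, and the helper names (r2_score, rmse, fit_line, y_randomization) are all assumptions made for illustration, standing in for the Boruta-selected features and the SVM/RF/GBM/XGB models. The idea shown is that a model refit on shuffled responses should score far worse than the model fit on the real responses; if it does not, the original fit is likely a chance correlation.

```python
import math
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def fit_line(x, y):
    """Ordinary least squares for y = a * x + b (a stand-in model)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def y_randomization(x, y, n_rounds=20, seed=0):
    """Refit the model on shuffled responses and collect the R2 of each fit."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_rounds):
        shuffled = list(y)
        rng.shuffle(shuffled)  # break the descriptor-response relationship
        a, b = fit_line(x, shuffled)
        scores.append(r2_score(shuffled, [a * xi + b for xi in x]))
    return scores


# Synthetic, hypothetical data: one descriptor and a noisy linear response.
data_rng = random.Random(42)
x = [data_rng.uniform(-2.0, 2.0) for _ in range(200)]
y = [0.8 * xi + data_rng.gauss(0.0, 0.3) for xi in x]

a, b = fit_line(x, y)
pred = [a * xi + b for xi in x]
real_r2 = r2_score(y, pred)
real_rmse = rmse(y, pred)
random_r2 = y_randomization(x, y)

print(f"real model:      R2 = {real_r2:.3f}, RMSE = {real_rmse:.3f}")
print(f"Y-randomization: max R2 over {len(random_r2)} shuffles = {max(random_r2):.3f}")
```

In this sketch the scores are computed on the training data for brevity; a study of the kind described above would report them on a held-out test set and under cross validation.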