Prediction of Polar Surface Area and Drug Transport Processes Using Simple Parameters and PLS Statistics

Modeling of the calculated polar surface area of drugs from rapidly derived descriptors (i.e., the number of hydrogen-bond-accepting oxygen and nitrogen atoms and the number of hydrogen atoms bonded to these atoms) using partial least squares projection to latent structures (PLS) analysis is described. The statistical analysis showed strong relationships between the hydrogen-bonding descriptors and the calculated polar surface area for five chemically diverse sets of drugs (R² > 0.93 and Q² > 0.69; n = 11, 20, 45, 70, and 74, respectively). The statistical models (using H-bonding descriptors and log P) of transport across Caco-2 cells (n = 11), brain-blood partitioning (two data sets, n = 45 and 70), and percent intestinal absorption (n = 20) showed R² = 0.92, 0.72, 0.76, and 0.81 and Q² = 0.74, 0.75, 0.71, and 0.73, respectively. Including log P improved two models, had no effect on one model, and had a slightly negative impact on one model. The combination of H-bonding descriptors with log P resembles the Lipinski "rule-of-five" mnemonic. However, with a multivariate statistical method (e.g., PLS), the prediction becomes quantitative instead of qualitative. Good statistical models were derived that permit fast computational screening and prioritization of virtual compound libraries.
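To make the modeling approach concrete, the following is a minimal sketch of a single-response PLS regression (PLS1, NIPALS-style deflation) that maps two hydrogen-bonding counts to a polar surface area value and reports a fitted R² and a cross-validated Q². It is an illustration under stated assumptions, not the procedure used in the study: the function names (pls1_fit, pls1_predict, r2, q2_loo) are hypothetical, the data are invented toy numbers rather than values from any of the five data sets, the descriptors are mean-centered only (no unit-variance scaling), and Q² is computed here by leave-one-out cross-validation, which the abstract does not specify.

```python
from math import sqrt

def pls1_fit(X, y, n_components=1):
    """Fit a PLS1 model by NIPALS-style deflation on mean-centered data."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # X residuals
    f = [v - y_mean for v in y]                                # y residuals
    comps = []
    for _ in range(n_components):
        # Weight vector: direction in X with maximal covariance with y
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, p, q))
    return x_mean, y_mean, comps

def pls1_predict(model, x):
    """Predict y for one descriptor vector by repeating the deflation steps."""
    x_mean, y_mean, comps = model
    e = [x[j] - x_mean[j] for j in range(len(x))]
    y_hat = y_mean
    for w, p, q in comps:
        t = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += q * t
        e = [e[j] - t * p[j] for j in range(len(e))]
    return y_hat

def r2(y, y_hat):
    """1 - (residual sum of squares) / (total sum of squares about the mean)."""
    y_mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - y_mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def q2_loo(X, y, n_components=1):
    """Q2 from leave-one-out predictions (an assumed validation scheme)."""
    preds = []
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_components)
        preds.append(pls1_predict(model, X[i]))
    return r2(y, preds)

# Invented toy data, NOT taken from the paper.
# Columns: [number of O and N acceptor atoms, number of H atoms bonded to O or N]
X_toy = [[1, 0], [2, 1], [3, 1], [4, 2], [5, 2], [6, 3], [7, 3], [8, 5]]
psa_toy = [12.0, 38.5, 47.0, 75.2, 83.9, 112.4, 118.0, 170.3]  # made-up PSA values

model = pls1_fit(X_toy, psa_toy, n_components=1)
r2_fit = r2(psa_toy, [pls1_predict(model, x) for x in X_toy])
q2 = q2_loo(X_toy, psa_toy, n_components=1)
print(f"R2 = {r2_fit:.3f}, Q2 (leave-one-out) = {q2:.3f}")
```

Adding log P as a further predictor would amount to appending a third column to each row of X_toy. In practice the descriptor counts would come from a cheminformatics toolkit and the regression from an established PLS implementation; the hand-written version above only exposes the arithmetic behind R² and Q².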