The Price is Right: Predicting Reagent Prices
We present a model for estimating the price of a reagent
from its chemical structure. It is intended to support reagent
selection for library design. The model is a Random Forest regressor trained
on the MolPort catalog of 302K reagents to predict the logarithm of their price. As descriptors
we use features computed with RDKit: chiral Morgan (topological) fingerprints, RDKit's medicinal
chemistry descriptors, and counts of undetermined chiral centers. Out-of-bag,
the model explains 34% of the variance in log price. When
predicting on known reagents, the model explains 91% of the variance in log price.
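The "variance explained in log price" figures above can be read as a coefficient of determination computed on log-transformed prices. The following is a minimal sketch of that metric, not the authors' evaluation code; the function name `variance_explained` and the choice of base-10 logarithms are assumptions (the value is the same for any log base, since a change of base rescales both sums equally).

```python
import math


def variance_explained(actual_prices, predicted_prices):
    """R^2 between actual and predicted prices, computed on log10(price).

    Hypothetical helper for illustration; prices must be positive.
    """
    y = [math.log10(p) for p in actual_prices]
    y_hat = [math.log10(p) for p in predicted_prices]
    mean_y = sum(y) / len(y)
    # Residual sum of squares: error left after the model's prediction.
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    # Total sum of squares: error of always predicting the mean log price.
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot
```

A value of 1 means the predictions match the log prices exactly, and a value of 0 means the model does no better than predicting the mean log price for every reagent.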
We analyzed the model to understand the errors it makes. We show
that the compounds with the highest errors differ only subtly in
structure from similar molecules but differ greatly in price.
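The descriptor and model setup described in the abstract can be sketched roughly as follows, assuming RDKit and scikit-learn are installed. This is an illustration under stated assumptions, not the authors' pipeline: the fingerprint radius and length, the five descriptors chosen, the forest size, and the toy SMILES with placeholder prices are all hypothetical.

```python
import math

import numpy as np
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem, Descriptors
from sklearn.ensemble import RandomForestRegressor

N_BITS = 1024  # assumed fingerprint length; not stated in the abstract

# Assumed small subset of RDKit descriptors, for illustration only.
DESCRIPTOR_FUNCS = [
    Descriptors.MolWt,
    Descriptors.MolLogP,
    Descriptors.TPSA,
    Descriptors.NumHDonors,
    Descriptors.NumHAcceptors,
]


def count_undetermined_centers(mol):
    """Count stereocenters whose configuration is not specified ('?')."""
    centers = Chem.FindMolChiralCenters(mol, includeUnassigned=True)
    return sum(1 for _, label in centers if label == "?")


def featurize(smiles):
    """Chiral Morgan bits + descriptors + undetermined-center count."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    # Radius 2 is an assumption; useChirality makes the bits stereo-aware.
    fp = AllChem.GetMorganFingerprintAsBitVect(
        mol, 2, nBits=N_BITS, useChirality=True
    )
    bits = np.zeros((N_BITS,), dtype=np.float64)
    DataStructs.ConvertToNumpyArray(fp, bits)
    descriptors = [func(mol) for func in DESCRIPTOR_FUNCS]
    return np.concatenate(
        [bits, descriptors, [count_undetermined_centers(mol)]]
    )


# Placeholder data: these prices are made up and are NOT MolPort values.
toy_reagents = {
    "CCO": 5.0,
    "CC(=O)O": 6.0,
    "c1ccccc1": 8.0,
    "CC(O)C(=O)O": 12.0,
    "C[C@H](O)C(=O)O": 40.0,
    "NCC(=O)O": 10.0,
    "C[C@@H](N)C(=O)O": 35.0,
    "OC(=O)c1ccccc1": 9.0,
}

X = np.array([featurize(smi) for smi in toy_reagents])
y = np.log10(list(toy_reagents.values()))  # regress on log price

# oob_score=True makes scikit-learn report out-of-bag R^2 as oob_score_.
model = RandomForestRegressor(n_estimators=100, oob_score=True, random_state=0)
model.fit(X, y)

# Map a log-price prediction back to a price for an unseen structure.
predicted_price = 10 ** model.predict([featurize("CC(N)C(=O)O")])[0]
```

After fitting, `model.oob_score_` holds the out-of-bag R² on log price, which is the kind of quantity the 34% figure refers to. On a toy set this small the value is not meaningful.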