Paper information - In silico technology for identification of potentially toxic compounds in drug discovery

This review gives the background to the analysis of toxicity data, the development of predictive algorithms, and the application of these algorithms in lead selection and optimization. The algorithms considered predict acute toxicity (mouse and rat LD50), genotoxicity (Ames test), carcinogenicity, and organ-specific health effects (based on diverse animal and human studies). These tools can aid drug design in several ways. Lead selection is often based on simple molecular properties (logP, MW, H-bonding) that define either a drug-like or a lead-like chemical space. These definitions need to be supplemented with substructure-specific considerations that account for variable chemical reactivity, ionization, and fuzzy-specific interactions with various biological constituents. The available toxicity predictions can fill these gaps to a certain extent by supplementing or replacing pre-defined filters of alert substructures that ignore the dependence of chemical reactivity and toxicity on substituent effects and whole-molecule ADME effects. In drug discovery, these tools can help to prioritize in vitro measurements and estimate animal toxicity, although multiple data gaps in their training sets restrict their usefulness. A partial solution to this problem is to calculate 95% confidence intervals (or continuous probabilities) that indicate the toxicological similarity of a given compound to the training set. If a compound is not too dissimilar, hazard substructures can be generated automatically, suggesting possible mechanistic explanations and structural modifications of the lead compound. The best solution, however, is to develop new predictive algorithms based on company-specific data, and analytical and development software tools are available to help with this.
It is also necessary to continuously improve the existing organ-specific health effect predictions by adding new data (for existing and new endpoints) and improving the overall methodology used in data analysis.