Chemogenomic data analysis: prediction of small-molecule targets and the advent of biological fingerprint.

Chemogenomics is the systematic study of the relationships between targets and the ligands used to modulate them in living systems such as cells or organisms. In recent years, data on small-molecule bioactivity have become increasingly available, and the number of approaches used to translate bioactivity data into knowledge has grown accordingly. This review focuses on two aspects of chemogenomics. First, in cases such as cell-based screens, it is crucial to determine which target(s) a compound modulates to cause the observed phenotype. In silico target prediction tools can suggest likely biological targets of small molecules by mining target-annotated chemical databases. This review presents some of the tools currently available for this task and shows sample applications relevant to a pharmaceutical industry setting: the prediction of false positives in cell-based reporter gene assays, the prediction of targets by linking bioassay data with protein domain annotations, and the direct prediction of adverse reactions. Second, recent years have seen a shift from structure-derived chemical descriptors to biological descriptors. Here, the effect of a compound on a number of biological endpoints is used to make predictions about other properties, such as putative targets, associated adverse reactions, and pathways modulated by the compound. This review further summarizes these "performance" descriptors and their applications, focusing on gene expression profiles and high-content screening data. The advent of such biological fingerprints suggests that the field of drug discovery is currently at a crossroads, where single-target bioassay results are being supplanted by multidimensional biological fingerprints that reflect a new awareness of biological networks and polypharmacology.
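
As an illustration of what data mining in a target-annotated database can look like in its simplest form, the sketch below ranks candidate targets for a query compound by its similarity to annotated ligands. It is a minimal nearest-neighbor example and not one of the tools covered by the review. The database entries, target names, and fingerprint bits are all hypothetical, and fingerprints are represented as plain sets of "on" bits rather than the output of any particular cheminformatics library.

```python
from collections import defaultdict

# Hypothetical toy "target-annotated database": each entry pairs a
# fingerprint (set of on-bits, invented for illustration) with a target label.
ANNOTATED_LIGANDS = [
    ({1, 2, 3, 5, 8}, "kinase_A"),
    ({1, 2, 3, 5, 9}, "kinase_A"),
    ({2, 4, 6, 10, 12}, "gpcr_B"),
    ({4, 6, 10, 11, 12}, "gpcr_B"),
    ({7, 13, 14, 15}, "protease_C"),
]


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of on-bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def predict_targets(query, database, k=3):
    """Rank targets by the best similarity among the k nearest annotated ligands."""
    scored = sorted(
        ((tanimoto(query, fp), target) for fp, target in database),
        reverse=True,
    )
    best = defaultdict(float)
    for similarity, target in scored[:k]:
        # Keep the highest similarity seen for each target.
        best[target] = max(best[target], similarity)
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


# Hypothetical query compound sharing most bits with the kinase_A ligands.
query_fingerprint = {1, 2, 3, 5, 7}
ranking = predict_targets(query_fingerprint, ANNOTATED_LIGANDS)
for target, score in ranking:
    print(f"{target}: {score:.3f}")
```

The same ranking function would apply unchanged to a biological fingerprint if each set instead held the indices of assay endpoints on which a compound is active. That reading is an assumption made for this sketch rather than a method taken from the review.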