Distribution of Molecular Scaffolds and R-Groups Isolated from Large Compound Databases

We describe an approach to isolate molecular scaffolds and R-groups from known chemical compounds in order to generate scaffold and R-group databases from two large compound collections, Optiverse™ and Maybridge™. The distributions of molecular scaffolds and R-groups in the parent databases were analysed and compared. We find that a limited number of scaffolds and R-groups account for the majority of database compounds and that most of the scaffolds occur only once or twice in the compound databases. Diversity analysis suggests that the compound and scaffold databases have similar molecular diversity. Implications for library design are discussed.
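
The kind of distribution analysis summarised above can be illustrated with a minimal sketch. It assumes that each compound has already been reduced to a single scaffold identifier by some isolation procedure, which is not reproduced here. The function names `scaffold_distribution` and `scaffolds_needed` and the scaffold labels are hypothetical and chosen only for illustration; they are not the method or the data of this work.

```python
from collections import Counter


def scaffold_distribution(scaffold_per_compound):
    """Summarise how compounds are spread across scaffolds.

    scaffold_per_compound: one scaffold identifier per compound
    (assumed to come from a separate scaffold-isolation step).
    """
    counts = Counter(scaffold_per_compound)
    total = len(scaffold_per_compound)

    # Number of scaffolds that occur only once or twice.
    rare = sum(1 for n in counts.values() if n <= 2)

    # Cumulative fraction of compounds covered when scaffolds are
    # taken in order of decreasing frequency.
    coverage = []
    covered = 0
    for _, n in counts.most_common():
        covered += n
        coverage.append(covered / total)

    return counts, rare, coverage


def scaffolds_needed(coverage, fraction):
    """Smallest number of top scaffolds covering at least `fraction` of compounds."""
    for rank, cumulative in enumerate(coverage, start=1):
        if cumulative >= fraction:
            return rank
    return len(coverage)


# Toy input with made-up labels, not taken from either database.
toy = ["A"] * 6 + ["B"] * 3 + ["C", "D", "E"]
counts, rare, coverage = scaffold_distribution(toy)
print(rare)                              # scaffolds seen once or twice
print(scaffolds_needed(coverage, 0.75))  # top scaffolds covering 75% of compounds
```

Applied to R-group identifiers instead of scaffold identifiers, the same counting would give the corresponding R-group distribution.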