Efficient Generation, Storage, and Manipulation of Fully Flexible Pharmacophore Multiplets and Their Use in 3-D Similarity Searching

Pharmacophore triplets and quartets have been used by many groups in recent years, primarily as a tool for molecular diversity analysis. In most cases, slow processing speeds and the very large size of the bitsets generated have forced researchers to compromise on how such multiplets are stored, manipulated, and compared, e.g., by using simple unions to represent multiplets for sets of molecules. Here we report the use of bitmaps in place of bitsets to reduce storage demands and improve processing speed. A bitset is taken to mean a fully enumerated string of zeros and ones; a compressed bitmap is obtained from it by replacing each uniform block ("run") of digits with a pair of values identifying the content and length of the block (run-length encoding). High-resolution multiplets involving four features are enabled by using 64-bit executables to create and manipulate bitmaps; these communicate with the 32-bit executables used for database access and feature identification via an Extensible Markup Language (XML) data stream. The encoding system supports simple pairs, triplets, and quartets; multiplets in which a privileged substructure is used as an anchor point; and augmented multiplets in which an additional vertex is added to represent a contingent feature, such as a hydrogen bond extension point linked to a complementary feature (e.g., a donor or an acceptor atom) in a base pair or triplet. It can readily be extended to larger, more complex multiplets. Database searching is one potential application of this technology. Consensus bitmaps built up from active ligands identified in preliminary screening can be used to generate hypothesis bitmaps, a process that allows differential weighting so that greater emphasis can be placed on bits arising from multiplets expected to be particularly discriminating.
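The bitset-to-bitmap compression described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`rle_encode`, `rle_decode`, `and_count`) and the representation of a bitmap as a list of (digit, run length) pairs are assumptions made for illustration only. The sketch also shows why run-length encoding can speed up comparison, since the number of bits shared by two bitmaps can be counted run by run without expanding either one.

```python
from itertools import groupby


def rle_encode(bits: str) -> list[tuple[str, int]]:
    """Collapse a fully enumerated bitset into (digit, run_length) pairs."""
    return [(digit, sum(1 for _ in run)) for digit, run in groupby(bits)]


def rle_decode(runs: list[tuple[str, int]]) -> str:
    """Expand (digit, run_length) pairs back into the original bitset."""
    return "".join(digit * length for digit, length in runs)


def and_count(a: list[tuple[str, int]], b: list[tuple[str, int]]) -> int:
    """Count bits set in both bitmaps by walking the two run lists in step."""
    i = j = 0
    common = 0
    rem_a = a[0][1] if a else 0  # bits left in the current run of a
    rem_b = b[0][1] if b else 0  # bits left in the current run of b
    while i < len(a) and j < len(b):
        step = min(rem_a, rem_b)  # overlap of the two current runs
        if a[i][0] == "1" and b[j][0] == "1":
            common += step
        rem_a -= step
        rem_b -= step
        if rem_a == 0:
            i += 1
            if i < len(a):
                rem_a = a[i][1]
        if rem_b == 0:
            j += 1
            if j < len(b):
                rem_b = b[j][1]
    return common


# Two sparse 19-bit bitsets, written so that the run lengths are easy to read.
bitset_a = "0" * 5 + "1" * 3 + "0" * 10 + "1"
bitset_b = "0" * 6 + "1" * 4 + "0" * 8 + "1"

bitmap_a = rle_encode(bitset_a)
bitmap_b = rle_encode(bitset_b)
print(bitmap_a)                       # four runs instead of 19 digits
print(and_count(bitmap_a, bitmap_b))  # bits common to both bitmaps
```

Pharmacophore multiplet bitsets are typically very sparse, so long runs of zeros dominate and the run list is far shorter than the enumerated string; the storage format actually used by the authors is not specified here.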
Such hypothesis bitmaps are shown to be useful queries for database searching, successfully retrieving active compounds across a range of structural classes from a corporate database. The current implementation allows multiconformer bitmaps to be obtained from pregenerated conformations or by random perturbation on the fly. The latter approach involves random sampling of the full range of conformations not precluded by steric clashes, which limits the usefulness of classical fingerprint similarity measures. A new measure of similarity, the stochastic cosine, is introduced to address this need. It uses the average number of bits common to independently drawn conformer sets to normalize the cosine coefficient. Its use frees the user from having to ensure strict comparability of starting conformations and from having to use fixed torsional increments, thereby allowing fully flexible characterization of pharmacophoric patterns.
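One plausible reading of the stochastic cosine can be sketched in Python as follows. The abstract states only that the average number of bits common to independently drawn conformer sets is used to normalize the cosine coefficient; the exact formula below, the averaging of the cross term over all pairs of draws, and the names `cosine`, `mean_self_overlap`, and `stochastic_cosine` are assumptions for illustration, not the published definition. Each bitmap is represented here as a set of on-bit indices.

```python
import math
from itertools import combinations


def cosine(a: set[int], b: set[int]) -> float:
    """Classical cosine coefficient between two sets of on-bits."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def mean_self_overlap(draws: list[set[int]]) -> float:
    """Average number of bits shared by two independent draws for one molecule."""
    if len(draws) < 2:
        raise ValueError("need at least two independent draws")
    pairs = list(combinations(draws, 2))
    return sum(len(x & y) for x, y in pairs) / len(pairs)


def stochastic_cosine(draws_a: list[set[int]], draws_b: list[set[int]]) -> float:
    """Assumed form: mean cross overlap divided by the geometric mean of the
    mean self-overlaps, in place of the bit counts used by the plain cosine."""
    cross = sum(len(x & y) for x in draws_a for y in draws_b)
    cross /= len(draws_a) * len(draws_b)
    norm = math.sqrt(mean_self_overlap(draws_a) * mean_self_overlap(draws_b))
    return cross / norm if norm else 0.0


# Hypothetical bitmaps from two independent conformer draws per molecule.
draws_a = [{1, 2, 3}, {2, 3, 4}]
draws_b = [{2, 3, 5}, {1, 2, 3}]

print(cosine(draws_a[0], draws_a[1]))       # same molecule, yet below 1
print(stochastic_cosine(draws_a, draws_b))  # normalized by self-overlap
```

The first printed value shows the problem the measure addresses: two random conformer samples of the same molecule do not give identical bitmaps, so the classical cosine of a molecule with itself falls below 1. Under the form assumed here, normalizing by the expected self-overlap corrects for that sampling noise, at the cost that the value can exceed 1 for small numbers of draws.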
