DISE: Directed Sphere Exclusion

The Sphere Exclusion algorithm is a well-known method for selecting diverse subsets from chemical compound libraries or collections. It can be applied with any distance measure between two structures, and it is popular because of its intuitive geometrical interpretation and its good performance on large data sets. This paper describes Directed Sphere Exclusion (DISE), a modification of the Sphere Exclusion algorithm that retains all of its advantages but distributes the selected compounds more evenly across chemical space. In addition, the computational requirement is significantly reduced, so DISE can be applied to very large data sets.
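For readers unfamiliar with the baseline method, the following is a minimal sketch of plain Sphere Exclusion only, not of DISE: the abstract does not specify how DISE orders or directs the selection, so that part is not reproduced here. The function name `sphere_exclusion`, its parameters, and the one-dimensional toy data are illustrative assumptions. The sketch takes candidates in the order given and keeps one only if it lies outside the exclusion sphere of every compound already selected.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def sphere_exclusion(
    items: Sequence[T],
    distance: Callable[[T, T], float],
    radius: float,
) -> List[int]:
    """Return indices of a subset whose members are pairwise >= radius apart.

    Hypothetical helper illustrating basic Sphere Exclusion: candidates are
    visited in input order, and a candidate is kept only if it falls outside
    the exclusion sphere of every item selected so far.
    """
    selected: List[int] = []
    for i, candidate in enumerate(items):
        # Any user-supplied distance measure between two structures works here.
        if all(distance(candidate, items[j]) >= radius for j in selected):
            selected.append(i)
    return selected


# Toy usage: points on a line stand in for compounds in a descriptor space.
points = [0.0, 0.1, 0.5, 0.55, 1.2]
picked = sphere_exclusion(points, lambda a, b: abs(a - b), radius=0.3)
print([points[i] for i in picked])
```

Because the result depends on the order in which candidates are visited, different orderings of the same library can yield different subsets; this order dependence is the part of the procedure that a directed variant would control.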
