Hierarchical Strategy for Identifying Active Chemotype Classes in Compound Databases

A general methodology based on structural chemotypes is presented for analyzing patterns of activity in compound databases; it provides a focused, hierarchical classification of active compounds. Each node in the hierarchical tree corresponds to a specific chemotype and is labeled by a unique code or identifier. All chemotypes at a given level of the hierarchy define equivalence classes, and those of higher structural resolution have a strict parent–child (i.e., subset) relationship to those of lower resolution. Active chemotypes contain a relatively high proportion of actives and are characterized through the use of enrichment plots, which show the relationship of occupancy to activity enrichment for a set of chemotypes at a given level of structural resolution. Paths through the hierarchy from chemotypes of lower to those of higher structural resolution (e.g., reduced cyclic system skeletons → cyclic system skeletons → cyclic systems → complete molecules) are unique. Unique paths that pass only through active chemotypes are called chains or paths of actives. These chains provide links for identifying structurally related active compounds at increasing levels of structural resolution. Analysis of actives can also be carried out at any specific level of structural resolution deemed appropriate by the investigator. Chemotype codes can be used to search compound databases for new molecules possessing these codes or sets of hierarchically related codes. An example based on the NCI AIDS database illustrates the general approach and describes in more detail several interesting classes of active chemotypes and their interrelationships.
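The bookkeeping behind occupancy, enrichment, and chains of actives can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several things here are assumptions made for illustration only: the function names, the toy chemotype codes, the definition of enrichment as a chemotype's active fraction divided by the database-wide active fraction, and the thresholds `min_enrichment` and `min_occupancy`. Each compound is assumed to arrive already annotated with one code per hierarchy level, ordered from lowest to highest structural resolution.

```python
from collections import defaultdict


def chemotype_stats(compounds, level):
    """Return {code: (occupancy, enrichment)} for one hierarchy level.

    Assumed definition: enrichment = (actives / occupancy) / database active rate.
    """
    total = len(compounds)
    n_active = sum(1 for _, active in compounds if active)
    if total == 0 or n_active == 0:
        return {}
    base_rate = n_active / total
    counts = defaultdict(lambda: [0, 0])  # code -> [occupancy, actives]
    for codes, active in compounds:
        entry = counts[codes[level]]
        entry[0] += 1
        entry[1] += int(active)
    return {code: (n, (a / n) / base_rate) for code, (n, a) in counts.items()}


def active_chemotypes(stats, min_enrichment=2.0, min_occupancy=2):
    """Select chemotypes that pass the (hypothetical) occupancy and enrichment cutoffs."""
    return {
        code
        for code, (occupancy, enrichment) in stats.items()
        if occupancy >= min_occupancy and enrichment >= min_enrichment
    }


def chains_of_actives(compounds, n_levels, **thresholds):
    """Return the paths through the hierarchy whose every node is an active chemotype."""
    active_sets = [
        active_chemotypes(chemotype_stats(compounds, level), **thresholds)
        for level in range(n_levels)
    ]
    chains = set()
    for codes, _ in compounds:
        if all(codes[level] in active_sets[level] for level in range(n_levels)):
            chains.add(tuple(codes[:n_levels]))
    return sorted(chains)


# Toy data: (codes for reduced skeleton, skeleton, cyclic system), activity flag.
# The codes are made up; real chemotype codes would come from a naming algorithm.
compounds = [
    (("R1", "S1", "C1"), True),
    (("R1", "S1", "C1"), True),
    (("R1", "S1", "C2"), False),
    (("R1", "S2", "C3"), False),
    (("R2", "S3", "C4"), False),
    (("R2", "S3", "C4"), False),
    (("R2", "S3", "C5"), False),
    (("R2", "S4", "C6"), False),
]

chains = chains_of_actives(compounds, n_levels=3)
print(chains)
```

Plotting the `(occupancy, enrichment)` pairs returned by `chemotype_stats` for one level gives the kind of data an enrichment plot displays. The complete-molecule level is left out of the toy hierarchy because each molecule would normally have an occupancy of one, which makes an occupancy cutoff meaningless there.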
