Fragment Descriptors in Virtual Screening

This article reviews the application of fragment descriptors at different stages of virtual screening: filtering, similarity search, and direct activity assessment using QSAR/QSPR models. Several case studies are considered. The article demonstrates that the power of fragment descriptors stems from their universality, very high computational efficiency, ease of interpretation, and versatility.
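
As a rough illustration of the similarity-search stage mentioned above, the sketch below counts simple atom-bond-atom fragments in toy molecular graphs and compares molecules by the Tanimoto coefficient of the fragment types they contain. The hand-written graph encoding and the names `fragment_counts` and `tanimoto` are assumptions made for this example only; they are not taken from the article or from any particular software package, and real fragment descriptors are typically longer sequences or atom-centred environments.

```python
from collections import Counter

# Toy molecular graphs (hypothetical encoding): heavy-atom symbols plus
# bonds given as (atom index, atom index, bond symbol). Hydrogens omitted.
ETHANOL = (["C", "C", "O"], [(0, 1, "-"), (1, 2, "-")])
PROPANOL = (["C", "C", "C", "O"], [(0, 1, "-"), (1, 2, "-"), (2, 3, "-")])
ACETIC_ACID = (["C", "C", "O", "O"], [(0, 1, "-"), (1, 2, "="), (1, 3, "-")])


def fragment_counts(mol):
    """Count atom-bond-atom fragments, with atom symbols in sorted order."""
    atoms, bonds = mol
    counts = Counter()
    for i, j, order in bonds:
        a, b = sorted((atoms[i], atoms[j]))
        counts[f"{a}{order}{b}"] += 1
    return counts


def tanimoto(c1, c2):
    """Tanimoto coefficient on the sets of fragment types present."""
    s1, s2 = set(c1), set(c2)
    union = s1 | s2
    # Two molecules with no fragments at all are treated as identical.
    return len(s1 & s2) / len(union) if union else 1.0


if __name__ == "__main__":
    query = fragment_counts(ETHANOL)
    for name, mol in [("propanol", PROPANOL), ("acetic acid", ACETIC_ACID)]:
        print(name, round(tanimoto(query, fragment_counts(mol)), 3))
```

Because only the presence of a fragment type is compared here, ethanol and propanol come out as identical; using the fragment counts themselves, or longer fragments, would separate them. The same count vectors could in principle serve as input variables for a QSAR/QSPR model.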
