The Light and Dark Sides of Virtual Screening: What Is There to Know?

Virtual screening consists of using computational tools to predict potentially bioactive compounds from files containing large libraries of small molecules. Virtual screening is becoming increasingly popular in the field of drug discovery as in silico techniques are continuously being developed, improved, and made available. As most of these techniques are easy to use, both private and public organizations apply virtual screening methodologies to save resources in the laboratory. However, it is often the case that the techniques implemented in virtual screening workflows are restricted to those that the research team knows. Moreover, although the software is often easy to use, each methodology has a series of drawbacks that should be avoided so that false results or artifacts are not produced. Here, we review the most common methodologies used in virtual screening workflows in order to both introduce the inexperienced researcher to new methodologies and advise the experienced researcher on how to prevent common mistakes and the improper usage of virtual screening methodologies.
