A review of ligand-based virtual screening web tools and screening algorithms in large molecular databases in the age of big data.

Virtual screening has become a widely used technique in drug discovery. Its success rests on its ability to help identify novel bioactive compounds by screening large molecular databases. In the last few years, several web servers have emerged that provide platforms for screening publicly accessible chemical databases in a reasonable time. In this review, we discuss a representative set of online virtual screening servers and their underlying similarity algorithms. Related topics, such as molecular representation and freely accessible databases, are also covered. The main contributions of this review are critical discussions of the pros and cons of the servers and algorithms, and of the challenges that future work must address within a virtual screening framework.
