GFscore: A General Nonlinear Consensus Scoring Function for High-Throughput Docking

Most recently published work on docking and scoring protein/ligand complexes has focused on ranking the true positives obtained from Virtual Library Screening (VLS) with a single or a consensus linear scoring function. In this work, we present a methodology to speed up the High Throughput Screening (HTS) process, either by enabling focused screens or by triaging hit lists when the primary screen identifies a prohibitively large number of hits. To this end, we extend the principle of consensus scoring to a nonlinear form using a neural network. The result is a nonlinear Generalist scoring Function, GFscore, which was trained to discriminate true positives from false positives in a data set of diverse chemical compounds. GFscore combines the five scoring functions found in the CScore package from Tripos Inc. It eliminates up to 75% of molecules with a confidence rate of 90%. The outcome is hit enrichment in the list of molecules to investigate during a research campaign for biologically active compounds: only the remaining 25% of molecules would be sent to in vitro screening experiments. GFscore is therefore a powerful tool for the biologist, saving both time and money.
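
To make the idea of nonlinear consensus scoring concrete, the following is a minimal sketch, not the published GFscore model. It assumes a single hidden layer with tanh units, a sigmoid output read as the probability of being a true positive, and randomly initialized (untrained) weights; the function names, the hidden-layer size, and the synthetic input scores are all hypothetical and chosen only for illustration.

```python
import numpy as np

# Assumption: each molecule is described by five docking scores, one per
# scoring function in the consensus. Everything below is illustrative.
N_SCORES = 5


def init_params(n_hidden=4, seed=0):
    """Create untrained, randomly initialized weights (hypothetical sizes)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(scale=0.5, size=(N_SCORES, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.5, size=(n_hidden, 1)),
        "b2": np.zeros(1),
    }


def standardize(scores):
    """Put each score column on a comparable scale (zero mean, unit variance)."""
    mean = scores.mean(axis=0)
    std = scores.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    return (scores - mean) / std


def consensus_probability(scores, params):
    """Nonlinear combination of the five scores into one value in (0, 1)."""
    hidden = np.tanh(standardize(scores) @ params["W1"] + params["b1"])
    logits = hidden @ params["W2"] + params["b2"]
    return 1.0 / (1.0 + np.exp(-logits[:, 0]))


def triage(scores, params, keep_fraction=0.25):
    """Return indices of the top-ranked molecules to pass on to screening."""
    prob = consensus_probability(scores, params)
    n_keep = max(1, int(round(keep_fraction * len(prob))))
    order = np.argsort(-prob)  # highest consensus probability first
    return order[:n_keep]


# Synthetic scores for 20 molecules; real inputs would come from docking runs.
demo_scores = np.random.default_rng(1).normal(size=(20, N_SCORES))
params = init_params()
kept = triage(demo_scores, params)
print(len(kept), "of", len(demo_scores), "molecules kept")
```

With `keep_fraction=0.25`, the sketch discards 75% of the molecules, mirroring the triage ratio quoted in the abstract. In practice the weights would be fitted on labeled true and false positives, and the cutoff would be chosen from validation data.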
