TrixX: structure-based molecule indexing for large-scale virtual screening in sublinear time

Structure-based virtual screening today is essentially organized as a sequential process in which each molecule of a screening library is evaluated, for instance with respect to its fit to a biological target. In this paper, we present a novel structure-based screening paradigm that avoids sequential searching and therefore enables sublinear runtime behavior. We implemented the paradigm in the virtual screening tool TrixX and successfully applied it in screening experiments on four targets from relevant therapeutic areas. With the screening paradigm implemented in TrixX, we propose some important extensions and modifications to traditional virtual screening approaches: instead of processing all compounds in the screening library sequentially, TrixX first analyzes the geometric and physicochemical characteristics of the binding site and then draws compounds with matching features from a compound catalog. The catalog organizes the compounds by their physicochemical and geometric features, using relational database technology with indexed tables to support efficient queries for compounds with specific features. A key element of the compound catalog is a highly selective geometric descriptor, a triangle of functional groups, that carries information on the types of the functional groups, their Euclidean distances, the preferred interaction direction of each functional group, and the location of steric bulk around the triangle.

In a re-docking experiment with 200 protein–ligand complexes, we show that TrixX is able to correctly predict the location of ligand functional groups in co-crystallized complexes. In a retrospective virtual screening experiment on four different targets, the enrichment factors of TrixX are comparable to those of FlexX and FlexX-Scan. With computing times clearly below one second per compound, TrixX ranks among the fastest virtual screening tools currently available and is nearly two orders of magnitude faster than standard FlexX.
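The idea of drawing compounds from an indexed catalog, rather than scanning the library, can be illustrated with a minimal sketch. It is not the TrixX schema or algorithm: the table layout, the function names (`descriptor`, `build_catalog`, `find_compounds`), the distance binning, and the example coordinates are all assumptions made for illustration, and the sketch omits the interaction directions and steric bulk that the actual descriptor carries. It stores one row per triangle of functional groups in an SQLite table and retrieves matching compounds through a composite index.

```python
import sqlite3
from itertools import combinations, permutations
from math import dist


def descriptor(trio, bin_width=1.0):
    """Canonical key for three (group_type, (x, y, z)) corners.

    Hypothetical simplification: only group types and binned side
    lengths are encoded. The smallest key over all corner orderings
    is used, so the key does not depend on the input order.
    """
    keys = []
    for a, b, c in permutations(trio):
        types = (a[0], b[0], c[0])
        sides = tuple(
            round(dist(p[1], q[1]) / bin_width)
            for p, q in ((a, b), (b, c), (a, c))
        )
        keys.append(types + sides)
    return min(keys)


def build_catalog(compounds, bin_width=1.0):
    """Store every triangle of every compound in an indexed table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE triangle (compound TEXT, t1 TEXT, t2 TEXT, t3 TEXT,"
        " d1 INTEGER, d2 INTEGER, d3 INTEGER)"
    )
    # The composite index lets a lookup descend a B-tree instead of
    # visiting every row of the table.
    db.execute("CREATE INDEX triangle_key ON triangle (t1, t2, t3, d1, d2, d3)")
    rows = [
        (name, *descriptor(trio, bin_width))
        for name, groups in compounds.items()
        for trio in combinations(groups, 3)
    ]
    db.executemany("INSERT INTO triangle VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    return db


def find_compounds(db, query_trio, bin_width=1.0):
    """Return the compounds that contain a triangle matching the query."""
    key = descriptor(query_trio, bin_width)
    cursor = db.execute(
        "SELECT DISTINCT compound FROM triangle"
        " WHERE t1 = ? AND t2 = ? AND t3 = ? AND d1 = ? AND d2 = ? AND d3 = ?",
        key,
    )
    return sorted(row[0] for row in cursor)


# Made-up example data: two compounds with three functional groups each.
compounds = {
    "A": [("donor", (0, 0, 0)), ("acceptor", (3, 0, 0)), ("hydrophobic", (0, 4, 0))],
    "B": [("donor", (0, 0, 0)), ("acceptor", (6, 0, 0)), ("hydrophobic", (0, 8, 0))],
}
db = build_catalog(compounds)

# A query triangle with the geometry of compound A, translated and reordered.
query = [("hydrophobic", (1, 5, 1)), ("donor", (1, 1, 1)), ("acceptor", (4, 1, 1))]
matches = find_compounds(db, query)
```

Here `matches` contains only compound `A`, because the query triangle has the same group types and binned side lengths regardless of its position or corner order. A practical system would also need to tolerate distances that fall near bin boundaries, for example by querying neighboring bins, which this sketch leaves out.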
