GeauxDock: Accelerating Structure-Based Virtual Screening with Heterogeneous Computing

Computational modeling of drug binding to proteins is an integral component of direct drug design. In particular, structure-based virtual screening is often used to perform large-scale modeling of putative associations between small organic molecules and their pharmacologically relevant protein targets. Because of the large number of drug candidates to be evaluated, an accurate and fast docking engine is a critical element of virtual screening. Consequently, highly optimized docking codes are of paramount importance for the effectiveness of virtual screening methods. In this communication, we describe the implementation, tuning, and performance characteristics of GeauxDock, a recently developed molecular docking program. GeauxDock is built upon the Monte Carlo algorithm and features a novel scoring function combining physics-based energy terms with statistical and knowledge-based potentials. Developed specifically for heterogeneous computing platforms, the current version of GeauxDock can be deployed on modern multi-core Central Processing Units (CPUs) as well as on massively parallel accelerators, namely the Intel Xeon Phi and NVIDIA Graphics Processing Units (GPUs). First, we carried out thorough performance tuning of the high-level framework and the docking kernel to produce a fast serial code, which was then ported to shared-memory multi-core CPUs, yielding near-ideal scaling. Further, using the Xeon Phi gives a 1.9× performance improvement over a dual 10-core Xeon CPU, whereas the best GPU accelerator, the GeForce GTX 980, achieves a speedup as high as 3.5×. GeauxDock can thus take advantage of modern heterogeneous architectures to considerably accelerate structure-based virtual screening applications. GeauxDock is open source and publicly available at www.brylinski.org/geauxdock and https://figshare.com/articles/geauxdock_tar_gz/3205249.
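
To illustrate the general idea of a Monte Carlo search driven by a combined scoring function, a minimal sketch is given below. It is not GeauxDock's implementation: the three-coordinate "pose", the two toy energy terms, their weights, and the function names `combined_score` and `metropolis_search` are all hypothetical and chosen only to show the Metropolis acceptance rule applied to a weighted sum of a physics-like and a knowledge-like term.

```python
import math
import random


def combined_score(pose, weights=(1.0, 0.5)):
    """Toy score: weighted sum of two hypothetical terms (lower is better)."""
    x, y, z = pose
    # Stand-in for a physics-based term: harmonic well centered at the origin.
    physics_term = x * x + y * y + z * z
    # Stand-in for a statistical/knowledge-based term: distance-like penalty.
    knowledge_term = abs(x - 1.0) + abs(y) + abs(z)
    return weights[0] * physics_term + weights[1] * knowledge_term


def metropolis_search(score, start, steps=5000, step_size=0.2,
                      temperature=0.5, seed=0):
    """Metropolis Monte Carlo over a pose; returns the best pose and score seen."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    current = tuple(start)
    current_e = score(current)
    best, best_e = current, current_e
    for _ in range(steps):
        # Propose a random perturbation of every coordinate.
        trial = tuple(c + rng.uniform(-step_size, step_size) for c in current)
        trial_e = score(trial)
        delta = trial_e - current_e
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_e = trial, trial_e
            if current_e < best_e:
                best, best_e = current, current_e
    return best, best_e


start_pose = (3.0, 3.0, 3.0)
best_pose, best_energy = metropolis_search(combined_score, start_pose)
```

In a docking setting, each ligand (or each replica of a ligand) can run such a search independently, which is one reason Monte Carlo docking tends to map well onto multi-core CPUs and accelerators.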
