Enhancing large-scale docking simulation on heterogeneous systems: An MPI vs rCUDA study

Abstract: Virtual Screening (VS) methods can considerably aid clinical research by predicting how ligands interact with pharmacological targets, thus accelerating the slow and critical process of finding new drugs. VS methods screen large databases of chemical compounds to find a candidate that interacts with a given target. The computational requirements of VS models, along with the size of the databases, which contain up to millions of biological macromolecular structures, mean that computer clusters are a must. However, programming current clusters is no easy task, as they have become heterogeneous, distributed systems in which various programming models must be used together to fully leverage their resources. This paper evaluates several strategies for obtaining peak performance from a GPU-based molecular docking application called METADOCK on heterogeneous clusters built from CPUs and NVIDIA Graphics Processing Units (GPUs). Our developments start with an OpenMP, MPI and CUDA version of METADOCK as a baseline case of cluster utilization. Next, we explore the virtualized GPUs provided by the rCUDA framework in order to simplify the programming process. rCUDA allows us to use remote GPUs, i.e. GPUs installed in other nodes of the cluster, as if they were installed in the local node, thus enabling access to them using only OpenMP and CUDA. Finally, several load balancing strategies are analyzed in a search for better performance. Our results reveal that middleware like rCUDA is a convincing alternative for leveraging heterogeneous clusters, as it offers even better performance than traditional approaches and also makes these emerging clusters easier to program.
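The abstract does not say which load balancing strategies were evaluated. As a generic illustration only, not METADOCK's actual scheme, the following Python sketch contrasts two common options for spreading ligands over GPUs of different speeds: a static split proportional to device speed, and a dynamic assignment in which each ligand goes to whichever device becomes idle first. The function names, the integer `speeds` weights and the per-ligand `costs` are all hypothetical.

```python
import heapq

def static_split(n_ligands, speeds):
    """Split n_ligands among devices in proportion to integer speed weights."""
    total = sum(speeds)
    shares = [n_ligands * s // total for s in speeds]  # floor of each share
    # Flooring can leave a few ligands unassigned; give them to the fastest devices.
    leftover = n_ligands - sum(shares)
    fastest_first = sorted(range(len(speeds)), key=lambda i: -speeds[i])
    for i in fastest_first[:leftover]:
        shares[i] += 1
    return shares

def dynamic_assign(costs, speeds):
    """Greedy dynamic scheduling: each ligand goes to the device that is idle first."""
    # Heap entries are (time at which the device becomes idle, device index).
    heap = [(0.0, d) for d in range(len(speeds))]
    heapq.heapify(heap)
    counts = [0] * len(speeds)
    for cost in costs:
        idle_at, d = heapq.heappop(heap)
        counts[d] += 1
        heapq.heappush(heap, (idle_at + cost / speeds[d], d))
    makespan = max(t for t, _ in heap)  # time at which the last device finishes
    return counts, makespan

if __name__ == "__main__":
    # Hypothetical cluster: two slower GPUs and one GPU twice as fast.
    print(static_split(100, [1, 1, 2]))
    # Six equal-cost ligands on a slow and a fast device.
    print(dynamic_assign([1.0] * 6, [1, 2]))
```

A static split needs no coordination at run time but depends on accurate speed estimates, while a dynamic assignment adapts to ligands whose docking cost varies at the price of extra scheduling traffic. Which trade-off suits remote GPUs is the kind of question the load balancing analysis addresses.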
