Tunable approximations to control time-to-solution in an HPC molecular docking Mini-App

The drug discovery process involves several tasks performed in vivo, in vitro, and in silico. Molecular docking is a task typically performed in silico. It aims to find the three-dimensional pose of a given molecule when it interacts with the binding site of a target protein. This task is often used for the virtual screening of a huge set of molecules to find the most promising ones, which are then forwarded to the later stages of the drug discovery process. Given the huge complexity of the problem, molecular docking cannot be solved by exploring the entire space of ligand poses. State-of-the-art approaches address the problem by sampling the space of ligand poses to generate results within a reasonable time budget. In this work, we improve the geometric approach to molecular docking by introducing tunable approximations. In particular, we analysed the original implementation and enriched it with tunable software knobs to explore and control the performance-accuracy trade-offs. We modelled the time-to-solution of the virtual screening task as a function of the software knobs, input data features, and available computational resources. As a result, the application can autotune its configuration according to a user-defined time budget. We used a Mini-App derived from LiGenDock, a state-of-the-art molecular docking application, to validate the proposed approach. We ran the enhanced Mini-App on a high-performance computing system using a very large database of pockets and ligands. The proposed approach exposes a time-to-solution interval spanning more than one order of magnitude with an accuracy degradation of up to 30%; more generally, it provides different accuracy levels according to the needs of the virtual screening campaign.
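The idea of autotuning against a time budget can be illustrated with a minimal Python sketch. It is not the Mini-App's implementation: the linear time model, the configuration names, the per-atom costs, and the accuracy figures are all hypothetical values chosen for illustration. The sketch assumes that predicted time grows with the total number of ligand atoms and shrinks with the number of compute nodes, and it picks the most accurate configuration whose predicted time fits the budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnobConfig:
    """One setting of the software knobs (all values are hypothetical)."""
    name: str
    cost_per_atom: float      # assumed seconds of work per ligand atom
    expected_accuracy: float  # assumed relative accuracy in [0, 1]


def predict_time(config, total_atoms, num_nodes):
    """Toy time-to-solution model: work is proportional to the number of
    ligand atoms and is split evenly across the available nodes."""
    return config.cost_per_atom * total_atoms / num_nodes


def select_config(configs, total_atoms, num_nodes, time_budget):
    """Return the most accurate configuration predicted to fit the budget.

    If no configuration fits, fall back to the fastest one.
    """
    feasible = [
        c for c in configs
        if predict_time(c, total_atoms, num_nodes) <= time_budget
    ]
    if not feasible:
        return min(configs, key=lambda c: predict_time(c, total_atoms, num_nodes))
    return max(feasible, key=lambda c: c.expected_accuracy)


# Illustrative configurations only; not measurements from the paper.
CONFIGS = [
    KnobConfig("exact", cost_per_atom=0.020, expected_accuracy=1.00),
    KnobConfig("balanced", cost_per_atom=0.005, expected_accuracy=0.85),
    KnobConfig("fast", cost_per_atom=0.001, expected_accuracy=0.70),
]

if __name__ == "__main__":
    chosen = select_config(CONFIGS, total_atoms=1_000_000, num_nodes=10,
                           time_budget=600.0)
    print(chosen.name)
```

In this sketch a tighter budget or a larger input pushes the selection toward cheaper, less accurate configurations, while adding nodes allows a more accurate one. A real model would be fitted to profiling data rather than assumed.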
