Graphics processing units in bioinformatics, computational biology and systems biology

Abstract Several studies in Bioinformatics, Computational Biology and Systems Biology rely on the definition of physico-chemical or mathematical models of biological systems at different scales and levels of complexity, ranging from the interaction of atoms in single molecules up to genome-wide interaction networks. Traditional computational methods and software tools developed in these research fields share a common trait: they can be computationally demanding on Central Processing Units (CPUs), which limits their applicability in many circumstances. To overcome this issue, general-purpose Graphics Processing Units (GPUs) are gaining increasing attention from the scientific community, as they can considerably reduce the running time required by standard CPU-based software and allow more intensive investigations of biological systems. In this review, we present a collection of GPU tools recently developed to perform computational analyses in life science disciplines, emphasizing the advantages and drawbacks of using these parallel architectures. The complete list of GPU-powered tools reviewed here is available at http://bit.ly/gputools.
