Comparative assessment of GPGPU technologies to accelerate objective functions: A case study on parsimony

Abstract: Objective functions measure solution quality and constitute the core calculations required to tackle NP-hard optimization problems. Since their complexity keeps growing with the introduction of more realistic data, research efforts have turned to efficient objective function implementations that exploit potential parallelism. This work explores GPGPU technologies to accelerate objective functions, taking as a case study the parallelization of phylogenetic parsimony calculations on DNA data. We undertake a comparative evaluation of different GPU programming models and architectures, highlighting the benefits and drawbacks of each approach through experimentation on six real-world biological datasets. Experimental results shed light on the strong relationship between the characteristics of the input data and the effective utilization of GPU resources. Furthermore, comparisons with other parallel architectures and methods point out how current and future optimization scenarios can benefit from accurate, efficient GPU approaches.
