MGUPGMA: A Fast UPGMA Algorithm With Multiple Graphics Processing Units Using NCCL

A phylogenetic tree is a diagram of the evolutionary relationships among a set of biological species, and scientists use it to analyze many characteristics of those species. Distance-matrix methods, such as the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) and Neighbor Joining, construct a phylogenetic tree from the pairwise genetic distances between taxa. These methods suffer from computational performance problems. Although several newer methods built on high-performance hardware and frameworks have been proposed, the problem persists. In this work, a novel parallel UPGMA approach on multiple Graphics Processing Units (GPUs) is proposed to construct a phylogenetic tree from an extremely large set of sequences. Experimental results show that the proposed approach, running on a DGX-1 server with eight NVIDIA P100 graphics cards, achieves approximately 3-fold to 7-fold speedup over implementations of UPGMA on a modern CPU and on a single GPU.
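
For readers unfamiliar with UPGMA, the following is a minimal sequential sketch of the textbook algorithm in Python. It is not the multi-GPU implementation proposed in this work; the function name `upgma`, the nested-tuple tree representation, and the example distance matrix are assumptions made purely for illustration. At each step the two closest clusters are merged, and the distance from the merged cluster to every other cluster is the size-weighted average of the two old distances.

```python
from itertools import combinations


def upgma(names, dist):
    """Textbook UPGMA on a symmetric distance matrix (illustrative sketch).

    Returns a nested tuple (left_subtree, right_subtree, height), where
    leaves are taxon names and height is half the merge distance.
    """
    # Each cluster id maps to (subtree, number_of_leaves).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Pairwise distances keyed by (smaller_id, larger_id).
    d = {(i, j): dist[i][j] for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        (a, b), d_ab = min(d.items(), key=lambda kv: kv[1])
        (tree_a, n_a), (tree_b, n_b) = clusters.pop(a), clusters.pop(b)

        # Distance to the merged cluster: size-weighted arithmetic mean.
        new_d = {}
        for c in clusters:
            d_ac = d[(min(a, c), max(a, c))]
            d_bc = d[(min(b, c), max(b, c))]
            new_d[(c, next_id)] = (n_a * d_ac + n_b * d_bc) / (n_a + n_b)

        # Drop every entry that involves the two merged clusters.
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new_d)

        clusters[next_id] = ((tree_a, tree_b, d_ab / 2), n_a + n_b)
        next_id += 1

    (tree, _), = clusters.values()
    return tree


if __name__ == "__main__":
    # Hypothetical 4-taxon example.
    example = [
        [0, 2, 6, 10],
        [2, 0, 6, 10],
        [6, 6, 0, 10],
        [10, 10, 10, 0],
    ]
    print(upgma(["A", "B", "C", "D"], example))
```

The search for the minimum entry and the distance update dominate the cost of each iteration, which is why distance-matrix methods become slow on very large sequence sets and why these two steps are natural targets for parallelization.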
