A Linear Algebra Approach to Fast DNA Mixture Analysis Using GPUs

Analysis of DNA samples is an important tool in forensics, and the speed of that analysis can affect investigations. Forensic comparison of DNA samples is conventionally based on short tandem repeats (STRs), which are short DNA sequences of 2–5 base pairs. Current forensic approaches use 20 STR loci. Single nucleotide polymorphisms (SNPs) are useful for analyzing complex DNA mixtures, but using tens of thousands of SNP loci poses significant computational challenges because the cost of the forensic analysis scales with the product of the loci count and the number of DNA samples to be analyzed. In this paper, we discuss the implementation of a DNA sequence comparison algorithm recast in terms of linear algebra primitives. By developing an overloaded matrix multiplication approach to DNA comparisons, we can leverage advances in GPU hardware and algorithms for dense matrix multiplication (DGEMM) to speed up DNA sample comparisons. We show that it is possible to compare 2048 unknown DNA samples with 20 million known samples in under 6 seconds using an NVIDIA K80 GPU.
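The idea of an overloaded matrix multiplication can be illustrated with a minimal sketch. It is not the paper's implementation, and it rests on assumptions the abstract does not state: each profile is encoded as one bit per SNP locus (1 meaning the minor allele is present), the element-wise "multiply" is replaced by AND-NOT, and the "add" is replaced by a population count. The names `pack`, `popcount`, and `overloaded_matmul` are hypothetical. Entry C[i][j] then counts the loci where reference i carries a minor allele that mixture j lacks.

```python
def pack(bits):
    """Pack a list of 0/1 minor-allele flags into one int (bit k = locus k)."""
    word = 0
    for k, bit in enumerate(bits):
        if bit:
            word |= 1 << k
    return word


def popcount(word):
    """Count the set bits of a non-negative int."""
    return bin(word).count("1")


def overloaded_matmul(refs, mixtures):
    """Matrix-product-shaped comparison (assumed encoding, see lead-in).

    The usual sum over k of a[i][k] * b[k][j] is replaced by a count over
    loci k of (ref_i[k] AND NOT mix_j[k]), i.e. the number of minor alleles
    in reference i that are absent from mixture j.
    """
    return [[popcount(r & ~m) for m in mixtures] for r in refs]


# Toy data: 2 reference profiles and 2 mixtures over 4 SNP loci.
refs = [pack([1, 0, 1, 0]), pack([0, 1, 1, 0])]
mixtures = [pack([1, 1, 1, 0]), pack([1, 0, 0, 0])]

mismatches = overloaded_matmul(refs, mixtures)
print(mismatches)  # [[0, 1], [0, 2]]
```

A zero entry means every minor allele of that reference also appears in the mixture, so under this assumed encoding both references are consistent with the first mixture. The work grows with the product of the reference count, the mixture count, and the loci count, the same shape as a dense matrix product, which is why a formulation of this kind can map onto GPU matrix-multiplication kernels.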
