Speeding up eQTL scans in the BXD population using GPUs

The BXD recombinant inbred strains of mice are an important reference population for systems biology and genetics that have been fully sequenced and deeply phenotyped. To facilitate interactive use of genotype-phenotype relations across many massive omics data sets for this and other segregating populations, we have developed new algorithms and code that enable near-real-time whole-genome QTL scans for up to 1 million traits. By using easily parallelizable operations, including matrix multiplication, vectorized operations, and element-wise operations, we have decreased run time to a few seconds for large transcriptome data sets. Our code is ideal for interactive web services, such as GeneNetwork.org. We parallelized the computation across multiple CPU threads as well as GPUs. We found that the speed advantage of GPUs depends on problem size and shape (number of cases, number of genotypes, number of traits). Our results provide a path for speeding up eQTL scans using linear mixed models (LMMs). Our implementation is in the Julia programming language.
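To illustrate how a genome scan can be reduced to matrix multiplication followed by element-wise operations, the following is a minimal sketch in Python with NumPy. It is not the authors' Julia implementation. It assumes a simple single-marker regression with no covariates, no missing data, and no kinship correction, in which case the LOD score for a trait-marker pair follows from the Pearson correlation r as LOD = -(n/2) log10(1 - r^2). The function names `standardize` and `lod_scan` and the simulated data are hypothetical and introduced only for illustration.

```python
import numpy as np

def standardize(X):
    """Center each column and scale it to unit (population) standard deviation."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)

def lod_scan(Y, G):
    """LOD scores for every trait-marker pair (assumed setting: no covariates).

    Y: n x m matrix of traits (one column per trait)
    G: n x p matrix of genotypes (one column per marker)
    Returns an m x p matrix of LOD scores.
    """
    n = Y.shape[0]
    # One matrix multiplication yields all m x p Pearson correlations.
    R = standardize(Y).T @ standardize(G) / n
    # Element-wise transform; the clip avoids log10(0) when |r| is exactly 1.
    r2 = np.clip(R ** 2, 0.0, 1.0 - 1e-12)
    return -(n / 2.0) * np.log10(1.0 - r2)

# Simulated example: 100 individuals, 50 traits, 20 markers.
rng = np.random.default_rng(0)
n, m, p = 100, 50, 20
G = rng.integers(0, 2, size=(n, p)).astype(float)  # two genotype classes
Y = rng.normal(size=(n, m))
Y[:, 3] += 2.0 * G[:, 7]  # plant an effect of marker 7 on trait 3

lod = lod_scan(Y, G)
best_marker = int(np.argmax(lod[3]))  # strongest marker for trait 3
```

Because the whole scan is a single dense matrix product plus element-wise arithmetic, the same structure maps directly onto multithreaded BLAS routines on CPUs or onto GPU array libraries. How much a GPU helps would then depend on the dimensions n, m, and p.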
