Efficient Multivariate Analysis Algorithms for Longitudinal Genome-wide Association Studies

Motivation: Current dynamic phenotyping systems introduce time as an extra dimension in genome-wide association studies (GWAS), which helps to explore the mechanisms of dynamic genetic control of complex longitudinal traits. However, existing methods for longitudinal GWAS either ignore the covariance among observations at different time points or are computationally inefficient.

Results: We developed efficient genome-wide multivariate association algorithms (GMA) for longitudinal data. Compared with existing univariate linear mixed model analyses, the proposed method has higher statistical power for association detection and is computationally faster. In addition, it can analyze unbalanced longitudinal data with thousands of individuals and more than ten thousand records within a few hours; for balanced longitudinal data, the corresponding time is just a few minutes.

Availability and Implementation: We implemented the algorithms in a software package named GMA (https://github.com/chaoning/GMA), which is freely available to interested users in relevant fields.
