LLR: a latent low‐rank approach to colocalizing genetic risk variants in multiple GWAS

Motivation: Genome‐wide association studies (GWAS), which genotype millions of single nucleotide polymorphisms (SNPs) in thousands of individuals, are widely used to identify the risk SNPs underlying complex human phenotypes (quantitative traits or diseases). Most conventional statistical methods in GWAS investigate only one phenotype at a time. However, an increasing number of reports suggest the ubiquity of pleiotropy, i.e. many complex phenotypes sharing common genetic bases. This motivated us to leverage pleiotropy to develop new statistical approaches to the joint analysis of multiple GWAS.

Results: In this study, we propose a latent low‐rank (LLR) approach to colocalizing genetic risk variants using summary statistics. In the presence of pleiotropy, there exist risk loci that affect multiple phenotypes. To leverage pleiotropy, we introduce a low‐rank structure to modulate the probabilities of the latent association statuses between loci and phenotypes. To make LLR computationally efficient, we developed a novel expectation‐maximization‐path (EM‐path) algorithm that greatly reduces the computational cost and facilitates model selection and inference. We demonstrate the advantages of LLR over competing approaches through simulation studies and the joint analysis of 18 GWAS datasets.

Availability and implementation: The LLR software is available at https://sites.google.com/site/liujin810822.

Contact: macyang@ust.hk.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
