Powerful statistical method to detect disease associated genes using publicly available GWAS summary data

To date, a large number of common variants underlying complex diseases have been identified via genome-wide association studies (GWAS) conducted in the last decade, and more and more GWAS summary data have been posted for public access. In most existing GWAS, the common variants were identified with single-marker tests, which test one single nucleotide polymorphism (SNP) at a time. A gene, rather than a SNP, is the basic functional unit of inheritance, so results obtained at the gene level can be more readily extended to and integrated with downstream functional and pathogenic investigation. In this study, we propose a general gene-based p-value adaptive combination approach (GPA) that can integrate association evidence from GWAS summary statistics (either p-values or other test statistics) for continuous or binary traits. We conducted simulations to verify that the proposed method controls type I error well and performs favorably compared with single-marker analysis and other existing methods. We illustrate the utility of the proposed method by applying it to the GWAS meta-analysis results for fasting glucose from the international MAGIC consortium. The proposed method identified novel glycemic-associated genes, which can improve our understanding of the mechanisms involved in β-cell function and glucose homeostasis.
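The abstract does not spell out the GPA statistic, so the sketch below is not the proposed method. It only illustrates the general idea of turning per-SNP p-values within a gene into one gene-level p-value, using the classical Fisher combination. The function name `fisher_combine` and the example p-values are hypothetical, and the calculation assumes the SNP-level tests are independent, which real gene-based methods cannot assume because of linkage disequilibrium.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method (illustrative only).

    The statistic T = -2 * sum(log p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null when the k tests are
    independent. This is NOT the GPA statistic from the abstract.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # equals T / 2

    # Closed-form chi-square survival function for even degrees of freedom 2k:
    # P(X > T) = exp(-T/2) * sum_{i=0}^{k-1} (T/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Hypothetical per-SNP p-values for one gene (made-up numbers).
snp_p_values = [0.05, 0.05]
gene_p_value = fisher_combine(snp_p_values)
```

With a single SNP the combined p-value reduces to that SNP's own p-value, which is a quick sanity check on the formula. Correlated SNPs would require a different null distribution, for example one estimated from a reference panel or by simulation.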
