Uncovering Networks from Genome-Wide Association Studies via Circular Genomic Permutation

Genome-wide association studies (GWAS) aim to detect single nucleotide polymorphisms (SNPs) associated with trait variation. However, because of the large number of tests, standard analysis techniques impose highly stringent significance thresholds, leaving potentially associated SNPs undetected and much of the genetic variation in the trait unexplained. Pathway- and network-based methodologies applied to GWAS aim to detect associations missed by standard single-marker approaches. The complex and non-random architecture of the genome makes it a challenge to derive an appropriate testing framework for such methodologies. We developed a rapid and simple permutation approach that uses GWAS SNP association results to establish the significance of pathway associations while accounting for the linkage disequilibrium structure of SNPs and the clustering of functionally related elements in the genome. All SNPs used in the GWAS are placed in a “circular genome” according to their location. The complete set of SNP association P values is then permuted by rotation with respect to the genomic locations of the SNPs. Once these “simulated” P values are assigned, the joint gene P values are calculated using Fisher’s combination test, and the association of pathways is tested using the hypergeometric test. The circular genomic permutation approach was applied to a human genome-wide association dataset. The dataset consists of 719 individuals from the ORCADES study genotyped for ∼300,000 SNPs and measured for 51 traits ranging from physical to biochemical measurements. KEGG pathways (n = 225) were used as the sets of pathways to be tested. Our results demonstrate that the circular genomic permutations provide robust association P values. The non-permuted hypergeometric analysis generates ∼1400 pathway-trait combinations with an association P value of P ≤ 0.05, whereas applying circular genomic permutation reduces the number of significant results to a more credible 40% of that number. The circular permutation software (“genomicper”) is available as an R package at http://cran.r-project.org/.
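
The procedure described above (rotate the SNP P values around a circular genome, combine them per gene with Fisher's test, then test each pathway with the hypergeometric test) can be illustrated with a minimal Python sketch. This is not the genomicper implementation and does not reproduce its interface. All function names, the 0.05 gene-level threshold, the random choice of rotation offset, and the (count + 1) / (permutations + 1) form of the empirical P value are assumptions made for illustration. The sketch also assumes that SNPs are already ordered by genomic position, that all P values are greater than zero, and that every pathway gene is among the tested genes.

```python
import math
import random

def rotate_pvalues(pvalues, offset):
    """Rotate the ordered P values around the circular genome by `offset` SNPs."""
    k = offset % len(pvalues)
    return pvalues[-k:] + pvalues[:-k] if k else list(pvalues)

def fisher_combined(pvalues):
    """Fisher's combination test: X = -2 * sum(ln p), chi-square with 2k df."""
    k = len(pvalues)
    half = -sum(math.log(p) for p in pvalues)  # X / 2, assumes every p > 0
    # Closed-form chi-square survival function for even degrees of freedom.
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total

def hypergeom_tail(n_genes, n_sig, n_pathway, n_hits):
    """P(X >= n_hits) when n_pathway genes are drawn from n_genes, n_sig significant."""
    upper = min(n_sig, n_pathway)
    favourable = sum(math.comb(n_sig, i) * math.comb(n_genes - n_sig, n_pathway - i)
                     for i in range(n_hits, upper + 1))
    return favourable / math.comb(n_genes, n_pathway)

def pathway_pvalue(snp_p, gene_snps, pathway, threshold=0.05):
    """Hypergeometric P value for one pathway (threshold is an assumed cut-off)."""
    gene_p = {g: fisher_combined([snp_p[i] for i in idx])
              for g, idx in gene_snps.items()}
    significant = {g for g, p in gene_p.items() if p <= threshold}
    hits = len(significant & pathway)
    return hypergeom_tail(len(gene_p), len(significant), len(pathway), hits)

def empirical_pathway_pvalue(snp_p, gene_snps, pathway, n_perm=1000, seed=0):
    """Compare the observed pathway P value with those from rotated P values."""
    rng = random.Random(seed)
    observed = pathway_pvalue(snp_p, gene_snps, pathway)
    as_extreme = 0
    for _ in range(n_perm):
        offset = rng.randrange(1, len(snp_p))  # assumed: uniform random rotation
        permuted = rotate_pvalues(snp_p, offset)
        if pathway_pvalue(permuted, gene_snps, pathway) <= observed:
            as_extreme += 1
    # Assumed estimator: add-one correction so the result is never exactly zero.
    return (as_extreme + 1) / (n_perm + 1)
```

Because rotation shifts every P value by the same number of positions, neighbouring SNPs keep their neighbours, so the local correlation among P values caused by linkage disequilibrium is preserved in each permuted dataset while the link between P values and specific genes is broken. A call such as `empirical_pathway_pvalue(snp_p, gene_snps, {"geneA", "geneB"})`, where `gene_snps` maps each (hypothetical) gene name to the indices of its SNPs, returns the permutation-based pathway P value.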
