Network-assisted analysis to prioritize GWAS results: principles, methods and perspectives

Genome-wide association studies (GWAS) have rapidly become a powerful tool for the genetic study of complex diseases and traits. Single-marker tests have traditionally dominated GWAS and have uncovered tens of thousands of disease-associated SNPs. Network-assisted analysis (NAA) of GWAS data is an emerging area in which network-based approaches are developed and applied to GWAS data to study human diseases and traits. Progress has been made in both the methodology and the applications of NAA, and it has already been demonstrated that NAA results can improve the interpretation and prioritization of candidate genes and markers. Motivated by the strong interest in and demand for advanced GWAS data analysis, this review discusses the methodologies and strategies that have been reported for NAA of GWAS data. Many NAA approaches search for subnetworks and assess the combined effects of the genes in the resulting subnetworks through gene set analysis. Because NAA is not restricted to pre-defined canonical pathways, it can define subnetworks guided by the GWAS data under investigation. Other NAA methods prioritize genes from GWAS data based on their interconnections in a reference network. We summarize NAA applications to various diseases, discuss the available options and potential caveats in their practical use, and offer perspectives on this rapidly growing research area.
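To make the subnetwork-search idea concrete, the following is a minimal sketch, not any specific published method. It assumes gene-level p-values are already available, converts them to z-scores, and greedily grows a module from a seed gene whenever adding a neighbor raises a Stouffer-style combined score (the sum of z-scores divided by the square root of the module size). The gene names, p-values, network edges, and the function names `p_to_z`, `module_score`, and `greedy_module` are all hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def p_to_z(p):
    """Map a gene-level p-value in (0, 1) to a z-score; smaller p gives larger z."""
    return NormalDist().inv_cdf(1.0 - p)


def module_score(genes, z):
    """Stouffer-style combined score: sum of z-scores over sqrt(module size)."""
    return sum(z[g] for g in genes) / math.sqrt(len(genes))


def greedy_module(seed, network, z):
    """Grow a module from `seed`, adding the neighbor that most improves the score.

    Stops when no neighboring gene increases the combined score.
    """
    module = {seed}
    while True:
        current = module_score(module, z)
        # Candidate genes: network neighbors of the module not yet included.
        neighbors = set()
        for g in module:
            neighbors |= network.get(g, set())
        neighbors -= module

        best_gene, best_score = None, current
        for g in sorted(neighbors):  # sorted for deterministic tie-breaking
            if g not in z:
                continue  # skip genes without association evidence
            s = module_score(module | {g}, z)
            if s > best_score:
                best_gene, best_score = g, s

        if best_gene is None:
            return module, current
        module.add(best_gene)


# Hypothetical toy reference network (undirected adjacency) and p-values.
network = {"A": {"B", "C"}, "B": {"A", "D"}, "C": {"A"}, "D": {"B"}}
pvals = {"A": 0.001, "B": 0.01, "C": 0.6, "D": 0.02}
z = {g: p_to_z(p) for g, p in pvals.items()}

module, score = greedy_module("A", network, z)
print(sorted(module), round(score, 2))
```

In this toy example the weakly associated gene C is left out because including it lowers the combined score, while B and D are added because they raise it. Real NAA tools additionally need to assess module significance (for example by permutation), which this sketch omits.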
