A Weighted SNP Correlation Network Method for Estimating Polygenic Risk Scores.

Polygenic scores are useful for examining the joint associations of genetic markers. However, because traditional methods simply sum weighted allele counts, they may fail to capture the complexity of the underlying biology. Here we describe a network-based method, which we call weighted SNP correlation network analysis (WSCNA), and demonstrate how it can be used to generate meaningful polygenic scores. Using data on human height in a US population of non-Hispanic whites, we illustrate how this method can be used to identify SNP networks from GWAS data, create network-specific polygenic scores, examine network topology to identify hub SNPs, and gain biological insights into complex traits. In our example, this method explains a larger proportion of the variance in human height than traditional polygenic score methods. We also identify hub genes and pathways that have previously been reported to influence human height. Going forward, this method may be useful for generating genetic susceptibility measures for other health-related traits, examining genetic pleiotropy, identifying at-risk individuals, examining gene-score-by-environment effects, and gaining a deeper understanding of the underlying biology of complex traits.
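
To make the contrast between the two scoring approaches concrete, the following is a minimal sketch, not the authors' implementation. It assumes a WGCNA-style construction in which the adjacency between two SNPs is the absolute Pearson correlation of their allele counts raised to a soft-thresholding power; the abstract does not state these details. The function names, the default power `beta=6`, and the idea of passing module membership in by hand are all hypothetical choices made for illustration. Module detection itself (for example, hierarchical clustering) is omitted.

```python
from math import sqrt


def traditional_score(allele_counts, weights):
    """Classic polygenic score for one individual: weighted sum of allele counts."""
    return sum(c * w for c, w in zip(allele_counts, weights))


def pearson(x, y):
    """Pearson correlation between two equal-length lists of allele counts."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def weighted_adjacency(snp_columns, beta=6):
    """Assumed soft-threshold adjacency: |correlation| ** beta, zero on the diagonal.

    snp_columns[i] holds the allele counts of SNP i across all individuals.
    """
    m = len(snp_columns)
    adj = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            a = abs(pearson(snp_columns[i], snp_columns[j])) ** beta
            adj[i][j] = adj[j][i] = a
    return adj


def hub_snp(adj, members):
    """Index of the SNP with the highest connectivity within a given module.

    `members` is a list of SNP indices assumed to form one module.
    """
    connectivity = {
        i: sum(adj[i][j] for j in members if j != i) for i in members
    }
    return max(connectivity, key=connectivity.get)
```

In this sketch a hub SNP is simply the module member whose summed adjacency to the other members is largest. A network-specific score could then be built from the SNPs of one module only, for instance by applying `traditional_score` to that subset or by summarizing the module with its first principal component; which summary the paper uses is not stated in the abstract.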
