A Statistical Approach for Testing Cross-Phenotype Effects of Rare Variants.

Increasing empirical evidence suggests that many genetic variants influence multiple distinct phenotypes. When such cross-phenotype effects exist, multivariate association methods that account for pleiotropy are often more powerful than univariate methods that model each phenotype separately. Although several statistical approaches exist for testing cross-phenotype effects of common variants, comparable tests for gene-based analysis of rare variants are lacking. To fill this gap, we introduce a statistical method for cross-phenotype analysis of rare variants based on a nonparametric distance-covariance approach that compares similarity in multivariate phenotypes to similarity in rare-variant genotypes across a gene. The approach accommodates both binary and continuous phenotypes and can also adjust for covariates. It yields a closed-form test whose significance can be evaluated analytically, which improves computational efficiency and permits application on a genome-wide scale. Using simulated data, we demonstrate that our method, which we refer to as the Gene Association with Multiple Traits (GAMuT) test, provides increased power over competing approaches. We also illustrate our approach using exome-chip data from the Genetic Epidemiology Network of Arteriopathy (GENOA).
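The idea of comparing phenotypic similarity to genotypic similarity can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes linear-kernel similarity matrices on standardized data, a statistic of the form trace(Pc Gc)/n on the double-centered matrices, and a null distribution given by a weighted sum of independent 1-df chi-square variables whose weights are products of the two matrices' eigenvalues divided by n squared. For simplicity the p-value is approximated by Monte Carlo draws from that mixture rather than by an exact analytic evaluation. All function names (`center`, `linear_similarity`, `similarity_test`) are hypothetical, and covariate adjustment is omitted.

```python
import numpy as np


def center(K):
    """Double-center a similarity matrix: H K H with H = I - 11'/n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H


def linear_similarity(M):
    """Linear-kernel similarity between subjects (rows) after standardizing columns."""
    M = np.asarray(M, dtype=float)
    sd = M.std(axis=0)
    sd[sd == 0] = 1.0  # guard against constant (e.g. monomorphic) columns
    Z = (M - M.mean(axis=0)) / sd
    return Z @ Z.T


def similarity_test(phenotypes, genotypes, n_draws=20000, seed=0):
    """Return (statistic, approximate p-value) for phenotype/genotype similarity.

    Assumption: the null distribution is a mixture of 1-df chi-squares with
    weights lambda_P[i] * lambda_G[j] / n**2, sampled here by Monte Carlo.
    """
    n = np.asarray(phenotypes).shape[0]
    Pc = center(linear_similarity(phenotypes))
    Gc = center(linear_similarity(genotypes))
    stat = np.trace(Pc @ Gc) / n

    # Keep only the numerically positive eigenvalues of each centered matrix.
    lam_p = np.linalg.eigvalsh(Pc)
    lam_g = np.linalg.eigvalsh(Gc)
    lam_p = lam_p[lam_p > 1e-8]
    lam_g = lam_g[lam_g > 1e-8]
    weights = np.outer(lam_p, lam_g).ravel() / n**2

    # Monte Carlo draws from the weighted chi-square mixture.
    rng = np.random.default_rng(seed)
    null = rng.chisquare(1, size=(n_draws, weights.size)) @ weights
    p_value = (1 + np.sum(null >= stat)) / (1 + n_draws)
    return stat, p_value


if __name__ == "__main__":
    # Toy data (entirely simulated): 200 subjects, 10 rare variants, 3 traits.
    rng = np.random.default_rng(1)
    G = rng.binomial(2, 0.02, size=(200, 10))
    Y = rng.normal(size=(200, 3))  # traits independent of genotype
    print(similarity_test(Y, G))
```

Because the linear kernels have rank at most the number of traits and variants, only a handful of eigenvalue products enter the mixture, so the null draws stay cheap even for large samples. Other similarity measures could be substituted for `linear_similarity` without changing the rest of the sketch.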
