Data-adaptive multi-locus association testing in subjects with arbitrary genealogical relationships

Abstract

Genome-wide sequencing enables evaluation of associations between traits and combinations of variants in genes and pathways. Such evaluation, however, requires multi-locus association tests that have good power regardless of the variant and trait characteristics. Because analyzing families may yield more power than analyzing unrelated individuals, we also need multi-locus tests applicable to both related and unrelated individuals. Here we describe such tests and introduce SKAT-X, a new test statistic that uses genome-wide data obtained from related or unrelated subjects to optimize power for the specific data at hand. Simulations show that (a) SKAT-X performs well regardless of variant and trait characteristics, and (b) for binary traits, analyzing affected relatives yields more power than analyzing unrelated individuals, consistent with previous findings for single-locus tests. We illustrate the methods by applying them to rare unclassified missense variants in the tumor suppressor gene BRCA2, using combined data from prostate cancer families and from unrelated prostate cancer cases and controls in the Multi-ethnic Cohort (MEC). The methods can be applied using the open-source R package GATARS (Genetic Association Tests for Arbitrarily Related Subjects), which is available for public use.
