A unified test of linkage analysis and rare-variant association for analysis of pedigree sequence data

High-throughput sequencing of related individuals has become an important tool for studying human disease. However, owing to technical complexity and a lack of available tools, most pedigree-based sequencing studies rely on an ad hoc combination of suboptimal analyses. Here we present pedigree-VAAST (pVAAST), a disease-gene identification tool designed for high-throughput sequence data in pedigrees. pVAAST uses a sequence-based model to perform variant- and gene-based linkage analysis. Linkage information is then combined with functional prediction and rare-variant case-control association information in a unified statistical framework. pVAAST outperformed linkage and rare-variant association tests in simulations and identified disease-causing genes from whole-genome sequence data in three human pedigrees with dominant, recessive and de novo inheritance patterns. The approach is robust to incomplete penetrance and locus heterogeneity and is applicable to a wide variety of genetic traits. pVAAST maintains high power across a range of study designs, from monogenic, high-penetrance phenotypes in a single pedigree to highly polygenic, common phenotypes involving hundreds of pedigrees.
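
For readers unfamiliar with linkage statistics, the following minimal sketch computes the classical two-point LOD score for phase-known meioses. It is a textbook illustration of what a linkage score measures and is not the sequence-based model used by pVAAST, which is not specified here. The function name `two_point_lod` and its parameters are hypothetical.

```python
import math


def two_point_lod(recombinants: int, meioses: int, theta: float) -> float:
    """Classical phase-known two-point LOD score (illustrative only).

    LOD(theta) = log10( theta^r * (1 - theta)^(n - r) / 0.5^n ),
    where r is the number of recombinant meioses out of n informative meioses.
    """
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")

    non_recombinants = meioses - recombinants

    if theta == 0.0:
        # Complete linkage is impossible if any recombinant was observed.
        if recombinants > 0:
            return float("-inf")
        log_l_linked = 0.0
    else:
        log_l_linked = (recombinants * math.log10(theta)
                        + non_recombinants * math.log10(1.0 - theta))

    # Null hypothesis: free recombination (theta = 0.5).
    log_l_null = meioses * math.log10(0.5)
    return log_l_linked - log_l_null


if __name__ == "__main__":
    # Ten informative meioses with no recombinants, evaluated at theta = 0.
    print(round(two_point_lod(0, 10, 0.0), 2))
```

Under these assumptions, ten fully informative meioses without a recombinant give a LOD of about 3.01, just above the conventional threshold of 3 for declaring linkage. This illustrates why a single small pedigree rarely suffices on linkage evidence alone.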
