Inferring Disease Risk Genes from Sequencing Data in Multiplex Pedigrees Through Sharing of Rare Variants

We previously demonstrated how sharing of rare variants (RVs) in distant affected relatives can be used to identify variants causing a complex and heterogeneous disease. This approach tested whether single RVs were shared by all sequenced affected family members. However, as with other study designs, joint analysis of several RVs (e.g., within genes) is sometimes required to obtain sufficient statistical power. Further, phenocopies can lead to false negatives for some causal RVs if complete sharing among affected relatives is required. Here we extend our methodology (Rare Variant Sharing, RVS) to address these issues. Specifically, we introduce gene-based analyses, refine RV definition based on haplotypes, and introduce a partial sharing test based on RV sharing probabilities for subsets of affected family members. RVS also requires neither external estimates of variant frequency nor control samples, provides functionality to assess and address violations of key assumptions, and is available as open source software for genome-wide analysis. Simulations including phenocopies, based on the families of an oral cleft study, showed that the partial and complete sharing versions of RVS achieved statistical power similar to that of alternative methods (RareIBD and the Gene-Based Segregation Test) and superior to that of the pedigree Variant Annotation, Analysis and Search Tool (pVAAST) linkage statistic. In studies of multiplex cleft families, analysis of rare single nucleotide variants in the exome of 151 affected relatives from 54 families revealed no significant excess sharing in any one gene, but highlighted different patterns of sharing detected by the complete and partial sharing tests.
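To make the notion of an RV sharing probability concrete, the sketch below computes the probability that every sequenced affected relative carries an RV, given that at least one of them does. It is an illustration only, not the RVS software: the function name `sharing_probability`, the pedigree encoding, and the example family are hypothetical, and the calculation assumes that a single copy of the RV entered the pedigree through exactly one founder (each founder equally likely) and that there is no inbreeding.

```python
from fractions import Fraction
from itertools import product


def sharing_probability(pedigree, affected):
    """P(all `affected` carry the RV | at least one of them carries it).

    `pedigree` maps each individual to a (parent, parent) tuple, or to None
    for founders, with parents listed before their children.
    Assumptions (for illustration): one RV copy introduced by exactly one
    founder, founders equally likely, no inbreeding.
    """
    founders = [i for i, p in pedigree.items() if p is None]
    nonfounders = [i for i, p in pedigree.items() if p is not None]
    # Every (founder, transmission pattern) combination has the same weight.
    weight = Fraction(1, len(founders)) * Fraction(1, 2) ** len(nonfounders)

    p_all = Fraction(0)
    p_any = Fraction(0)
    for source in founders:
        # One fair coin per non-founder: 1 means the child inherits the
        # allele copy that carries the RV, if a parent has it.
        for coins in product((0, 1), repeat=len(nonfounders)):
            carrier = {f: (f == source) for f in founders}
            for child, coin in zip(nonfounders, coins):
                p1, p2 = pedigree[child]
                carrier[child] = bool(coin) and (carrier[p1] or carrier[p2])
            status = [carrier[a] for a in affected]
            if all(status):
                p_all += weight
            if any(status):
                p_any += weight
    return p_all / p_any


# Hypothetical family: two affected first cousins (C1, C2).
cousins = {
    "GF": None, "GM": None,      # shared grandparents (founders)
    "S1": None, "S2": None,      # married-in parents (founders)
    "P1": ("GF", "GM"), "P2": ("GF", "GM"),
    "C1": ("P1", "S1"), "C2": ("P2", "S2"),
}
print(sharing_probability(cousins, ["C1", "C2"]))
```

Under these assumptions the two first cousins share the RV with probability 1/15, so complete sharing in a single such family already gives a small p-value, and distant relatives are more informative than close ones (two siblings yield 1/3 by the same calculation). The enumeration is exponential in pedigree size and is meant only for small examples.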
