Assessing the pathogenicity, penetrance and expressivity of putative disease-causing variants in a population setting

Over 100,000 genetic variants are classified as disease-causing in public databases. However, the true penetrance of many of these rare alleles is uncertain and may be over-estimated by clinical ascertainment. As more people undergo genome sequencing, there is an increasing need to assess the true penetrance of alleles. Until recently, this was not possible in a population-based setting. Here, we use data from 388,714 UK Biobank (UKB) participants of European ancestry to assess the pathogenicity and penetrance of putatively clinically important rare variants. Although rare variants are harder to genotype accurately than common variants, we were able to classify 1,244 of 4,585 (27%) putatively clinically relevant rare variants genotyped on the UKB microarray as high-quality. We defined “rare” as variants with a minor allele frequency of <0.01, and “clinically relevant” as variants that were either classified as pathogenic/likely pathogenic in ClinVar or are in genes known to cause two specific monogenic diseases in which we have some expertise: Maturity-Onset Diabetes of the Young (MODY) and severe developmental disorders (DD). We assessed the penetrance and pathogenicity of these high-quality variants by testing their association with 401 clinically relevant traits available in UKB. We identified 27 putatively clinically relevant rare variants that were associated with a UKB trait but exhibited reduced penetrance or variable expressivity compared with their associated disease. For example, the P415A PER3 variant, which has been reported to cause familial advanced sleep phase syndrome, is present at 0.5% frequency in the population and is associated with an odds ratio of 1.38 for being a morning person (P = 2×10⁻¹⁸). We also observed novel associations with relevant traits for heterozygous carriers of some rare recessive conditions; for example, heterozygous carriers of the R799W ERCC4 variant that causes xeroderma pigmentosum were more susceptible to sunburn (one extra sunburn episode reported, P = 2×10⁻⁸). Within our two disease subsets, we were able to refine the penetrance estimate for the R114W HNF4A variant in diabetes (only ~10% by age 40 years) and refute the previous disease association of RNF135 in developmental disorders. In conclusion, this study shows that very large population-based studies will help refine the penetrance estimates of rare variants. This information will be important for anyone receiving information about their health based on putatively pathogenic variants.
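
The quantities named above (the rarity threshold, an odds ratio for a trait among carriers, and penetrance as the fraction of carriers affected) can be illustrated with a minimal sketch. This is not the study's analysis pipeline: the function names are hypothetical, the counts are invented for illustration, and the odds ratio shown is the simple unadjusted 2×2 form, whereas a real association test would typically adjust for covariates.

```python
# Illustrative only: hypothetical helper names and made-up counts,
# not data or methods taken from the study.

def is_rare(minor_allele_frequency, threshold=0.01):
    """Apply the abstract's definition of "rare": MAF below 0.01."""
    return minor_allele_frequency < threshold


def odds_ratio(carriers_with_trait, carriers_without_trait,
               noncarriers_with_trait, noncarriers_without_trait):
    """Unadjusted odds ratio from a 2x2 carrier-by-trait table."""
    carrier_odds = carriers_with_trait / carriers_without_trait
    noncarrier_odds = noncarriers_with_trait / noncarriers_without_trait
    return carrier_odds / noncarrier_odds


def penetrance(affected_carriers, total_carriers):
    """Fraction of variant carriers who show the phenotype."""
    return affected_carriers / total_carriers


# Hypothetical counts chosen so the arithmetic is easy to follow.
example_or = odds_ratio(60, 40, 500, 500)   # (60/40) / (500/500) = 1.5
example_penetrance = penetrance(10, 100)    # 10 of 100 carriers affected = 0.1
```

In a population cohort the denominator for penetrance is all carriers found in the cohort, not only those who came to clinical attention, which is why such estimates can be lower than those from clinically ascertained families.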

[1]  K. Van Steen,et al.  The search for gene-gene interactions in genome-wide association studies: challenges in abundance of methods, practical considerations, and biological interpretation. , 2018, Annals of translational medicine.

[2]  Samuel E. Jones,et al.  Genome-wide association analyses of chronotype in 697,828 individuals provides new insights into circadian rhythms in humans and links to disease , 2018, bioRxiv.

[3]  Amalio Telenti,et al.  Identification of misclassified ClinVar variants using disease population prevalence , 2016, bioRxiv.

[4]  J. Tyrrell,et al.  Mosaic Turner syndrome shows reduced phenotypic penetrance in an adult population study compared to clinically ascertained cases , 2018 .

[5]  P. Stankiewicz,et al.  An estimation of the prevalence of genomic disorders using chromosomal microarray data , 2018, Journal of Human Genetics.

[6]  Joshua C. Denny,et al.  Phenotype risk scores identify patients with unrecognized Mendelian disease patterns , 2018, Science.

[7]  David R. FitzPatrick,et al.  Paediatric genomics: diagnosing rare disease in children , 2018, Nature Reviews Genetics.

[8]  Maili Liu,et al.  Structural insights into the impact of two holoprosencephaly-related mutations on human TGIF1 homeodomain. , 2018, Biochemical and biophysical research communications.

[9]  Birgit Funke,et al.  Adaptation and validation of the ACMG/AMP variant classification framework for MYH7-associated inherited cardiomyopathies: recommendations by ClinGen’s Inherited Cardiomyopathy Expert Panel , 2018, Genetics in Medicine.

[10]  M. Rivas,et al.  Medical relevance of protein-truncating variants across 337,205 individuals in the UK Biobank study , 2018, Nature Communications.

[11]  J. Tyrrell,et al.  Phenotypes associated with female X chromosome aneuploidy in UK Biobank: an unselected, adult, population-based cohort , 2017, bioRxiv.

[12]  P. Donnelly,et al.  Genome-wide genetic data on ~500,000 UK Biobank participants , 2017, bioRxiv.

[13]  P. Visscher,et al.  10 Years of GWAS Discovery: Biology, Function, and Translation. , 2017, American journal of human genetics.

[14]  C. Sudlow,et al.  Comparison of Sociodemographic and Health-Related Characteristics of UK Biobank Participants With Those of the General Population , 2017, American journal of epidemiology.

[15]  S. Seal,et al.  Mutations in Epigenetic Regulation Genes Are a Major Cause of Overgrowth with Intellectual Disability , 2017, American journal of human genetics.

[16]  Eleazar Eskin,et al.  Widespread allelic heterogeneity in complex traits , 2016, bioRxiv.

[17]  A. Hattersley,et al.  Precision diabetes: learning from monogenic diabetes , 2017, Diabetologia.

[18]  Radek Szklarczyk,et al.  Rapid Resolution of Blended or Composite Multigenic Disease in Infants by Whole‐Exome Sequencing , 2017, The Journal of pediatrics.

[19]  Deciphering Developmental Disorders Study,et al.  Prevalence and architecture of de novo mutations in developmental disorders , 2017, Nature.

[20]  K. Boycott,et al.  When One Diagnosis Is Not Enough. , 2017, The New England journal of medicine.

[21]  N. Katsanis The continuum of causality in human genetic disorders , 2016, Genome Biology.

[22]  M. McCarthy,et al.  The Common p.R114W HNF4A Mutation Causes a Distinct Clinical Subtype of Monogenic Diabetes , 2016, Diabetes.

[23]  P. Visscher,et al.  Genetic pleiotropy in complex traits and diseases: implications for genomic medicine , 2016, Genome Medicine.

[24]  P. Visscher,et al.  A plethora of pleiotropy across complex traits , 2016, Nature Genetics.

[25]  Annie Niehaus,et al.  Using ClinVar as a Resource to Support Variant Interpretation , 2016, Current protocols in human genetics.

[26]  Christopher R. Jones,et al.  A PERIOD3 variant causes a circadian phenotype and is associated with a seasonal mood trait , 2016, Proceedings of the National Academy of Sciences.

[27]  Patrick F. Sullivan,et al.  Quantifying prion disease penetrance using large population control cohorts , 2016, Science Translational Medicine.

[28]  James Y. Zou Analysis of protein-coding genetic variation in 60,706 humans , 2015, Nature.

[29]  Gabor T. Marth,et al.  A global reference for human genetic variation , 2015, Nature.

[30]  C. Herold,et al.  Adjusting heterogeneous ascertainment bias for genetic association analysis with extended families , 2015, BMC Medical Genetics.

[31]  Blair H. Smith,et al.  Generation Scotland , 2015 .

[32]  B. Shields,et al.  Recognition and Management of Individuals With Hyperglycemia Because of a Heterozygous Glucokinase Mutation , 2015, Diabetes Care.

[33]  M. Bastepe,et al.  GNAS Spectrum of Disorders , 2015, Current Osteoporosis Reports.

[34]  Alejandro Sifrim,et al.  Genetic diagnosis of developmental disorders in the DDD study: a scalable analysis of genome-wide research data , 2015, The Lancet.

[35]  P. Elliott,et al.  UK Biobank: An Open Access Resource for Identifying the Causes of a Wide Range of Complex Diseases of Middle and Old Age , 2015, PLoS medicine.

[36]  G. Lettre,et al.  Rare variant association studies: considerations, challenges and opportunities , 2015, Genome Medicine.

[37]  François Schiettecatte,et al.  OMIM.org: Online Mendelian Inheritance in Man (OMIM®), an online catalog of human genes and genetic disorders , 2014, Nucleic Acids Res..

[38]  E. Minikel,et al.  Ascertainment bias causes false signal of anticipation in genetic prion disease. , 2014, American journal of human genetics.

[39]  Nazneen Rahman,et al.  Breast-cancer risk in families with mutations in PALB2. , 2014, The New England journal of medicine.

[40]  Caroline F. Wright,et al.  DECIPHER: database for the interpretation of phenotype-linked plausibly pathogenic sequence and copy-number variation , 2013, Nucleic Acids Res..

[41]  E. Birney,et al.  Policy challenges of clinical genome sequencing , 2013, BMJ.

[42]  W. Horner-Johnson,et al.  Assessing Understanding and Obtaining Consent from Adults with Intellectual Disabilities for a Health Promotion Study. , 2013, Journal of policy and practice in intellectual disabilities.

[43]  Michael Krawczak,et al.  Where genotype is not predictive of phenotype: towards an understanding of the molecular basis of reduced penetrance in human inherited disease , 2013, Human Genetics.

[44]  M. Weedon,et al.  Improved genetic testing for monogenic diabetes using targeted next-generation sequencing , 2013, Diabetologia.

[45]  A. Utani,et al.  Malfunction of nuclease ERCC1-XPF results in diverse clinical manifestations and causes Cockayne syndrome, xeroderma pigmentosum, and Fanconi anemia. , 2013, American journal of human genetics.

[46]  Lun Yang,et al.  Quantitative assessment of the effect of LRRK2 exonic variants on the risk of Parkinson's disease: a meta-analysis. , 2012, Parkinsonism & related disorders.

[47]  J. Carpten,et al.  Germline mutations in HOXB13 and prostate-cancer risk. , 2012, The New England journal of medicine.

[48]  H. Hakonarson,et al.  ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data , 2010, Nucleic acids research.

[49]  Raffaella Origa,et al.  BETA THALASSEMIA , 2018, The Professional Medical Journal.

[50]  M. King,et al.  Genetic Heterogeneity in Human Disease , 2010, Cell.

[51]  S. Ellard,et al.  Update on mutations in glucokinase (GCK), which cause maturity‐onset diabetes of the young, permanent neonatal diabetes, and hyperinsulinemic hypoglycemia , 2009, Human mutation.

[52]  Peter Kraft,et al.  Replication in genome-wide association studies. , 2009, Statistical science : a review journal of the Institute of Mathematical Statistics.

[53]  Frank Reimann,et al.  TAC3 and TACR3 mutations in familial hypogonadotropic hypogonadism reveal a key role for Neurokinin B in the central control of reproduction , 2009, Nature Genetics.

[54]  M. McCarthy,et al.  Genome-wide association studies for complex traits: consensus, uncertainty and challenges , 2008, Nature Reviews Genetics.

[55]  B. Lorenz,et al.  Mutation analysis in a family with oculocutaneous albinism manifesting in the same generation of three branches. , 2007, Molecular vision.

[56]  Manuel A. R. Ferreira,et al.  PLINK: a tool set for whole-genome association and population-based linkage analyses. , 2007, American journal of human genetics.

[57]  N. Rahman,et al.  Mutations in RNF135, a gene within the NF1 microdeletion region, cause phenotypic abnormalities including overgrowth , 2007, Nature Genetics.

[58]  S. Bale,et al.  Loss-of-function mutations in the gene encoding filaggrin cause ichthyosis vulgaris , 2006, Nature Genetics.

[59]  Carlos D Bustamante,et al.  Ascertainment bias in studies of human genome-wide polymorphism. , 2005, Genome research.

[60]  C. Summers,et al.  MC1R mutations modify the classic phenotype of oculocutaneous albinism type 2 (OCA2). , 2003, American journal of human genetics.

[61]  M. Owen,et al.  The W546X mutation of the thyrotropin receptor gene: potential major contributor to thyroid dysfunction in a Caucasian population. , 2003, The Journal of clinical endocrinology and metabolism.

[62]  G. Mollet,et al.  Structure of the human type IV collagen gene COL4A3 and mutations in autosomal Alport syndrome. , 2001, Journal of the American Society of Nephrology : JASN.

[63]  H. Dodge,et al.  Random versus volunteer selection for a community-based study. , 1998, The journals of gerontology. Series A, Biological sciences and medical sciences.

[64]  H. Smeets,et al.  Autosomal dominant Alport syndrome linked to the type IV collage alpha 3 and alpha 4 genes (COL4A3 and COL4A4). , 1997, Nephrology, dialysis, transplantation : official publication of the European Dialysis and Transplant Association - European Renal Association.

[65]  I. Hughes,et al.  Functional analysis of six androgen receptor mutations identified in patients with partial androgen insensitivity syndrome. , 1996, Human molecular genetics.