Very low-depth sequencing in a founder population identifies a cardioprotective APOC3 signal missed by genome-wide imputation

Cohort-wide very low-depth whole-genome sequencing (WGS) can comprehensively capture low-frequency sequence variation for the cost of a dense genome-wide genotyping array. Here, we analyse 1x sequence data across the APOC3 gene in a founder population from the island of Crete in Greece (n = 1239) and find significant evidence for association of blood triglyceride levels with the previously reported R19X cardioprotective null mutation (β = −1.09, σ = 0.163, P = 8.2 × 10⁻¹¹) and with a second loss-of-function mutation, rs138326449 (β = −1.17, σ = 0.188, P = 1.14 × 10⁻⁹). The signal cannot be recapitulated by imputing genome-wide genotype data on a large reference panel of 5122 individuals, including 249 with 4x WGS data from the same population. Gene-level meta-analysis with other studies reporting burden signals at APOC3 provides robust evidence for a replicable cardioprotective rare variant aggregation (P = 3.2 × 10⁻³¹, n = 13 480).
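As a rough illustration of how an effect size and its uncertainty relate to a P-value, the sketch below applies a two-sided Wald test under a normal approximation. It assumes that σ in the abstract denotes the standard error of β; the function name `wald_test` is hypothetical and is not taken from the study. The reported P-values come from the study's own association model, so this approximation is expected to land near them in order of magnitude rather than reproduce them exactly.

```python
import math


def wald_test(beta, se):
    """Two-sided Wald test under a normal approximation.

    beta: estimated effect size
    se:   standard error of beta (assumed to be what the abstract calls sigma)
    Returns the z statistic and the two-sided P-value.
    """
    z = beta / se
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Effect sizes and standard errors as quoted in the abstract.
variants = [
    ("R19X", -1.09, 0.163),
    ("rs138326449", -1.17, 0.188),
]

for name, beta, se in variants:
    z, p = wald_test(beta, se)
    print(f"{name}: z = {z:.2f}, approximate two-sided P = {p:.2e}")
```

Both variants remain well below the conventional genome-wide significance threshold of 5 × 10⁻⁸ under this approximation, in line with the abstract's description of the associations as significant.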
