Identification of cis-regulatory variation influencing protein abundance levels in human plasma.

Proteins are central to almost all cellular processes, and dysregulation of their expression and function is associated with a range of disorders. A number of studies in humans have recently shown that genetic factors contribute significantly to gene expression variation. In contrast, very little is known about the genetic basis of variation in protein abundance in humans. Here, we assayed the abundance levels of proteins in plasma from 96 elderly Europeans using a new aptamer-based proteomic technology and performed genome-wide local (cis-) regulatory association analysis to identify protein quantitative trait loci (pQTLs). We detected robust cis-associations for 60 proteins at a false discovery rate of 5%. The most highly significant single nucleotide polymorphism detected was rs7021589 (false discovery rate, 2.5 × 10⁻¹²), which maps within the coding sequence of the Tenascin C (TNC) gene. Importantly, we identified evidence of cis-regulatory variation for 20 protein-coding genes previously associated with disease, including variants with strong evidence of disease association that also show significant association with protein abundance levels. These results demonstrate that common genetic variants contribute to differences in protein abundance levels in human plasma. Identification of pQTLs will significantly enhance our ability to discover and understand the biological and functional consequences of loci identified in genome-wide association studies of complex traits. This is the first large-scale genetic association study of proteins in plasma measured using a novel, highly multiplexed slow off-rate modified aptamer (SOMAmer) proteomic platform.
