Genomic atlas of the human plasma proteome

Although plasma proteins have important roles in biological processes and are the direct targets of many drugs, the genetic factors that control inter-individual variation in plasma protein levels are not well understood. Here we characterize the genetic architecture of the human plasma proteome in healthy blood donors from the INTERVAL study. We identify 1,927 genetic associations with 1,478 proteins, a fourfold increase on existing knowledge, including trans associations for 1,104 proteins. To understand the consequences of perturbations in plasma protein levels, we apply an integrated approach that links genetic variation with biological pathway, disease, and drug databases. We show that protein quantitative trait loci overlap with gene expression quantitative trait loci, as well as with disease-associated loci, and find evidence that protein biomarkers have causal roles in disease using Mendelian randomization analysis. By linking genetic factors to diseases via specific proteins, our analyses highlight potential therapeutic targets, opportunities for matching existing drugs with new disease indications, and potential safety concerns for drugs under development.

A genetic atlas of the human plasma proteome, comprising 1,927 genetic associations with 1,478 proteins, identifies causes of disease and potential drug targets.
