Consequences of natural perturbations in the human plasma proteome

Proteins are the primary functional units of biology and the direct targets of most drugs, yet little is known about the genetic factors that determine inter-individual variation in protein levels. Here we reveal the genetic architecture of the human plasma proteome, testing 10.6 million DNA variants against levels of 2,994 proteins in 3,301 individuals. We identify 1,927 genetic associations with 1,478 proteins, a fourfold increase over existing knowledge, including trans associations for 1,104 proteins. To understand the consequences of perturbations in plasma protein levels, we introduce an approach that links naturally occurring genetic variation with biological, disease, and drug databases. We provide insights into pathogenesis by uncovering the molecular effects of disease-associated variants, and we identify causal roles for protein biomarkers in disease through Mendelian randomization analysis. Our results reveal new drug targets, opportunities for matching existing drugs with new disease indications, and potential safety concerns for drugs under development.
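
The abstract does not specify which Mendelian randomization estimator was applied. As an illustration only, the sketch below shows one common summarized-data form, the fixed-effect inverse-variance-weighted (IVW) estimate, which combines per-variant effects on a protein (the exposure) with per-variant effects on a disease (the outcome). The function name `ivw_estimate`, its parameters, and the numbers in the usage lines are hypothetical and are not taken from the study.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance-weighted Mendelian randomization estimate.

    Hypothetical helper for illustration; assumes uncorrelated variants.
    beta_exposure: per-variant effects on the exposure (e.g. protein level)
    beta_outcome:  per-variant effects on the outcome (e.g. disease log-odds)
    se_outcome:    standard errors of the outcome effects
    Returns (causal_estimate, standard_error).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    if not beta_exposure:
        raise ValueError("at least one variant is required")

    numerator = 0.0
    denominator = 0.0
    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome):
        if se <= 0:
            raise ValueError("standard errors must be positive")
        numerator += bx * by / se**2      # weighted exposure-outcome product
        denominator += bx**2 / se**2      # weight contributed by this variant

    if denominator == 0:
        raise ValueError("all exposure effects are zero; estimate undefined")
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Made-up example values: two variants whose ratio estimates agree.
estimate, std_error = ivw_estimate([0.5, 0.4], [0.10, 0.08], [0.02, 0.02])
```

With a single variant the expression reduces to the ratio of the outcome effect to the exposure effect. The sketch ignores uncertainty in the exposure effects and correlation between variants, both of which a full analysis would need to address.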
