Large-scale comparison of immunoassay- and aptamer-based plasma proteomics through genetics and disease

High-throughput proteomics platforms that measure thousands of proteins in blood, combined with genomic information, have the power to bridge the gap between the genome and diseases and, in doing so, to capture some of the environmental contributions to disease risk and pathogenesis. Although such methods have already demonstrated their utility (refs. 1–3), validation of their actual protein targets is lacking. Here we present a large-scale analysis of plasma protein levels and protein quantitative trait loci (pQTLs) detected using the Olink Explore 1536 platform (1,459 immunoassays) in 47,151 European participants from the UK Biobank with 57.7 million imputed sequence variants. We compared the results with those of a large-scale SomaScan v4 study (ref. 2; 35,559 participants and 4,907 aptamer-based assays) to assess and compare the quality of the two platforms. The correlation between levels of proteins targeted by both platforms is modest (median Spearman correlation 0.46). The vast majority of assays on the Olink Explore platform (84%) had cis pQTLs, which is evidence that they correctly target their intended proteins, whereas assays on the SomaScan v4 platform were half as likely to have cis pQTLs (38%). We also highlight novel pQTLs discovered using the Olink Explore platform that were not captured by SomaScan v4, and describe their colocalization with disease-associated sequence variants as well as associations between protein levels and diseases. Our results further underscore the value of proteomics data and highlight the major differences in quality between the two most commonly used high-throughput proteomics platforms.

[1] Hannes P. Eggertsson, et al. The sequences of 150,119 genomes in the UK Biobank, 2021, Nature.

[2] Bjarni V. Halldórsson, et al. Large-scale integration of the plasma proteome with genetics and disease, 2021, Nature Genetics.

[3] A. Hingorani, et al. Synergistic insights into human health from aptamer- and antibody-based proteomic profiling, 2021, Nature Communications.

[4] E. Gamazon, et al. Mapping the proteo-genomic convergence of human diseases, 2021, Science.

[5] I. Grundberg, et al. Proximity Extension Assay in Combination with Next-Generation Sequencing for High-throughput Proteome-wide Analysis, 2021, Molecular & cellular proteomics: MCP.

[6] M. Rivas, et al. A cross-population atlas of genetic associations for 220 human phenotypes, 2021, Nature Genetics.

[7] J. Aerts, et al. Towards Building a Quantitative Proteomics Toolbox in Precision Medicine: A Mini-Review, 2021, Frontiers in Physiology.

[8] Qiaofei Liu, et al. CD58 Immunobiology at a Glance, 2021, Frontiers in Immunology.

[9] W. M. van der Flier, et al. Common variants in Alzheimer’s disease and risk stratification by polygenic risk scores, 2021, Nature Communications.

[10] G. Bergström, et al. Next generation plasma proteome profiling to monitor health and disease, 2021, Nature Communications.

[11] Judy H. Cho, et al. A Systematic Review of Monogenic Inflammatory Bowel Disease, 2021, Clinical gastroenterology and hepatology: the official clinical practice journal of the American Gastroenterological Association.

[12] Peter B. McGarvey, et al. UniProt: the universal protein knowledgebase in 2021, 2020, Nucleic Acids Res.

[13] E. Gamazon, et al. Genetic architecture of host proteins involved in SARS-CoV-2 infection, 2020, Nature Communications.

[14] J. Danesh, et al. Genomic and drug target evaluation of 90 cardiovascular proteins in 30,931 individuals, 2020, Nature Metabolism.

[15] M. McCarthy, et al. Genetics meets proteomics: perspectives for large population-based studies, 2020, Nature reviews. Genetics.

[16] D. Gudbjartsson, et al. FLT3 stop mutation increases FLT3 ligand level and risk of autoimmune thyroid disease, 2020, Nature.

[17] T. Nakashima, et al. RANKL biology: bone metabolism, the immune system, and beyond, 2020, Inflammation and Regeneration.

[18] William J. Astle, et al. Trans-ethnic and Ancestry-Specific Blood-Cell Genetics in 746,667 Individuals from 5 Global Populations, 2020, Cell.

[19] F. Finkernagel, et al. Multi-platform Affinity Proteomics Identify Proteins Linked to Metastasis and Immune Suppression in Ovarian Cancer Plasma, 2019, Front. Oncol.

[20] Simon C. Potter, et al. Multiple sclerosis genomic map implicates peripheral immune cells and microglia in susceptibility, 2019, Science.

[21] F. Finkernagel, et al. Dual-platform affinity proteomics identifies links between the recurrence of ovarian carcinoma and proteins released into the tumor microenvironment, 2019, Theranostics.

[22] Helen E. Parkinson, et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019, 2018, Nucleic Acids Res.

[23] P. Donnelly, et al. The UK Biobank resource with deep phenotyping and genomic data, 2018, Nature.

[24] R. Harris, et al. ErbB4 deletion predisposes to development of metabolic syndrome in mice, 2018, American journal of physiology. Endocrinology and metabolism.

[25] Stephen Burgess, et al. Genomic atlas of the human plasma proteome, 2018, Nature.

[26] Samuel E. Jones, et al. Meta-analysis of genome-wide association studies for body fat distribution in 694 649 individuals of European ancestry, 2018, bioRxiv.

[27] J. Schmid, et al. Optimized plasma preparation is essential to monitor platelet-stored molecules in humans, 2017, PloS one.

[28] Yuri Kotliarov, et al. Assessment of Variability in the SOMAscan Assay, 2017, Scientific Reports.

[29] Kari Stefansson, et al. Graphtyper enables population-scale genotyping using pangenome graphs, 2017, Nature Genetics.

[30] E. Zeggini, et al. A Genome-wide Association Study of Dupuytren Disease Reveals 17 Additional Variants Implicated in Fibrosis, 2017, American journal of human genetics.

[31] A. Morris, et al. Mapping of 79 loci for 83 plasma protein biomarkers in cardiovascular disease, 2017, PLoS genetics.

[32] T. Peakman, et al. Comparison of DNA quantification methodology used in the DNA extraction protocol for the UK Biobank cohort, 2017, BMC Genomics.

[33] Christian Gieger, et al. Connecting genetic risk to disease end points through the human blood plasma proteome, 2016, Nature Communications.

[34] Judy H. Cho, et al. Association analyses identify 38 susceptibility loci for inflammatory bowel disease and highlight shared genetic risk across populations, 2015, Nature Genetics.

[35] Bjarni V. Halldórsson, et al. Large-scale whole-genome sequencing of the Icelandic population, 2015, Nature Genetics.

[36] B. Berger, et al. Efficient Bayesian mixed model analysis increases association power in large cohorts, 2014, Nature Genetics.

[37] M. Daly, et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies, 2014, Nature Genetics.

[38] Larry Gold, et al. Nucleic Acid Ligands With Protein-like Side Chains: Modified Aptamers and Their Use as Diagnostic and Therapeutic Agents, 2014, Molecular therapy. Nucleic acids.

[39] J. Stenvang, et al. Homogenous 96-Plex PEA Immunoassay Exhibiting High Sensitivity, Specificity, and Excellent Scalability, 2014, PloS one.

[40] Martin Lundberg, et al. Homogeneous antibody-based proximity extension assays provide sensitive and specific detection of low-abundant proteins in human blood, 2011, Nucleic acids research.

[41] A. Catto-Smith, et al. Interaction of Crohn's Disease Susceptibility Genes in an Australian Paediatric Cohort, 2010, PloS one.

[42] Tracy R. Keeney, et al. Aptamer-based multiplexed proteomic technology for biomarker discovery, 2010, PloS one.

[43] C. O'Morain, et al. Evaluation of 6 candidate genes on chromosome 11q23 for coeliac disease susceptibility: a case control study, 2010, BMC Medical Genetics.

[44] S. Murray, et al. The IL-10R1 S138G loss-of-function allele and ulcerative colitis, 2009, Genes and Immunity.

[45] Pall I. Olason, et al. Detection of sharing by descent, long-range phasing and haplotype imputation, 2008, Nature Genetics.

[46] P. Elliott, et al. The UK Biobank sample handling and storage protocol for the collection, processing and archiving of human blood and urine, 2008, International journal of epidemiology.

[47] M. Burns, et al. Case-Control Study, 2020, Definitions.

[48] J. Burnett, et al. Elevation of circulating and ventricular adrenomedullin in human congestive heart failure, 1995, Circulation.

[49] K. Rajewsky, et al. Interleukin-10-deficient mice develop chronic enterocolitis, 1993, Cell.