Genetic feature engineering enables characterisation of shared risk factors in immune-mediated diseases

Background: Genome-wide association studies (GWAS) have identified pervasive sharing of genetic architectures across multiple immune-mediated diseases (IMD). By learning the genetic basis of IMD risk from common diseases, this sharing can be exploited to enable analysis of less frequent IMD where, due to limited sample size, traditional GWAS techniques are challenging.

Methods: Exploiting ideas from Bayesian genetic fine-mapping, we developed a disease-focused shrinkage approach to distil genetic risk components from GWAS summary statistics for a set of related diseases. We applied this technique to 13 larger GWAS of common IMD, deriving a reduced-dimension “basis” that summarised the multidimensional components of genetic risk. We used independent datasets, including the UK Biobank, to assess the performance of the basis and to characterise individual axes. Finally, we projected summary GWAS data for smaller IMD studies, each with fewer than 1000 cases, to assess whether the approach could provide additional insights into the genetic architecture of less common IMD or IMD subtypes, where cohort collection is challenging.

Results: We identified 13 IMD genetic risk components. Projection of independent UK Biobank data demonstrated the IMD specificity and accuracy of the basis, even for traits with very few cases (e.g. vitiligo, 150 cases). Projection of additional IMD-relevant studies allowed us to add biological interpretation to specific components, e.g. components related to raised eosinophil counts in blood and to serum concentration of the chemokine CXCL10 (IP-10). On application to 22 rare IMD and IMD subtypes, we not only highlighted subtype-discriminating axes (e.g. for juvenile idiopathic arthritis) but also suggested eight novel genetic associations.

Conclusions: Requiring only summary-level data, our unsupervised approach allows the genetic architectures across any range of clinically related traits to be characterised in fewer dimensions. This facilitates the analysis of studies with modest sample size by matching shared axes of both genetic and biological risk across a wider disease domain, and provides an evidence base for possible therapeutic repurposing opportunities.
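
The two steps named in the Methods, building a reduced-dimension basis from shrunk summary statistics and projecting a further study onto it, can be illustrated with a minimal sketch. It assumes that the basis is obtained by a principal component decomposition of a traits-by-variants matrix of shrunk effect estimates. The function names `build_basis` and `project`, the per-variant `weights` vector and the mean-centring step are illustrative assumptions, not the authors' implementation, and the fine-mapping-based derivation of the weights is not shown.

```python
import numpy as np

def build_basis(beta, weights):
    """Derive component loadings from basis-trait effect estimates.

    beta    : (n_traits, n_variants) effect estimates (e.g. log odds ratios)
    weights : (n_variants,) shrinkage weights in [0, 1]; hypothetical input,
              assumed to have been computed elsewhere
    """
    shrunk = beta * weights                 # down-weight each variant's effects
    centre = shrunk.mean(axis=0)            # per-variant mean across basis traits
    # SVD of the centred matrix; rows of vt are the component loadings
    _, _, vt = np.linalg.svd(shrunk - centre, full_matrices=False)
    return centre, vt

def project(beta_new, weights, centre, loadings):
    """Project one or more studies onto the basis using the same shrinkage."""
    return (beta_new * weights - centre) @ loadings.T

# Usage with simulated numbers only (no real GWAS data)
rng = np.random.default_rng(0)
beta = rng.normal(size=(13, 200))           # 13 basis traits, 200 variants
weights = rng.uniform(size=200)
centre, loadings = build_basis(beta, weights)
small_study = rng.normal(size=(1, 200))     # a study with modest sample size
scores = project(small_study, weights, centre, loadings)
```

Because the projection needs only per-variant effect estimates for the new study, it can be applied where only summary-level data are available, which is the property the Conclusions emphasise.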

[65]  S D Thompson,et al.  Fine-mapping the MHC locus in juvenile idiopathic arthritis (JIA) reveals genetic heterogeneity corresponding to distinct adult inflammatory arthritic diseases , 2016, Annals of the rheumatic diseases.